## When random people give money to random other people

A post on Decision Science about a problem of Uri Wilensky‘s has been making the rounds:

Imagine a room full of 100 people with 100 dollars each. With every tick of the clock, every person with money gives a dollar to one randomly chosen other person. After some time progresses, how will the money be distributed?

People often expect the distribution to be close to uniform.  But this isn’t right; the simulations in the post show clearly that inequality of wealth rapidly appears and then persists (though each individual person bobs up and down from rich to poor.)  What’s going on?  Why would this utterly fair and random process generate winners and losers?

Here’s one way to think about it.  The possible states of the system are the sets of nonnegative integers (m_1, .. m_100) summing to 10,000; if you like, the lattice points inside a simplex.  (From now on, let’s write N for 100 because who cares if it’s 100?)

The process is a random walk on a graph G, whose vertices are these states and where two vertices are connected if you can get from one to the other by taking a dollar from one person and giving it to another.  We are asking:  when you run the random walk for a long time, where are you on this graph?  Well, we know what the stationary distribution for random walk on an undirected graph is; it gives each vertex a probability proportional to its degree.  On a regular graph, you get uniform distribution.

Our state graph G isn’t regular, but it almost is; most nodes have degree N, where by “most” I mean “about 1-1/e”; since the number of states is

$N^2 + N - 1 \choose N-1$

and, of these, the ones with degree N are exactly those in which nobody’s out of money; if each person has a dollar, the number of ways to distribute the remaining N^2 – N dollars is

$N^2 - 1 \choose N-1$

and so the proportion of states where someone’s out of money is about

$\frac{(N^2 - 1)^N}{(N^2 + N - 1)^N} \sim (1-1/N)^N \sim 1/e$.

So, apart from those states where somebody’s broke, in the long run every possible state is equally likely;  we are just as likely to see $9,901 in one person’s hands and everybody else with$1 as we are to see exact equidistribution again.

What is a random lattice point in this simplex like?  Good question!  An argument just like the one above shows that the probability nobody goes below \$c is on order e^-c, at least when c is small relative to N; in other words, it’s highly likely that somebody’s very nearly out of money.

If X is the maximal amount of money held by any player, what’s the distribution of X?  I didn’t immediately see how to figure this out.  You might consider the continuous version, where you pick a point at random from the real simplex

$(x_1, .. x_N) \in \mathbf{R}^N: \sum x_i = N^2$.

Equivalently; break a stick at N-1 randomly chosen points; what is the length of the longest piece?  This is a well-studied problem; the mean size of the longest piece is about N log N.  So I guess I think maybe that’s the expected value of the net worth of the richest player?

But it’s not obvious to me whether you can safely approximate the finite problem by its continuous limit (which corresponds to the case where we keep the number of players at N but reduce the step size so that each player can give each other a cent, or a picocent, or whatever.)

What happens if you give each of the N players just one dollar?  Now the uniformity really breaks down, because it’s incredibly unlikely that nobody’s broke.  The probability distribution on the set of (m_1, .. m_N) summing to N assigns each vector a probability proportional to the size of its support (i.e. the number of m_i that are nonzero.)  That must be a well-known distribution, right?  What does the corresponding distribution on partitions of N look like?

Update:  Kenny Easwaran points out that this is basically the same computation physicists do when they compute the Boltzmann distribution, which was new to me.

Tagged , ,

## The stochastic sandpile

At last week’s AIM conference on chip-firing I learned about a cool stochastic process which seems very simple, but whose behavior is apparently very hard to prove anything about!

It goes like this.  You start with a graph (say a finite connected graph, for simplicity) and your states are functions from the vertices of the graph to the natural numbers, which you think of as specifying the height of a pile of chips (or sand grains, if you want to keep up that metaphor) sitting on top of that vertex.  At least one of the vertices is a sink, which means that any chip landing there disappears forever.

Now the nature of the process is that you never want to have more than one chip at a vertex.  So every time there are at least two chips at a vertex v, you can pop two of them off; each chip flips a coin and decides whether to topple left or right.  And you keep on doing this until there’s at most one chip at each spot.  (You might start with more chips than vertices, but the sink helps you slough off the surplus.)  This process was apparently first studied by Manna and is sometimes called the Manna sandpile.

More math below the fold!

## Efficient markets and the digits of pi

John Quiggin had a good post yesterday on Crooked Timber about the various flavors of the Efficient Market Hypothesis, which according to Quiggin is a piece of shuffling zombie social science that won’t die no matter how many flaming sticks you jam through its skull.

The weakest form of the EMH is the so-called “random walk hypothesis” that the future behavior of a stock price is independent from its prior behavior.  If that’s the case, no amount of staring at charts is going to help you beat the market.  The random walk hypothesis is pretty well-supported by the data we have; a really nice popular account is Burton Malkiel’s A Random Walk Down Wall Street, one of the mathiest bestsellers I know of.  It makes a great present for anyone in need of a good rationale for not paying attention to their investments.

Quiggin writes

The success of the random walk hypothesis showed that the existence of predictable price patterns in markets with rational and well-informed traders was logically self-contradictory.

But this doesn’t seem quite right.  The EMF, even in its weakest form, holds that the current price of a stock is a best estimate arrived at by an aggregate of profit-maximizing investors with knowledge of the stock’s previous price:  if there were a reliable way to use previous prices to determine that tomorrow’s price would be Y, then the investors would figure that out, and today’s going price would be Y as well.

But the random walk hypothesis is much weaker.  It makes no claim about the etiology of stock prices.  It’s compatible with EMF, but it’s compatible with plenty of other models too — for instance, the one in which stock prices really are a random walk, where the price at time t+1 is the price at time t plus, let’s say, a normally distributed random variable X.  Such a market would be as impossible to beat as a roulette wheel — but evidently not because stock prices are best estimates of the future price of the stock, or of the underlying value of the company, or of anything at all.  (Fellow mathematician David Speyer makes a similar point in the comments at CT.)

You can dress this up a bit:  suppose price(t+1) – price(t) is the random variable X + [(.001) x t’th digit of pi] – (.045).   Unless something very weird is going on with the digits of pi, this version of the stock market would also satisfy the random walk hypothesis.  But unlike the pure random walk it’s a market where you can make some money; if I’m lucky enough to have access to the “digits of pi” rule, I can make a small average profit.  So the random walk hypothesis can hold for markets that are neither efficient nor unbeatable.

Stronger versions of the EMH hold that the market price already takes into account, not only the previous prices of the stock, but also publicly available information about the stock.  So what would happen if I let my secret investing strategy slip out?  This isn’t a rhetorical question; I’m authentically curious about what the rational-investor model would say about a market in which prices are publically revealed to have been governed, up until now, by some completely deterministic but “random-looking” sequence like the digits of pi.