Problem Detail: I have a list and want to select a random item from the list. An algorithm which is said to be random:
When you see the first item in the list, you set it as the selected item. When you see the second item, you pick a random integer in the range [1,2]. If it’s 1, then the new item becomes the selected item. Otherwise you skip that item. For each item you see, you increase the count, and with probability 1/count, you select it. So at the 101st item, you pick a random integer in the range [1,101]. If it’s 100, that item is the new selected node.
Is it really uniformly random? My thoughts are: As the number of nodes increases, the probability for them being selected decreases, so the best chance of selection is for items 1, 2, 3, …, not for 20, 21, …, 101. Each node will not have equal probability of being selected. Please clarify, as I have trouble understanding this.
The algorithm works, but to understand why, you need to know basic probability theory. The idea is to prove by induction that at step $t$, the currently selected algorithm is uniform among the first $t$ elements. This is clearly the case when $t=1$. Assume now the induction hypothesis for time $t$, and consider what happens at time $t+1$. With probability $1/(t+1)$, the element at position $t+1$ is selected. With probability $t/(t+1)$, the previously selected element is retained. By assumption, this is a uniformly random element among the first $t$ ones, and so the probability that any given element out of the first $t$ is selected is $t/(t+1) cdot 1/t = 1/(t+1)$. In total, we see that each of the first $t+1$ elements is selected with the uniform probability $1/(t+1)$. If you don’t like induction, you can also do it all at once. For item $i$ to be selected as the output of the algorithm, it needs to have been selected at time $i$, and to not have been dropped at any further time. If there’s a total of $n$ elements then this happens with probability $$ frac{1}{i} cdot frac{i}{i+1} cdot frac{i+1}{i+2} cdots frac{n-2}{n-1} cdot frac{n-1}{n} = frac{1}{n}. $$
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/23038
Related