[Solved]: Is this probability distribution data structure already discovered?

Problem Detail: Problem formulation I have implemented an interesting probability distribution data structure: having $n$ elements $a_1, a_2, dots, a_n$ in the data structure with respective positive weights $w_1, w_2, dots, w_n > 0$, upon request each $a_i$ may be returned with probability $$frac{w_i}{sum_{j = 1}^n w_j}.$$ A Java implementation of my data structure can be obtained here, yet I will specify the algorithm here textually: There are two types of nodes, relay nodes and leaf nodes. The actual elements with their actual weights are stored in leaf nodes. What comes to relay nodes, we allow only those that have both children (the tree is binary). However, each relay node know how many leaves is there in the subtree starting from that very relay node; also each relay node knows the sum of weights of all its leaves. Whenever adding a new element, we start traversing the tree from its root, and until we reach a leaf node, we choose that relay node, whose node count is smallest; this way it maintains a logarithmic height. What comes to the removal of an element, the data structure maintains a hashtable mapping each element to its leaf node. So, first we get the leaf node, remove it, after which, we replace the parent relay node with the sibling leaf. Whenever sampling an element, we start from root, and choose an appropriate relay node until we reach the appropriate leaf node, whose element we return. All three basic operations seem to run in worst-case $mathcal{O}(log n)$, and my benchmarking supports the observation. A simple visualization: Question Is this data structure already discovered? If yes, could you tell me the name of the paper that discusses it?

Asked By : coderodde

Answered By : D.W.

This is a straightforward use of augmented data structures. I wouldn’t expect it to be publishable on its own; it’s more of an undergraduate exercise. So, I would expect that it’s probably been thought of many times before by many people independently, but not necessarily written down in any particular research paper. The idea of storing in the node of a tree the number of leaves under it is a completely standard idea in augmented data structures. So too is the idea of storing each node the sum of the values in the leaves under it. If you read some algorithms and data structures textbooks that describe augmented data structures, I’m sure you’ll find examples of both of those motifs.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/59690  Ask a Question  Download Related Notes/Documents

Leave a Reply