Give an $O(n lg k)$ time algorithm to merge $k$ sorted lists into one sorted list, where $n$ is the total number of elements in all the input lists. (Hint: Use a min-heap for $k$-way merging.)
As there are $k$ sorted lists and total of $n$ values, let us assume each list contains $frac{n}{k}$ numbers, moreover each of the lists are sorted in strictly ascending order, and the results will also be stored in the ascending order. My pseudo-code looks like this —
list[k] ; k sorted lists heap[k] ; an auxiliary array to hold the min-heap result[n] ; array to store the sorted list for i := 1 to k ; O(k) do heap[i] := GET-MIN(list[i]) ; pick the first element ; and keeps track of the current index - O(1) done BUILD-MIN-HEAP(heap) ; build the min-heap - O(k) for i := 1 to n do array[i] := EXTRACT-MIN(heap) ; store the min - O(logk) nextMin := GET-MIN(list[1]) ; get the next element from the list 1 - O(1) ; find the minimum value from the top of k lists - O(k) for j := 2 to k do if GET-MIN(list[j]) < nextMin nextMin := GET-MIN(list[j]) done ; insert the next minimum into the heap - O(logk) MIN-HEAP-INSERT(heap, nextMin) done
My overall complexity becomes $O(k) + O(k) + O(n(k + 2 lg k)) approx O(nk+n lg k) approx O(nk)$. I could not find any way to avoid the $O(k)$ loop inside the $O(n)$ loop to find the next minimum element from k lists. Is there any other way around? How to get an $O(n lg k)$ algorithm?
Asked By : ramgorur
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/12853
Answered By : Dukeling
lists[k][?] // input lists c = 0 // index in result result[n] // output heap[k] // stores index and applicable list and uses list value for comparison // if i is the index and k is the list // it has functions - insert(i, k) and deleteMin() which returns i,k // the reason we use the index and the list, rather than just the value // is so that we can get the successor of any value // populate the initial heap for i = 1:k // runs O(k) times heap.insert(0, k) // O(log k) // keep doing this - delete the minimum, insert the next value from that list into the heap while !heap.empty() // runs O(n) times i,k = heap.deleteMin(); // O(log k) result[c++] = lists[k][i] i++ if (i < lists[k].length) // insert only if not end-of-list heap.insert(i, k) // O(log k)
The total time complexity is thus $O(k * log k + n * 2 log k) = O(n log k)$ You can also, instead of deleteMin and insert, have a getMin ($O(1)$) and an incrementIndex ($O(log k)$), which will reduce the constant factor, but not the complexity. Example:
(using value rather than index and list index and heap represented as a sorted array for clarity)
Input: [1, 10, 15], [4, 5, 6], [7, 8, 9] Initial heap: [1, 4, 7] Delete 1, insert 10 Result: [1] Heap: [4, 7, 10] Delete 4, insert 5 Result: [1, 4] Heap: [5, 7, 10] Delete 5, insert 6 Result: [1, 4, 5] Heap: [6, 7, 10] Delete 6, insert nothing Result: [1, 4, 5, 6] Heap: [7, 10] Delete 7, insert 8 Result: [1, 4, 5, 6, 7] Heap: [8, 10] Delete 8, insert 9 Result: [1, 4, 5, 6, 7, 8] Heap: [9, 10] Delete 9, insert nothing Result: [1, 4, 5, 6, 7, 8, 9] Heap: [10] Delete 10, insert 15 Result: [1, 4, 5, 6, 7, 8, 9, 10] Heap: [15] Delete 15, insert nothing Result: [1, 4, 5, 6, 7, 8, 9, 10, 15] Heap: [] Done