Heap – Give an $O(n lg k)$ time algorithm to merge $k$ sorted lists into one sorted list

Question Detail: Most probably, this question is asked before. It’s from CLRS (2nd Ed) problem 6.5-8 —

Give an $O(n lg k)$ time algorithm to merge $k$ sorted lists into one sorted list, where $n$ is the total number of elements in all the input lists. (Hint: Use a min-heap for $k$-way merging.)

As there are $k$ sorted lists and total of $n$ values, let us assume each list contains $frac{n}{k}$ numbers, moreover each of the lists are sorted in strictly ascending order, and the results will also be stored in the ascending order. My pseudo-code looks like this —

    list[k]   ; k sorted lists     heap[k]   ; an auxiliary array to hold the min-heap     result[n] ; array to store the sorted list     for i := 1 to k                 ; O(k)     do         heap[i] := GET-MIN(list[i]) ; pick the first element                                      ; and keeps track of the current index - O(1)     done     BUILD-MIN-HEAP(heap) ; build the min-heap - O(k)     for i := 1 to n     do         array[i] := EXTRACT-MIN(heap)   ; store the min - O(logk)         nextMin := GET-MIN(list[1])     ; get the next element from the list 1 - O(1)         ; find the minimum value from the top of k lists - O(k)         for j := 2 to k                          do             if GET-MIN(list[j]) < nextMin                 nextMin := GET-MIN(list[j])          done         ; insert the next minimum into the heap - O(logk)         MIN-HEAP-INSERT(heap, nextMin)     done 

My overall complexity becomes $O(k) + O(k) + O(n(k + 2 lg k)) approx O(nk+n lg k) approx O(nk)$. I could not find any way to avoid the $O(k)$ loop inside the $O(n)$ loop to find the next minimum element from k lists. Is there any other way around? How to get an $O(n lg k)$ algorithm?

Asked By : ramgorur
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/12853

Answered By : Dukeling

The purpose of the heap is to give you the minimum, so I’m not sure what the purpose of this for-loop is – for j := 2 to k. My take on the pseudo-code:

lists[k][?]      // input lists c = 0            // index in result result[n]        // output heap[k]          // stores index and applicable list and uses list value for comparison                  // if i is the index and k is the list                  //   it has functions - insert(i, k) and deleteMin() which returns i,k                  // the reason we use the index and the list, rather than just the value                  //   is so that we can get the successor of any value  // populate the initial heap for i = 1:k                   // runs O(k) times   heap.insert(0, k)           // O(log k)  // keep doing this - delete the minimum, insert the next value from that list into the heap while !heap.empty()           // runs O(n) times   i,k = heap.deleteMin();     // O(log k)   result[c++] = lists[k][i]   i++   if (i < lists[k].length)    // insert only if not end-of-list     heap.insert(i, k)         // O(log k) 

The total time complexity is thus $O(k * log k + n * 2 log k) = O(n log k)$ You can also, instead of deleteMin and insert, have a getMin ($O(1)$) and an incrementIndex ($O(log k)$), which will reduce the constant factor, but not the complexity. Example:
(using value rather than index and list index and heap represented as a sorted array for clarity)

Input: [1, 10, 15], [4, 5, 6], [7, 8, 9]  Initial heap: [1, 4, 7]  Delete 1, insert 10 Result: [1] Heap: [4, 7, 10]  Delete 4, insert 5 Result: [1, 4] Heap: [5, 7, 10]  Delete 5, insert 6 Result: [1, 4, 5] Heap: [6, 7, 10]  Delete 6, insert nothing Result: [1, 4, 5, 6] Heap: [7, 10]  Delete 7, insert 8 Result: [1, 4, 5, 6, 7] Heap: [8, 10]  Delete 8, insert 9 Result: [1, 4, 5, 6, 7, 8] Heap: [9, 10]  Delete 9, insert nothing Result: [1, 4, 5, 6, 7, 8, 9] Heap: [10]  Delete 10, insert 15 Result: [1, 4, 5, 6, 7, 8, 9, 10] Heap: [15]  Delete 15, insert nothing Result: [1, 4, 5, 6, 7, 8, 9, 10, 15] Heap: []  Done 

Leave a Reply