Problem
Merge \(k\) sorted linked lists and return it as one sorted list. Analyze and describe its complexity.
Example:
Input:
[
1->4->5,
1->3->4,
2->6
]
Output: 1->1->2->3->4->4->5->6
Input
lists
- an array of linked list pointers
# Definition for singly-linked list.
# class ListNode:
# def __init__(self, x):
# self.val = x
# self.next = None
Approach
Normally when we think about merge sort, we usually only consider two linked lists. Given that the two linked lists are sorted, merging two linked lists is quite inexpensive -- we can achieve this in linear time complexity.
When we have \(k\) linked lists, it's difficult to follow the above approach. Not only that, but it becomes difficult to keep track of the next linked list to traverse.
The most intuitive approach is to use pairs that houses two values: \((value, index)\), where the value would be the number value of the linked list node, and the index is the integer value that represents which index this linked list resides in, inside lists
.
We now need a data structure that can effectively keep track of the smallest values. Since there are \(k\) linked lists, we can use a binary heap to keep track of \(k\) nodes, since we only need to keep track of at most \(k\) nodes with distinct indices. (It does not make sense to have two nodes with the same linked list index, since ideally we only want to keep track of the minimum value so far from each distinct linked list.
Having \(k\) nodes in our binary heap means that deletion will cost \(log(k)\), and since we have \(n\) nodes total, our merging process will cost \(n
log(k)\). We take up only \(k\) extra space, which is a very good benefit of using binary heaps.
Solution
def mergeKLists(self, lists: 'List[ListNode]') -> 'ListNode':
k = len(lists)
dummy = head = ListNode(None)
heap = []
if not lists:
return None
# Ensure that all linked list (indices) are inserted into the heap
for i in range(0, k):
if lists[i] is None:
continue
heapq.heappush(heap, (lists[i].val, i))
lists[i] = lists[i].next
i = 0
# The heap may be depleted if every linked list is exhausted except for one.
# In this case, we still have to look at the non-empty linked list.
# Since every linked list index is added into the heap, this guarantees that
# our `i` value will point to any linked list that remains to be explored
while heap or lists[i]:
# If this linked list is empty now, pop from the heap
# to get the next linked list to look at
if lists[i] is None:
pair = heapq.heappop(heap)
head.next = ListNode(pair[0])
head = head.next
i = pair[1] # the index value
continue
# Beyond this point, our linked list is non-empty
# Refill the heap up to k size as long as this linked list is non-empty
if len(heap) < k:
heapq.heappush(heap, (lists[i].val, i))
lists[i] = lists[i].next
# Heap is full, so push and pop elements to keep the heap full
else:
pair = heapq.heappushpop(heap, (lists[i].val, i))
head.next = ListNode(pair[0])
lists[i] = lists[i].next
i = pair[1] # the index value
head = head.next
return dummy.next