Reference: LeetCode
Difficulty: Hard
Problem
Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.
For example,
[2,3,4]
, the median is3
[2,3]
, the median is(2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
void addNum(int num)
- Add a integer number from the data stream to the data structure.double findMedian()
- Return the median of all elements so far.
Example:
1 | addNum(1) |
Follow up: (Reference: Solutions to follow-ups)
- If all integer numbers from the stream are between 0 and 100, how would you optimize it?
- We can maintain an integer array of length 100 to store the count of each number along with a total count. Then, we can iterate over the array to find the middle value to get our median. Time and space complexity would be
O(100) = O(1)
.
- If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
- In this case, we can keep a counter for
lessThanHundred
andgreaterThanHundred
. As we know the solution will be definitely in 0-100 we don’t need to know those number which are >100 or <0, only count of them will be enough.
Analysis
Reference: Solution Post
Simple Sorting
Every time we add in a new value, we need to re-sort the array.
1 | private List<Integer> list = new ArrayList<>(); |
Time: addNum()
takes $O(N\log{N})$ 462 ms
Space: $O(N)$
Search + Insert (Binary Search)
Linear Search:
When a new number comes, we have to add it to the list while maintaining the sorted nature of the list. This is achieved easily by finding the correct place to insert the incoming number.
1 | private List<Integer> list = new ArrayList<>(); |
Time: $O(N)$ 265 ms
Space: $O(N)$
Pop quiz Can we use binary search
to improve efficiency? Sure!
Since the list is always sorted, we can use binary search.
Binary Search: (improvement)
Once the position is found, we need to shift all higher elements by one space to make room for the number, which takes $O(N)$.
Note: Here we use lower bound
binary search since we need to find the position where the value is >= num
.
1 | private List<Integer> list = new ArrayList<>(); |
Time: $O(N)$ 76 ms
- Binary search takes $O(\log{N})$ time.
- Insertion can take up to $O(N)$ time.
Space: $O(N)$
Two Heaps
Idea: We only need a consistent way to access the median elements. Keeping the entire input sorted is not a requirement
.
As it turns out there are two data structures for the job: heaps
and self-balancing binary search trees
.
Our goal is to maintain two heaps (balancing) in the following way:
- A
max-heap
to store the smaller half of the input numbers. - A
min-heap
to store the larger half of the input numbers. - Constraint: The max-heap is allowed to store, at most,
one more element
than the min-heap does.
Steps: Add the new number to max-heap
. Since the max-heap
received a new element, we must do a balancing step for the min-heap
.
- Size Check (maxPQ > minPQ + 1): If the constraint is not satisfied, then remove the largest element from the
max-heap
and offer it to themin-heap
. - Value Check (maxPQ.peek() > minPQ.peek()): Check the max value and min value from two heaps where
maxVal <= minVal
; otherwise, remove the largest element from themax-heap
and offer it to themin-heap
. - Size Check (maxPQ < minPQ): If the
min-heap
then holds one more elements than themax-heap
after the previous operation. We fix that by removing the smallest element from themin-heap
and offering it to themax-heap
.
1 | private PriorityQueue<Integer> maxPQ; |
Time: $O(5\log{N}) = O(\log{N})$
Space: $O(N)$
Multiset and Two Pointers
Reference: LeetCode Solution
It is based on self-balancing trees.