169. Majority Element

Reference: LeetCode
Difficulty: Easy

Here is a nice article about Majority Voting Algorithm by Greg Grothaus.

Problem

Given an array of size $n$, find the majority element. The majority element is the element that appears more than ⌊n/2⌋ times.

Note: You may assume that the array is non-empty and the majority element always exist in the array.

Example:

Input: [3,2,3]
Output: 3

Input: [2,2,1,1,1,2,2]
Output: 2

Follow up: Jesus. Many possible solutions.

Analysis

Brute-Force

public int majorityElement(int[] nums) {
  int n = nums.length;
  for (int curr : nums) {
    int count = 0;
    for (int val : nums) {
      if (curr == val) {
        count += 1;
      }
    }
    if (count > n / 2) {
      return curr;
    }
  }
  return -1;
}

Time: $O(N^2)$
Space: $O(1)$

HashMap

It reduces the runtime to $O(N)$ with some extra space $O(N)$.

// Use Hash Map
public int majorityElement(int[] nums) {
  Map<Integer, Integer> map = new HashMap<>();
  for (int val : nums) {
    int count = map.getOrDefault(val, 0);
    map.put(val, count + 1);
    // check
    if (count > nums.length / 2) {
      return val;
    }
  }
  return -1;
}

Time: $O(N)$
Space: $O(N)$

Sorting

If the elements are sorted, the majority element is guaranteed to be at right-leaning or left-leaning mid. Think of two extreme cases when the majority element starts at the beginning or reaches the tail.

Source: LeetCode

// sort first
public int majorityElement(int[] nums) {
  // Length / 2  =>  right-leaning mid
  // even:  0  1  2  3
  //              ^        mid = 4 / 2 = 2
  //  odd:  0  1  2  3  4
  //              ^        mid = 5 / 2 = 2
  Arrays.sort(nums);
  return nums[nums.length / 2];
  // actually you need to confirm the result
}

Time: $O(N\log{N})$
Space: $O(1)$ (in-place) or $O(N)$ (non-destructive)

Randomization

Because more than ⌊n/2⌋ array indices are occupied by the majority element, a random array index is likely to contain the majority element.

Note: The commented lines are not allowed (unreachable statement), since the compiler knows that the condition of the loop is true.

// randomization
public int majorityElement(int[] nums) {
  int n = nums.length;
  Random rand = new Random();
  while (true) {
    // pick a candidate
    int idx = rand.nextInt(n);
    int candidate = nums[idx];
    // check the count
    int count = 0;
    for (int i = 0; i < n; ++i) {
      if (nums[i] == candidate) count += 1;
    }
    // check if it is a majority element
    if (count > n / 2) {
      return candidate;
    }
  }
  // throw new IllegalArgumentException();
  // return -1;
}

Time: $O(\infty)$
Space: $O(1)$

It is theoretically possible for this algorithm to r un indefinitely, so the worst possible runtime is unbounded. But in fact the expected runtime is far better which is $O(N)$.

Divide and Conquer

Idea: If we know the majority element in the left and right halves of an array, we can determine which is the global majority element in $O(N)$.

The base case of this problem is when the slice is length-1, and the majority element for this slice is just the only element.
If the current slice is longer than length-1, we must combine the results from the left slice and right slice.
- If they agree on the majority element, then this element is the global majority element.
- If they disagree, we need to count the occurrences of the left and right majority elements to get the answer since we know that only one of them can be right.

public int majorityElement(int[] nums) {
  int n = nums.length;
  return majorityElement(nums, 0, n - 1);
}

private int majorityElement(int[] nums, int lo, int hi) {
  if (lo == hi) {
    return nums[lo];
  }
  int mid = lo + (hi - lo) / 2; // left-leaning
  int left = majorityElement(nums, lo, mid);
  int right = majorityElement(nums, mid + 1, hi);

  if (left == right) { // agree on
    return left;
  }

  // disagree on
  int leftCount = countInRange(nums, left, lo, hi); // O(N)
  int rightCount = countInRange(nums, right, lo, hi); // O(N)

  return (leftCount > rightCount) ? left : right;
}

private int countInRange(int[] nums, int val, int lo, int hi) {
  int count = 0;
  for (int i = 0; i <= hi; ++i) {
    if (nums[i] == val) {
      count += 1;
    }
  }
  return count;
}

Time: $T(N) = 2T(N/2) + 2N = O(N\log{N})$
Space: $O(\log{N})$

Boyer-Moore Voting Algorithm

The idea is to count instances of the majority element as $+1$ and instances of any other element as $-1$, and then summing them would make the majority element obvious.

We have a count variable initialized with $0$. We increment it by one whenever we see an instance of the candidate, and decrement it by one whenever we see something else.
When count equals $0$, we effectively forget about everything before the current index. Also, we consider the element at the next index as a new candidate of the majority element.
Eventually, a suffix will be found for which count does not hit 0.

Note: It is still possible that there is no majority element at all (e.g. 1 4 1 1 3 5 9 8 7, fake candidate is 7). So actually we need a second pass to confirm the result.

Example 1: (n = 16, #7 = 10, #5 = 5, #1 = 1)

1 2	[7, 7, 5, 7, 5, 1 \| 5, 7 \| 5, 5, 7, 7 \| 7, 7, 7, 7] 1 2 1 2 1 0 1 0 -1 -2 -1 0 1 2 3 4

Example 2: (n = 16, #7 = 6, #5 = 9, #1 = 1)

1 2	[7, 7, 5, 7, 5, 1 \| 5, 7 \| 5, 5, 7, 7 \| 5, 5, 5, 5] 1 2 1 2 1 0 1 0 1 2 1 0 1 2 3 4

Note: Check if count == 0 before updating it.

public int majorityElement(int[] nums) {
  int count = 0;
  Integer candidate = null;
  for (int val : nums) {
    if (count = 0) {
      candidate = val;
    }
    count += (val == candidate) ? +1 : -1;
  }
  // possible verification
  return count;
}

Time: $O(N)$
Space: $O(1)$