Find the most frequent element in an array

Level: Medium

Asked in: Facebook, Uber

Understanding the Problem

Problem Description: Given an array A[] of size n, find the most frequent element in the array, i.e. the element which occurs the most number of times. It is assured that at least one element is repeated.

For example:

Input: A[] = {3, 9, 1, 3, 6, 3, 8, 1, 6}

Output: 3

Input: A[] = {1, 9, 1, 3, 2, 3, 10}

Output: 1

Possible questions to ask the interviewer:-

What should I return if two or more elements have the maximum frequency? ( Ans: Return whichever element is the smallest)
Can negative numbers exist in the array? ( Ans: Yes, they can)

Brute force and Efficient Solutions

We will discuss three possible solutions for this problem:-

Brute Force: Calculate frequency using nested loops
Using Sorting: Calculate frequency linearly after sorting
Using Hash Table : Use a Hash Table to find the frequency of elements

1. Brute Force Approach

For each element, scan the entire array to find its duplicates. Maintain two variables max_freq and ans to store the maximum frequency and the element with that frequency respectively.

Pseudo-Code

int findMostFrequentElement(int A[], int n)
{
    int max_freq = 0
    int ans = -1
    for (i = 0 to n-1)
    {
        int curr_freq = 1
        for (j = i+1 to n-1)
            if (A[j] == A[i])
                curr_freq = curr_freq + 1
        
        if (max_freq < curr_freq)
        {
            max_freq = curr_freq
            ans = A[i]
        }
       else if (max_freq == curr_freq)
            ans = min(ans, A[i])
    }
   return ans
}

Complexity Analysis

Time Complexity: O(n²) ( Why? )

Space Complexity: O(1)

Critical ideas to think!

According to this algorithm, the frequency is calculated again upon encountering the second occurrence of an element. How should we optimize this such that frequency is calculated only once for each unique element?
How can we improve the time complexity further?

2. Using Sorting

If we sort the array, all the duplicate elements will get lined up next to each other. We can now linearly find the frequency of all elements in the array. This approach also ensures that frequency is calculated only once for each unique element.

Pseudo-Code

int findMostFrequentElement(int A[], int n)
{
    sort(A, n)
    int max_freq = 0
    int ans = -1
    int i = 0
    while (i < n)
    {
        int curr_freq = 1
        while (i+1 < n && A[i+1] == A[i])
        {
            curr_freq = curr_freq + 1
            i = i + 1
        }
        
        if (max_freq < curr_freq)
        {
            max_freq = curr_freq
            ans = A[i]
        }
        else if (max_freq == curr_freq)
            ans = min(ans, A[i])
        i = i + 1
    }
    return ans
}

Complexity Analysis

Time Complexity: Sorting the array + Linear Traversal of array

= O(nlogn) + O(n) = O(n)

Space Complexity: O(n), if we use merge sort and O(1) if we use heap sort

Critical ideas to think!

If the frequency of the current element is equal to the maximum element, is it necessary to check if the current element is smaller than ans ?
How could you use the concept of time-space tradeoff to your benefit and find the frequency of all elements faster?

3. Using Hash Table

We can create a hash table and store elements and their frequency counts as key value pairs.

Solution Steps

1. Create a Hash Table to store frequency of each element in the given array. Consider elements in the array as key and their frequency as value
2. First scan the array one by one and check if value associated with any key (as that particular element) exist in the Hash Table or not

If exist, increment the value of that key by 1
If not, store the value as 1

3. During the iteration, we are also storing the most frequent element and its frequency in the parameter ans and max_freq respectively.
4. Update ans and max_freq if any value (frequency for that element) greater than the stored frequency is found.
5. And finally, return the most frequent element found i.e. return ans

Pseudo-Code

int findMostFrequentElement(int A[], int n)
{
    Create a HashTable H
    int max_freq = 1
    int ans = -1
    for (i = 1 to n-1)
    {
        if (A[i] in H)
        {
            H[A[i]] = H[A[i]] + 1
            if (max_freq < H[A[i]])
            {
                max_freq = H[A[i]]
                ans = A[i]
            }
            else if (max_freq == H[A[i]])
                ans = min(ans, A[i])
        }
        else
            H[A[i]] = 1
    }
    return ans
}

Complexity Analysis

Time Complexity: O(n) ( Why? )

Space Complexity: O(n), for storing the Hash Table

Critical ideas to think!

Is using a sophisticated data structure like BST useful here? Analyze the complexity in that case.
What if all elements were unique, i.e, the maximum frequency is 1, will this code return output as expected? What changes will you make?