Find the intersection of two unsorted arrays - Interview Problem

Difficulty Level : Medium
Asked in : Google, Facebook

Understanding the Problem

Problem Description: Given two integer arrays A[] and B[] of size m and n, respectively. We need to find the intersection of these two arrays. The intersection of two arrays is a list of distinct numbers which are present in both the arrays. The numbers in the intersection can be in any order.

For example

Input: A[] = {1,4,3,2,5, 8,9} , B[] = {6,3,2,7,5}

Output: {3,2,5}

Input : A[] = {3,4,6,7,10, 12, 5}, B[] = {7,11,15, 18}

Output: {7}

Possible follow-up questions to ask the interviewer:-

  1. Can you modify the arrays? (Ans: Yes)
  2. What are the constraints on the length and elements of the array? (Ans: m>n, 1 <= m,n <= 10^5, -10^5 <= A[i], B[i] <= 10^5)
  3. Can there be repeated elements in the array or between the arrays? (Ans: Yes, the possibility exists)

Designing efficient solutions

We are discussing four ways to solve this problem :

  1. Brute force: Use nested loops
  2. Sorting and binary search
  3. Sorting and two-pointer approach
  4. Using a Hash Table

1. Brute Force: Use nested loops

For each element in A[], perform a linear search on B[]. Keep on adding the common elements to a list. To make sure the numbers in the list are unique, either check before adding new numbers to list or remove duplicates from the list after adding all numbers.

Pseudo-Code
int[] findIntersection(int A[], int B[], int m, int n)
{
    int ans[]
 
    for ( i = 0 to m-1 )
    {
        for ( j = 0 to n-1 )
        {
           if ( A[i] == B[j] )
              if ( A[i] not in ans)
                  ans.add(A[i])
        }
    }
 
    return ans
}
Complexity Analysis

There are two loops in the solution where the outer loop runs n times and the inner loop runs m times. So in the worst case, time complexity = O(m*n).

Space Complexity: worst-case = O(1) (Why? Some space is being consumed by answer list. Why isn't it taken into consideration?)

Critical ideas to think!
  • What would be the time and space complexity of removing duplicates from n size array?
  • How can we improve the time complexity?

2. Sorting and binary search

In the brute force approach, we select elements from array A[] and linearly search for them in array B[]. We could increase the efficiency of searching the elements if we sort B[] and apply binary search to search for the elements. Binary Search takes O(logn) time complexity.

Solution Steps
  1. Declare answer list.
  2. Sort B[] array in increasing order.
  3. Search for each element in A[] in the sorted array B[]
  4. If the element if found, add it to the answer list
  5. Return answer list
Pseudo-Code
bool binary_search(int B[], int n, int K)
{
   int low = 0, high = n

   while( low <= high )
   {
      int mid = low + ( high - low ) / 2

      if( B[mid] == K )
          return True
 
      else if( B[mid] > K )
          high = mid - 1
 
      else
          low = mid + 1
   }

   return false
}


int[] findIntersection(int A[], int B[], int m, int n)
{
   int ans[]
   sort(B,n)

   for(i = 0 to m-1)
   {
      if(binary_search(B, n, A[i]))
          ans.add(A[i])
   }

   return ans
}
Complexity Analysis

Time Complexity: Sorting array B[] + Searching for all elements of A[] in B[] = O(nlogn) + O(mlogn) = O(nlogn + mlogn)

Space Complexity: If we use merge sort, then space complexity is O(n), else if we use heap sort, space complexity is O(1).

Critical ideas to think!
  • Here, we sorted the smaller array. What if we sort the bigger array for the solution? How does the time complexity changes?
  • What would be the time and space complexity if we use quicksort? Compare different sorting algorithms
  • What would be the space complexity if we use recursive binary search for the implementation?
  • Prepare a list of problems where you can apply the idea of sorting and binary search.

3. Sorting and two-pointer approach

If both A[] and B[] are sorted, we can use a two-pointer approach. At each step, we compare one element of A[] with one element of B[]. If both are equal, we add it to the list, else we move forward.

Solution Steps
  1. Declare the answer list.
  2. Sort both the arrays in increasing order.
  3. Initialize the variables i and j in the sorted array A[] and B[] respectively. i = 0 and j = 0
  4. Run a loop while i < m and j < n (Why 'and', not 'or'?)
  • If A[i] == B[j], add A[i] to the list and increment both i and j
  • If A[i] < B[j], increase i by 1
  • If A[i] > B[j], increase j by 1

5. Return answer list.

Solution Visualization
Pseudo-Code
int[] findIntersection(int A[], int B[], int m, int n)
{
    int ans[]
    sort(A, m)
    sort(B, n)
    i = 0, j = 0
    while( i < m and j < n )
    {
        if ( A[i] == B[j] and A[i] not in ans)
          { 
            ans.add(A[i])
            i = i + 1
            j = j + 1
          }    
 
        else if ( A[i] < B[j] )
            i = i + 1
 
        else
            j = j + 1
    }
 
    return ans
}
Complexity Analysis

Time Complexity : Sorting array A[] + Sorting array B[] + Using two pointer approach = O(mlogm) + O(nlogn) + O(n+m) = O(mlogm + nlogn)

Space Complexity: If mergeSort is used, then O(m), else in heapSort, O(1)

Critical ideas to think!
  • Why are we incrementing pointers when A[i]>B[j] or A[i]<B[j] ?
  • Does this algorithm work fine when elements are repeated? How can you ensure that elements in the answer list are unique?
  • The time complexity of while loop is O(m+n) - Why?
  • Why the idea of the two pointer works perfectly for a sorted array?
  • Prepare a list of problems where you can apply the idea of two pointer approach for the solution.

4. Using a Hash Table

We will make a hash table of the smaller array so that our searching becomes faster and all elements appear only once. Now for each element of the bigger array, we will perform a search operation of that element in the hash table. The search operation in a hash table is, in an average case, O(1).

Solution Steps
  1. Create a hash table of size n
  2. Add all elements of array B[] in the hash table
  3. For each element of A[], search for that element in the hash table. If found, add the element to the answer list
  4. Return answer list
Solution Visualization
Pseudo-Code
int findIntersection(int A[], int B[], int m, int n)
{
    int ans[]
    Take HashTable H of Size n
    for(i = 0 to n-1)
    {
        H.insert(B[i])
    }
 
    for( i = 0 to m-1)
    {
        if(H.Search(A[i]) is true)
           ans.add(A[i])
    }
    return ans
} 
Complexity Analysis

Time Complexity = Inserting n elements of A[] in hash table + Time complexity of searching m elements of B[] in the hash table

= m* O(1) + n * O(1)) = O(m+n)

Space Complexity = O(n), for storing the auxiliary hash table

Critical ideas to think!
  • Do we need to modify the algorithm if elements are repeated? Is there a possibility of elements being repeated in the answer list?
  • What are the changes in time complexity and space complexity if the hash table is created for the bigger array instead of the smaller array? Which is more efficient?
  • What is the worst-case time complexity for searching an element in a hash table?
  • What other data structures could help efficiently solve this problem? Is self-balancing BST useful?

Comparison of different solutions

Suggested Problems to solve

  • Find the intersection of 3 sorted arrays
  • Find the union of two sorted arrays
  • Find the intersection of two linked list
  • Merge two sorted arrays
  • Given an array of integers, and a number K, print all pairs in the array whose sum is equal to K.
  • Given an unsorted array of integers, find the length of the longest consecutive sequence

Happy Coding! Enjoy Algorithms!

AfterAcademy Data Structure And Algorithms Online Course - Admissions Open