Longest Consecutive Sequence - Interview Problem

Difficulty: Hard
Asked in: Amazon, Google

Understanding the problem

Problem Description: Given an unsorted array A[] consisting of n integers, you need to find the length of the longest consecutive sequence of integers in the array.

Example 1

Input: A[] = [10, 4, 20, 1, 2, 8, 9, 3, 19]
Output: 4
Explanation: The longest consecutive sequence of integers in the array is 1, 2, 3, and 4.

Example 2

Input: A[] = [0, -2, 3, -1, 2, 1]
Output: 6
Explanation: The longest consecutive sequence of integers in the array is -2, -1, 0, 1, 2, and 3.

Possible follow-up questions to ask the interviewer:

  1. Can the array be modified in any way? (Ans: Yes)
  2. Must the sequence appear in increasing or decreasing order in the array? (Ans: No, the elements can appear in any order, but they must be consecutive integers when sorted)
  3. Are the array elements necessarily positive? (Ans: No, they can be positive, negative, or zero)
  4. Can the array contain duplicates? (Ans: Yes, that's a possibility)

Brute Force and efficient solutions

We will discuss three ways to solve the problem:

  1. Brute Force: Using nested loops to search the next element
  2. Sorting and checking the longest streak of consecutive elements
  3. Using a Hash Table

1. Brute Force: Using nested loops

For each element in A[], linearly search for consecutive elements smaller and greater than the current element. Keep track of the current streak and the longest streak of consecutive elements in the array.

Solution Steps
  1. Declare and initialize a longest_streak variable to 0.
  2. Linearly traverse the array. For each element, perform steps 3 to 6.
  3. Initialize a curr_streak variable to 1. It stores the length of the longest streak that includes the current element.
  4. Search for consecutive elements smaller than the current element and increase curr_streak for each one found.
  5. Do the same for consecutive elements greater than the current element.
  6. Update longest_streak if curr_streak is greater.
  7. Return longest_streak once the traversal is complete.
Pseudo-Code
int longestConsecutiveSequence(int A[], int n)
{
    int longest_streak = 0
    for( i = 0 to n-1 )
    {
        int curr_num = A[i]-1
        int curr_streak = 1

        while( linear_search(A,n,curr_num) == True )
        {
            curr_streak = curr_streak + 1
            curr_num = curr_num - 1
        }
        
        curr_num = A[i]+1
        while( linear_search(A,n,curr_num) == True )
        {
            curr_streak = curr_streak + 1
            curr_num = curr_num + 1
        }
        longest_streak = max(longest_streak, curr_streak)
    }
    return longest_streak
}
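
The pseudo-code above can be sketched in runnable Python as follows. This is a minimal sketch rather than a definitive implementation: the function name longest_consecutive_brute_force is hypothetical, and Python's `in` operator on a list stands in for linear_search.

```python
def longest_consecutive_brute_force(nums):
    """Return the length of the longest run of consecutive integers in nums."""
    longest_streak = 0
    for value in nums:
        curr_streak = 1

        # Extend the streak downward: value - 1, value - 2, ...
        curr_num = value - 1
        while curr_num in nums:  # `in` on a list is a linear search, O(n)
            curr_streak += 1
            curr_num -= 1

        # Extend the streak upward: value + 1, value + 2, ...
        curr_num = value + 1
        while curr_num in nums:
            curr_streak += 1
            curr_num += 1

        longest_streak = max(longest_streak, curr_streak)
    return longest_streak


print(longest_consecutive_brute_force([10, 4, 20, 1, 2, 8, 9, 3, 19]))  # 4
print(longest_consecutive_brute_force([0, -2, 3, -1, 2, 1]))            # 6
```

Because the search only asks whether a value is present, repeated values do not affect the result.
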
Complexity Analysis

Each call to linear_search takes O(n) time in the worst case. The inner while loops perform one search per element of the streak, and a streak can be as long as the array itself, i.e., n elements. Therefore, the work done inside the outer loop for a single element is up to O(n) searches of O(n) each, i.e., O(n²) in the worst case.

This process is repeated for each of the n elements of the array, so Time Complexity = n * O(n²) = O(n³)

Space Complexity: O(1)

Critical ideas to think!
  • What if we use a visited array to track elements included in any streak so that they are not computed again? How will this affect the complexity?
  • Can we improve the time complexity of searching for nearby elements?

2. Using Sorting

Sort the entire array A[]. The consecutive elements will now be lined up next to each other, so a single linear pass can find the longest consecutive sequence.

Pseudo-Code
int longestConsecutiveSequence(int A[], int n)
{
    int longest_streak = 0
    sort(A, n)
    i = 0
    while(i < n)
    {
        int curr_streak = 1
        while(i < n-1 and A[i+1] == A[i] + 1)
        {
            curr_streak = curr_streak + 1
            i = i + 1
        }
        
        if ( curr_streak == 1 )
            i = i + 1
        
        longest_streak = max(longest_streak, curr_streak)
    }
    
    return longest_streak
}
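
A minimal runnable Python sketch of the sorting approach is shown below. It is not a line-by-line translation of the pseudo-code: it makes a single pass over adjacent pairs and, as an assumption about the intended behaviour, skips repeated values so that duplicates neither extend nor break a streak. The function name longest_consecutive_sorted is hypothetical, and since sorted() returns a new list, this version uses O(n) extra space.

```python
def longest_consecutive_sorted(nums):
    """Return the length of the longest run of consecutive integers in nums."""
    if not nums:
        return 0

    ordered = sorted(nums)  # sorts a copy; the caller's list is left untouched
    longest_streak = 1
    curr_streak = 1

    # Compare each element with its predecessor in sorted order.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr == prev:
            continue  # duplicate: neither extends nor breaks the streak
        if curr == prev + 1:
            curr_streak += 1  # streak continues
        else:
            curr_streak = 1   # gap found, start a new streak at curr
        longest_streak = max(longest_streak, curr_streak)

    return longest_streak


print(longest_consecutive_sorted([10, 4, 20, 1, 2, 8, 9, 3, 19]))  # 4
print(longest_consecutive_sorted([1, 2, 2, 3]))                    # 3
```
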
Complexity Analysis

Time Complexity: Sorting the array + linear traversal of the array (even though the loops are nested, both use the same loop variable i, which only moves forward) = O(n log n) + O(n) = O(n log n)

Space Complexity: O(n) if we use Merge Sort, O(1) if we use Heap Sort

Critical ideas to think!
  • Why is the loop variable ‘i’ incremented outside the inner while loop only when curr_streak is equal to 1?
  • If a number is traversed while searching for the streak of some previous smaller number, is it ever revisited in this algorithm?
  • Why does the inner while loop need its own bounds check when the outer while loop already checks i < n?
  • Why is curr_streak initialized to 1 and not 0?
  • Does this algorithm work correctly if elements are repeated?

3. Using a Hash Table

We can speed up the search for consecutive elements by using a hash table, which can check for the presence of an element in O(1) time on average.

Solution Steps
  1. Declare longest_streak and initialize it to 0.
  2. Create a Hash Table of size n and add the elements of the array to it.
  3. Linearly iterate over the array and check if A[i] - 1 exists in the hash table.
     • If it does not exist, then A[i] is the first element of its sequence. Initialize curr_streak to 1, then check for consecutive numbers greater than A[i] in the hash table, increasing curr_streak for each one found.
     • If it does exist, then A[i] is not the first element of its sequence, so we can move to the next iteration.
  4. Update longest_streak if curr_streak is bigger than it.
  5. Return longest_streak.

Pseudo-Code
int longestConsecutiveSequence(int A[], int n)
{
    int longest_streak = 0
    Create HashTable H of size n
    for( i = 0 to n-1 )
        H.add(A[i])
    for( i = 0 to n-1 )
    {
        // This checks if the current element is the first
        // element of a sequence
        if( H.search(A[i]-1) == False )
        {
            int curr_streak = 1
            int curr_num = A[i]+1
            while( H.search(curr_num) == True )
            {
                curr_streak = curr_streak + 1
                curr_num = curr_num + 1
            }
            longest_streak = max(longest_streak, curr_streak)
        }
    }
    return longest_streak
}
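
The same idea can be sketched in runnable Python, with the built-in set playing the role of the hash table H. This is a sketch under stated assumptions, not a definitive implementation: the function name longest_consecutive_hash is hypothetical, and the loop runs over the distinct values in the set rather than over A[], so a repeated sequence start is expanded only once.

```python
def longest_consecutive_hash(nums):
    """Return the length of the longest run of consecutive integers in nums."""
    seen = set(nums)  # average O(1) membership checks
    longest_streak = 0

    for value in seen:  # distinct values only (assumption, see lead-in)
        # Only start counting from the first element of a sequence.
        if value - 1 not in seen:
            curr_num = value + 1
            curr_streak = 1
            while curr_num in seen:
                curr_streak += 1
                curr_num += 1
            longest_streak = max(longest_streak, curr_streak)

    return longest_streak


print(longest_consecutive_hash([10, 4, 20, 1, 2, 8, 9, 3, 19]))  # 4
print(longest_consecutive_hash([0, -2, 3, -1, 2, 1]))            # 6
```
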
Complexity Analysis

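Since the search operation now takes O(1) time on average, we just need to count how many times it is performed. Each element is searched for once in the if condition. The while loop starts only from the first element of a sequence, so each value in the table is found by the while loop at most once, plus one failed search at the end of each sequence. Each value is therefore searched for at most twice, and the total work is linear, O(n). With duplicates in A[], this bound assumes that a repeated sequence start is not expanded again, for example by iterating over the distinct values in H instead of A[].

Time Complexity: O(n) on average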

Space Complexity: O(n), for the Hash Table
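
To make the counting argument concrete, here is a small instrumented Python sketch that tallies every membership check. The function name count_lookups and the counter are hypothetical additions for illustration, and the loop iterates over the distinct values in the set. Under that assumption the total is exactly twice the number of distinct values: one check per value in the if condition, and one while-loop check per value across all sequences.

```python
def count_lookups(nums):
    """Return (longest_streak, lookups) for the hash-based approach."""
    seen = set(nums)
    lookups = 0
    longest_streak = 0

    for value in seen:
        lookups += 1  # the "is value - 1 present?" check
        if value - 1 not in seen:
            curr_num = value + 1
            curr_streak = 1
            while True:
                lookups += 1  # one membership check per while-loop test
                if curr_num not in seen:
                    break
                curr_streak += 1
                curr_num += 1
            longest_streak = max(longest_streak, curr_streak)

    return longest_streak, lookups


streak, lookups = count_lookups([10, 4, 20, 1, 2, 8, 9, 3, 19])
print(streak, lookups)  # 4 18
```

For Example 1, the 9 distinct values cost 9 checks in the if condition and 4 + 3 + 2 = 9 checks in the while loop (for the sequences starting at 1, 8, and 19), giving 18 in total.
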

Critical ideas to think!
  • Does the algorithm need to be tweaked if we include negative numbers in the array?
  • What other data structures would be time-efficient for this particular problem?
  • Since it's an optimization problem, can this be solved using dynamic programming?
  • Is there any other way to solve this problem? Think.

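Comparison of different solutions

  Approach       Time Complexity    Space Complexity
  Brute Force    O(n³)              O(1)
  Sorting        O(n log n)         O(n) with Merge Sort, O(1) with Heap Sort
  Hash Table     O(n) average       O(n)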

Suggested Problems to solve

  • Find the longest increasing consecutive subsequence
  • Sort the numbers in an array based on their frequency
  • Given an array of integers, and a number K, print all pairs in the array whose sum is equal to K.
  • Find the largest increasing sequence of consecutive positive integers.

Happy Coding! Enjoy Algorithms!
