LRU Cache Implementation

LRU Cache implementation

Difficulty: Hard

Asked in: Amazon, Microsoft, Adobe, Google

Understanding The Problem

Problem Description

Design and implement a data structure for Least Recently Used(LRU) cache.

Problem Note:

The LRU Cache is initialized with a positive capacity
Your data structure must support two operations: get() and put()
get(key) : Finds and returns the value if the key exists in the cache. If the key is not present in the cache, get(key) returns -1
put(key, value) : Inserts new key, if it is not present in the cache. If the cache is filled to capacity, it must remove the least recently used entry.
Try implementing both operations in O(1) time complexity
Input in this problem would be a series of function calls to get() and put()

Example

cache = LRUCache(3)
cache.put(1,1)
cache.put(2,2)
cache.put(1,3)
cache.get(1)    ---> returns 3
cache.put(3,4)
cache.put(4,3)  // removes key 2
cache.get(2)    ---> returns -1

Input Format:

First-line contains N and C , the total number of queries and the cache size.
Each of the following N lines has a query of either type 1(put) or type 2(get).
The query of type 1 is of format: 1 k v , where k is key and v is value
The query of type 2 is of format: 2 k , where k is key whose value is to be fetched.

For example, the input for the above example will be:

Solution

A Least Recently Used (LRU) Cache organizes items in order of use, allowing you to quickly identify which item hasn’t been used for the longest amount of time.

An LRU cache is often implemented by using a doubly-linked list(DLL) with a hash map.

The node structure used for the doubly linked list will be as follows:

class Node { 
    int key
    int value
    Node pre
    Node next
    public Node(int key, int value) { 
        this.key = key
        this.value = value
    } 
}

To implement an LRU, we will move with the straightforward idea of using a doubly-linked list that will represent itself as an LRU cache. Corresponding to each item of the linked list, there will be a key existing in the hashmap.

We make the doubly linked list to always function such that the front node will always be the least recently used item. Now, we will be using two functions:

Get: Get function can be executed in constant time as we have maintained a hash map and thus the cached item can be found in the hashmap using the input key from arguments. If the item not found in the hashmap, simply we can return -1.
Put: As the problem note explains how the LRU cache will work, we will manage a hashmap whose maximum size will be as specified by the user.

If a new item is pushed in the cache which is not already available in the hashmap, we will make a sanity check, if there is overflow or underflow in the capacity of the hashmap. If its overflow, then we will remove the LRU item which will be found at the rear of the doubly linked list and thereby also removing its key from the hashmap to make room for the new item. Insert the new item in the linked list by pushing it in the front and also insert it in the hashmap against the input key.
If a new item is pushed in the cache is already available in the hashmap, then we can bring the item to the front of the doubly linked list, as it will become the least recently used item.

Using a doubly linked list will help us to push and pop the elements from the front and rear of the linked list in constant time and so DLL is preferred.

Solution Steps

Create a hashmap that will store the input item values corresponding to the keys and a doubly-linked list to express LRU.
Look up the item in our hash map for every get operation.

If the item is in the hash table, then it’s already in our cache — this is called a “ cache hit ”
Move the item’s linked list node to the head of the linked list as the item is the most recently used item.

3. If the item fails to be in the hash table on "get" operation, we have a cache miss and so we need to return -1.

4. For every "put" operation load the item into the LRU cache.

Check if the cache is full? If so, we need to evict some item to make room. In our case, that item should be the least recently used item and that item would be available at the tail of the linked list.
Evict that item from the cache by removing it from the linked list and the hash map.
If the cache is not full, create a new linked list node for the item. Insert it at the head of the linked list.
Add the item to our hash map, storing the newly-created linked list node as the value.

Pseudo Code

class LRUCache { 
    HashMap(Integer, Node) map
    int capicity, count
    Node head, tail
    public LRUCache(int capacity) { 
        this.capicity = capacity
        map = new HashMap()
        // create dummy nodes to point head and tail
        head = new Node(0, 0)
        tail = new Node(0, 0)
        head.next = tail; 
        tail.pre = head
        head.pre = null
        tail.next = null
        count = 0
    } 
    void deleteNode(Node node) { 
        node.pre.next = node.next
        node.next.pre = node.pre
    }
    void addNodeToHead(Node node) { 
        node.next = head.next
        node.next.pre = node 
        node.pre = head
        head.next = node
    } 
    int get(int key) { 
        if (key exist in map) { 
            Node node = map.get(key)
            int result = node.value
            deleteNode(node)
            addNodeToHead(node)
            return result
        } 
        return -1
    }
    void put(int key, int value) {
        if (key exist in map) { 
            Node node = map.get(key)
            node.value = value
            deleteNode(node)
            addNodeToHead(node) 
        } 
        else { 
            Node node = new Node(key, value)
            map.put(key, node)
            if (count < capicity) { 
                count = count + 1
                addNodeToHead(node)
            }
            else { 
                map.remove(tail.pre.key)
                deleteNode(tail.pre)
                addNodeToHead(node)
            } 
        } 
    } 
}

Complexity Analysis

Time Complexity: O(1)

Put method: O(1)
Get method: O(1)

Space Complexity: O(1) (How?)

Critical Ideas To Think

Why did we use the Doubly linked list?
How did we add a new node in DLL while checking if that node already existed in the DLL or not in the put function using constant time?
How we managed to remove the least recently used item while making the room for a new item?
Explain the time complexities of the two methods.
What are the main applications of using an LRU cache mechanism?