Auto-suggestion implementation using Trie Data Structure.

Trie data structure has many useful applications, one of the important applications of the trie is auto-suggestion when user inputs using the keyboard. Let us assume we have a dictionary that contains “APPLE”, “APE”, “GOD”, “GOOD”, “GUN”. So, when a user enters “GO” then our auto-suggestion suggests “GOD”, “GOOD” as a suggested list of words. Let us assume the below picture which represents the trie contains the above 5 words.

Trie, generated after adding “APE”, “APPLE”, “GOD”, “GOOD” and “GUN”.

Auto-suggestions are the list of words formed based on a given substring. For example, if the user passes “GO” as input then its auto-suggestion from the above dictionary gives [“GOD”, “GOOD”] as result.

In this article, we will discuss how our algorithm works for the input “GO” on above given trie.

  • Initially, trie searches for the given substring in the dictionary (“TRIE”). This returns a Node, where the substring ends as showing below.
After finding substring, function returns Node, which contains “O” position.

Code to find substring in a dictionary is as following.

TrieNode *_findSubString(string subStr)
{
    TrieNode *currNode = root;
    for (int i = 0; i < subStr.length(); i++)
    {
        char val = subStr.at(i);
        if (currNode->children[val - 'A'] == NULL)
        {
            return NULL;
        }
        currNode = currNode->children[val - 'A'];
    }
    return currNode;
}
  • As a second step, the algorithm will find all strings followed by the input substring and returns the list of those strings. The code for that is as follows.
void getDictionaryWords(TrieNode *root, string subStr, list<string> &result)
{
    TrieNode *curr = root;
    if (curr == NULL)
        return;
    if (curr->isEnd)
    {
        result.push_back(subStr);
    }

    for (int i = 0; i < MAX_CHARS; i++)
    {
        string str = subStr + string(1, (char)('A' + i));
        if (curr->children[i] != NULL)
        {
            getDictionaryWords(curr->children[i], str, result);
        }
    }
}

When we call with our “GO” as parameter 2 to the above function, this will return us as a list of strings which are [“GOD,” “GOOD”].

The complete algorithm is as following.

// Output
$./a.out
GOD
GOOD

The complexity of the above algorithm is the O(W*K), where ‘W’ is the word length and ‘K’ is the number of words following by substring.

Advertisements