Use a trie data structure to store words. Every node contains a list of all letters (pointers to the same node structure) and flags for each letter to indicate the length of the word. Write a method to insert into this kind of data structure. What would you use to store each node?
Rationale : Unlike other data structures, in a trie keys are not identified by information from a single node, but by the path from the root of the tree to a particular node. Thus, each node has a number of children equal to the letters in the alphabet being used, and each edge is labeled with the corresponding letter of the alphabet.The following picture shows a trie storing the strings (words): cup, sing, curve, singer, small, ball:
Before we get to the insertion algorithm, let’s start by analyzing how a representation of this type of data structure would look like:
{
int noOfWordsThatEndHere, noOfChildren;
trie *letters[26];
trie()
{
noOfWordsThatEndHere = noOfChildren = 0;
memset(letters,0,sizeof(letters));
}
};
As you can see, a trie is a recursive data structure that stores 26 other tries within its internal representation (one for each letter in the alphabet). Some other information we are storing here is the number of words that end at the current trie and the number of nonNULL children. Insertion can be easily performed recursively. Given a string, we just increase noOfWordsThatEndHere if we reach the end of a word or, otherwise, create a new trie at the position corresponding to the next letter in the word. Insertion in a trie can be done in O(n), where n is the length of the string inserted.
Before solving this problem do the following (not limited to):
 Clarify doubts
 Write/loudly speak/analyze single or multiple pseudocode(e.g. trivial algorithm to advanced algorithm/Data Structures) which can lead to solution
 Explain any assumptions or limitations on the written pseudocode
 Mention major data structures to be used
 Also how the positive/negative functional tests prove that the resultant solution is correct.
Clarification Points
Clarification Point1:
What is the maximum size of the string given as input?
Clarification Point2:
What kind of characters can the given string contain?
Pseudo Code
Pseudo code:
Insert(t, s): If (current position in s is end of word) t.number_of_words++ return if (subtrie for current letter is NULL) create subtrie increase t.number of children insert (t.subtrie, s+1)
Assumptions/Limitations
Assumption/Limitation:
 We will assume the given string to be inserted into the trie data structure is no longer than 200 characters.
 We will also assume the given string contains only lower case and upper case letters and the words are separated by ‘ ‘ or a newline.
Major Data Structures
Major data structures:
Array of characters (string), trie
Unit tests:  
Postive Functional Test cases  
Input  Expected Output 
S = any string with lower case / upper case letters and spaces  Program terminates 
Negative Functional Test Cases  
S = any other string  Undefined 
Here is the sample answer that implements the solution described above
#include <cstdio> #include <string.h> using namespace std; char s[200]; struct trie { int noOfWordsThatEndHere, noOfChildren; trie *letters[26]; trie() { noOfWordsThatEndHere = noOfChildren = 0; memset(letters,0,sizeof(letters)); } }; void insert(trie *t,char *s) { for (int stringIndex = 0; stringIndex < strlen(s); stringIndex++) if (s[stringIndex] != ' ' && !(s[stringIndex] >= 'a' && s[stringIndex] <= 'z') && !(s[stringIndex] >= 'A' && s[stringIndex] <= 'Z')) { printf("Invalid input!\n"); return; } if(*s == '\n'  *s == ' '  strlen(s) == 1) { t>noOfWordsThatEndHere++; return; } int position = *s'a'; if(t > letters[position]==0) { t > letters[position]=new trie; t > noOfChildren ++; } insert(t > letters[position],s+1); } int main() { trie *t=new trie; strcpy(s, "test sentence for trie"); insert(t, s); return 0; }
Output and Further Excercise
Further exercise:  

Q: Write a function that searches for a word in a trie. What is its complexity? A: We would just need to start at the root and search for each subsequent letter, going along the path corresponding to the word. The complexity is O(N), where N is the length of the word we are searching for. 