- What is a trie:You've probably already seen kinds of trees that store things more efficiently, such as a binary search tree. Here, we will examine another variant of a tree, called a trie.
Aside: The name trie comes from its use for retrieval. It is pronounced like "try" by some, like "tree" (the pronunciation of "trie" in "retrieval") by others. Here, we will discuss a particular implementation of a trie, which may be somewhat different than how it is described elsewhere.
We use a trie to store pieces of data that have a key (used to identify the data) and possibly a value (which holds any additional data associated with the key).
Here, we will use data whose keys are strings.
Suppose we want to store a bunch of name/age pairs for a set of people (we'll consider names to be a single string here).
Here are some pairs:
amy 56 ann 15 emma 30 rob 27 roger 52
Let's start off with amy. We'll build a tree with each character in her name in a separate node. There will also be one node under the last character in her name (i.e., under y). In this final node, we'll put the nul character (\0
) to represent the end of the name. This last node is also a good place to store the age for amy.
. <- level 0 (root) | a <- level 1 | m <- level 2 | y <- level 3 | \0 56 <- level 4
Now, when we go to add ann, we do the same thing; however, we already have stored the letter a at level 1, so we don't need to store it again, we just reuse that node with a as the first character. Under a (at level 1), however, there is only a second character of m...But, since ann has a second character of n, we'll have to add a new branch for the rest of ann, giving:
. | a / \ m n | | y n | | \0 56 \0 15
Note: Again, ann's data (an age of 15) is stored in her last node.
Now, let's add emma. Remember e is the first character and should go at level 1. Since there is no node with character e at level 1, we'll have to add it. In addition, we'll have to add nodes for all the other characters of emma under the e. The first m will be a child of the e, the next m will be below the first m, etc., giving:
. / \ a e / \ | m n m | | | y n m | | | \0 56 \0 15 a | \0 30
. / | \ a e r / \ | | m n m o | | | / \ y n m b g | | | | | \0 56 \0 15 a \0 27 e | | \0 30 r | \0 52
To better understand how a trie works, answer the following questions.
- What would the trie look like if we now added anne with age 67? How about ro with age 23?
- Would the trie look different if we added the names in a different order, say: rob, ann, emma, roger, amy?
- Is this a binary tree, tertiary tree or what? In other words, each node has at most how many children?
- Trie operations:Here are the operations that we will concern ourselves with for this trie. You may need others for a particular use of the trie.
Add:
We've already given examples of adding.
IsMember:
See if data with a certain string key is in the trie.
For example, should report a true value and andIsMember(trie, "amy")
should report a false value.IsMember(trie, "anna")
We can imagine other variations where we do something with the value (like return it) once we find something with the matching key.
Remove:
Remove something from the trie, given its key.
We may want more operations depending on how we'll use the trie.
Since our trie holds data with string keys, which of the operations need a key and value, and which just need keys?
- IsMember algorithm:Remember that a trie is a special kind of tree. Since a trie organizes its data via the keys (as specified above), it is easy to find whether a particular key is present.
Finding a key can be done with iteration (looping).
Here is an outline of such an algorithm. It looks in a particular trie and determines whether data with a particular string key is present.
IsMember(trie, key)
[iterative]
1. Search top level for node that matches first character in key 2. If none, return false Else, 3. If the matched character is
\0
? return true Else, 4. Move to subtrie that matched this character 5. Advance to next character in key* 6. Go to step 1
* I.e., the new search key becomes the old one without its first character.
The algorithm moves down the tree (to a subtree) at step 6. Thus, the top level in step 1 actually may refer to any level in the tree depending on what subtree the algorithm is currently at.
- Trie implementation:Now, let's think about how to actually implement a trie of name/age pairs in C.
As usual, we'll put the data structure in its own module by producing the source filestrie.h
andtrie.c
.
The functions needed for our trie are the operations we mentioned:
TrieAdd() TrieIsMember() TrieRemove()
TrieCreate() TrieDestroy()
Friday, October 14, 2016
TRIES Data Structure
Labels:
algorithms,
Data Structure,
Heap,
Stack,
Tries
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment