c++ - Prefix search in a radix tree/patricia trie
I am currently implementing a radix tree / Patricia trie (whatever you want to call it). I want to use it for prefix searches in a dictionary on a weak piece of hardware. It should work more or less like auto-completion.
My implementation is based on an article whose code does not include prefix search, though the author says:
[...] Say you want to enumerate all the nodes whose keys have the common prefix "AB". You can perform a depth-first search starting at that root, stopping whenever you encounter back edges.
But I do not see how that is supposed to work. For example, if I build a radix tree from these words:
disease
fantasy
fantasy / imagery
imagery
instant
instant
in
I will get the exact same "best match" for the prefixes "i" and "in", so it seems hard to gather all matching words just by traversing the tree from that best match.
In addition, there is a Java radix tree implementation that does include a prefix search. That code explicitly checks all the nodes (starting from a certain node) for a prefix match; it actually compares bytes.
Can anyone point me to a detailed description of how to implement prefix search on radix trees? Is the algorithm used in the Java implementation the only way to do it?
Think about what your trie encodes. At each node, you have the path that led you to that node. So in your example you start at Λ (that is a capital lambda; this Greek font is pretty bad), the root node, which corresponds to the empty string. Λ has a child for each first letter in use, and for your prefix you take the branch for "i":
- Λ → "i"
At the "i" node there are two children, one for "m" and one for "n". The next letter is "n", so you take that branch:
- Λ → "i" → "n"
Since the only word in your data set that starts with "i", "n" is "in", the "n" node has no children. That is a match.
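The walk above, followed by the depth-first enumeration the question quotes, could be sketched in C++17 roughly as follows. This is a plain one-character-per-edge trie, not a compressed radix tree, and the names `TrieNode`, `insert`, `collect` and `with_prefix` are made up for illustration; they do not come from the article or the Java implementation mentioned in the question.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: one character per edge, as in the walk above.
struct TrieNode {
    bool is_word = false;  // true if a stored word ends at this node
    std::map<char, std::unique_ptr<TrieNode>> children;  // ordered, so output is sorted
};

void insert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        auto& child = node->children[c];
        if (!child) child = std::make_unique<TrieNode>();
        node = child.get();
    }
    node->is_word = true;
}

// Depth-first walk below `node`; `path` is the string spelled out so far.
void collect(const TrieNode& node, std::string& path, std::vector<std::string>& out) {
    if (node.is_word) out.push_back(path);
    for (const auto& kv : node.children) {
        path.push_back(kv.first);
        collect(*kv.second, path, out);
        path.pop_back();
    }
}

std::vector<std::string> with_prefix(const TrieNode& root, const std::string& prefix) {
    const TrieNode* node = &root;
    for (char c : prefix) {
        auto it = node->children.find(c);
        if (it == node->children.end()) return {};  // no such branch: no match
        node = it->second.get();
    }
    // Everything below this node starts with `prefix`.
    std::vector<std::string> out;
    std::string path = prefix;
    collect(*node, path, out);
    return out;
}
```

Once the descent has consumed the whole prefix, no further string comparison is needed: every word in that subtree matches by construction.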
Now, suppose that instead of "in", the data set contained "infantibulum". (The reference is left as an exercise.) You would still get to the "n" node in the same way, but then if the next letter you get is "q", you know the word does not appear in your data set at all, because there is no "q" branch. At that point you say "okay, no match." (Maybe you then start adding the word, depending on the application.)
But if the next letter is "f", then you can go on. You can short-circuit this with a little craft, though: once you reach a node that represents a unique path, you can hang the whole remaining string off that node. When you get to that node, you know that the rest of the string after "in" must be "fantibulum", so you have used the prefix to match the whole string, and you return it.
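One way to read "hang the whole string off that node" is as the edge labels of a radix tree, where a run of single-child nodes is merged into one multi-character label. The C++17 sketch below is written under that assumption; `RadixNode`, `Edge` and the function names are hypothetical. The point relevant to the question is in `with_prefix`: the prefix may run out in the middle of a label, so only the overlapping part is compared, and the subtree below that edge is then enumerated without further comparisons.

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct RadixNode {
    struct Edge {
        std::string label;                 // one or more characters
        std::unique_ptr<RadixNode> child;
    };
    bool is_word = false;
    std::map<char, Edge> edges;            // keyed by the first character of the label
};

void insert(RadixNode& root, const std::string& word) {
    RadixNode* node = &root;
    std::size_t pos = 0;
    while (pos < word.size()) {
        auto it = node->edges.find(word[pos]);
        if (it == node->edges.end()) {
            // Unique path from here on: hang the whole remainder off this node.
            auto leaf = std::make_unique<RadixNode>();
            leaf->is_word = true;
            node->edges.emplace(word[pos], RadixNode::Edge{word.substr(pos), std::move(leaf)});
            return;
        }
        RadixNode::Edge& e = it->second;
        std::size_t k = 0;  // length of the part shared by the label and the rest of the word
        while (k < e.label.size() && pos + k < word.size() && e.label[k] == word[pos + k]) ++k;
        if (k < e.label.size()) {
            // The word leaves the label part-way through: split the edge at k.
            auto mid = std::make_unique<RadixNode>();
            mid->edges.emplace(e.label[k], RadixNode::Edge{e.label.substr(k), std::move(e.child)});
            e.label.resize(k);
            e.child = std::move(mid);
        }
        node = e.child.get();
        pos += k;
    }
    node->is_word = true;
}

void collect(const RadixNode& node, std::string& path, std::vector<std::string>& out) {
    if (node.is_word) out.push_back(path);
    for (const auto& kv : node.edges) {
        const std::string& label = kv.second.label;
        path += label;
        collect(*kv.second.child, path, out);
        path.resize(path.size() - label.size());
    }
}

std::vector<std::string> with_prefix(const RadixNode& root, const std::string& prefix) {
    const RadixNode* node = &root;
    std::string path;
    std::size_t pos = 0;
    while (pos < prefix.size()) {
        auto it = node->edges.find(prefix[pos]);
        if (it == node->edges.end()) return {};  // no such branch: no match
        const std::string& label = it->second.label;
        // Compare only the overlap; the prefix may end inside this label.
        std::size_t n = std::min(prefix.size() - pos, label.size());
        if (label.compare(0, n, prefix, pos, n) != 0) return {};
        path += label;  // every word below this edge starts with the full label
        pos += n;
        node = it->second.child.get();
    }
    std::vector<std::string> out;
    collect(*node, path, out);
    return out;
}
```

With "in" and "infantibulum" stored, the prefix "inf" ends one character into the label "fantibulum", and the search returns the whole word from that single edge.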
How would you use it? In many non-UNIX command interpreters, like the old VAX DCL, you could use any unique prefix of a command. The equivalent of ls(1) was DIRECTORY, but no other command started with DIR, so you could type DIR and that was as good as typing the whole word. If you could not remember the correct command, you could type just 'D' and hit (I think) ESC; the DCL CLI would return all the commands that started with D, which it could look up very fast.
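The unique-prefix behaviour described here can be illustrated without a trie at all. The C++17 sketch below swaps in a sorted `std::set` and `lower_bound`, which gives the same answers for a small command table. It is not how DCL was implemented, and the function names and the command table in the usage note are made up for illustration.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// All commands that start with `prefix`, in sorted order.
std::vector<std::string> completions(const std::set<std::string>& commands,
                                     const std::string& prefix) {
    std::vector<std::string> out;
    // Strings sharing a prefix form one contiguous run in sorted order,
    // and that run starts at the first element not less than the prefix.
    for (auto it = commands.lower_bound(prefix);
         it != commands.end() && it->compare(0, prefix.size(), prefix) == 0; ++it) {
        out.push_back(*it);
    }
    return out;
}

// The full command if `typed` is an exact name or an unambiguous prefix.
std::optional<std::string> resolve(const std::set<std::string>& commands,
                                   const std::string& typed) {
    if (commands.count(typed) != 0) return typed;
    std::vector<std::string> matches = completions(commands, typed);
    if (matches.size() == 1) return matches.front();
    return std::nullopt;  // unknown or ambiguous
}
```

With an illustrative table of COPY, DEFINE, DELETE and DIRECTORY, `resolve` maps "DIR" to DIRECTORY, while "D" is ambiguous and `completions` lists DEFINE, DELETE and DIRECTORY instead.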