Lets say I have a FirstName > MiddleName > LastName hierarchy (~10k rows, for sake of the question). This means you could have "John > Mary-Anne > Eddy" or "Eddy > John > Jacob" row. The point being that the hierarchy makes little sense and is very foreign to the user (unlike, say, a Country > State > City structure).
Because its so unstructured and confused, I want to provide the user with an auto complete input box. As they type, it should search for possible substring matches, and when they "root" their search string at a level, it will then restrict the results to below that level.
Now, because there are lots of people named "John", it makes little sense that if they type "John" they only get back results like
- John > Allen > Alexander
- John > Allen > Burschawitz
- John > Allen ... repeat 100 times ...
Because they'll never see the unique row "Jason > John > Smith".
Instead, they should get back something like ("*" is just an arbitrary indicator to the user of "hey, lots more rows below this exist"):
- John > Allen > *
- Jason > John > Smith
- Mike > John > *
- Mary > Elena > Johnason
If they type "John > Al" then the results would be limited to anything under "John >", but should be grouped similarly to above.
I hope the explanation is clear. The requirements are a bit loose. Just reasonable ones so that a person can search through the tree and find what they are after.
Right now, I have some interesting SQL that looks for the search term in the row, figures out its position, does some substring'ing, group bys, and order by's to get the above results, but its not performing well enough.
I'm trying to solve this problem on a typical LAMP stack (except with Oracle). Its not shared hosting, so I do have full control over the server. The data changes small amounts every few weeks, and the search results can stay stale for a reasonable amount of time (e.g, a cron that updates the search index isn't out of the question).