I have a very simple sql select statement on a very large table, that is non-normalized. (Not my design at all, I'm just trying to optimize while simultaneously trying to convince the owners of a redesign)
Basically, the statement is like this:
SELECT FirstName, LastName, FullName, State
FROM Activity
WHERE (FirstName=@name OR LastName=@name OR FullName=@name)
AND State=@state;
Now, FirstName, LastName, FullName and State are all indexed as BTrees, but without prefix - the whole column is indexed. State column is a 2 letter state code.
What I'm finding is this:
- When @name = 'John Smith', and @state = '%' the search is really fast and yields results immediately.
- When @name = 'John Smith', and @state = 'FL' the search takes 5 minutes (and usually this means the web service times out...)
- When I remove the FirstName and LastName comparisons, and only use the FullName and State, both cases above work very quickly.
- When I replace FirstName, LastName, FullName, and State searches, but use LIKE for each search, it works fast for @name='John Smith%' and @state='%', but slow for @name='John Smith%' and @state='FL'
- When I search against 'John Sm%' and @state='FL' the search finds results immediately
- When I search against 'John Smi%' and @state='FL' the search takes 5 minutes.
Now, just to reiterate - the table is not normalized. The John Smith appears many many times, as do many other users, because there is no reference to some form of users/people table. I'm not sure how many times a single user may appear, but the table itself has 90 Million records. Again, not my design...
What I'm wondering is - though there are many many problems with this design, what is causing this specific problem.
My guess is that the index trees are just too large that it just takes a very long time traversing the them. (FirstName, LastName, FullName)
Anyway, I appreciate anyone's help with this. Like I said, I'm working on convincing them of a redesign, but in the meantime, if I someone could help me figure out what the exact problem is, that'd be fantastic.
Update
As requested, here are the details after running an Explain:
id: 1, select type: 'SIMPLE', table: 'activity', type: 'ref', possible keys: 'IDX_LastName,IDX_FirstName,IDX_FullName,IDX_State', key: 'IDX_State', key_len: '3', ref: 'const', rows: 7227364, extra: 'Using where'