views:

232

answers:

3

Hello,

First time posting here, but an avid reader. I am experiancing slow query times on my database (all tested locally thus far) and not sure how to go about it. The database itself has 44 tables and some of them tables have over 1 Million records (mainly the movies, actresses and actors tables). The table is made via JMDB using the flat files on IMDB. Also the SQL query that I am about to show is from that said program (that too experiances very slow search times). I have tried to include as much information as I can, such as the explain plan etc.

"QUERY PLAN"
"HashAggregate (cost=46492.52..46493.50 rows=98 width=46)"
" Output: public.movies.title, public.movies.movieid, public.movies.year"
" -> Append (cost=39094.17..46491.79 rows=98 width=46)"
" -> HashAggregate (cost=39094.17..39094.87 rows=70 width=46)"
" Output: public.movies.title, public.movies.movieid, public.movies.year"
" -> Seq Scan on movies (cost=0.00..39093.65 rows=70 width=46)"
" Output: public.movies.title, public.movies.movieid, public.movies.year"
" Filter: (((title)::text ~~* '%Babe%'::text) AND ((title)::text !~~* '""%}'::text))"
" -> Nested Loop (cost=0.00..7395.94 rows=28 width=46)"
" Output: public.movies.title, public.movies.movieid, public.movies.year"
" -> Seq Scan on akatitles (cost=0.00..7159.24 rows=28 width=4)"
" Output: akatitles.movieid, akatitles.language, akatitles.title, " Filter: (((title)::text ~~* '%Babe%'::text) AND ((title)::text !~~* '""%}'::text))"
" -> Index Scan using movies_pkey on movies (cost=0.00..8.44 rows=1 width=46)"
" Output: public.movies.movieid, public.movies.title, public.movies.year, public.movies.imdbid" " Index Cond: (public.movies.movieid = akatitles.movieid)"

SELECT * FROM ((SELECT DISTINCT title, movieid, year FROM movies WHERE title ILIKE '%Babe%' AND NOT (title ILIKE '"%}')) UNION (SELECT movies.title, movies.movieid, movies.year FROM movies INNER JOIN akatitles ON movies.movieid=akatitles.movieid WHERE akatitles.title ILIKE '%Babe%' AND NOT (akatitles.title ILIKE '"%}'))) AS union_tmp2;

Returns 612 Rows in 9078ms
Database backup (plain text) is 1.61GB

It's a really complex query and I am not fully cognizant on it, like I said it was spat out by JMDB. Do you have any suggestions on how I can increase the speed ?

Regards
Anthoni

+1  A: 

This is your problem:

" -> Seq Scan on movies (cost=0.00..39093.65 rows=70 width=46)"
" Output: public.movies.title, public.movies.movieid, public.movies.year"
" Filter: (((title)::text ~~* '%Babe%'::text) AND ((title)::text !~~* '""%}'::text))"

A sequential scan and huge costs because the database can't use any index on '%Babe%'. Take a look at full text search, than you can create a proper index and let the queryplanner use it.

Frank Heikens
+1  A: 

A query that uses double-ended wild-cards (e.g. '%Babe%') cannot leverage any index, so the table will result in a sequential scan instead of an index scan.

If you were searching for 'Babe%', then your index should work.

Tim Drisdelle
A: 

Hi,

OK really sorry, but I can not find the edit link anywhere to allow me to reply to your answers, so I am having to put it here.

So it looks like I need to do FullText Searching / Indexing, but I am relatively new to all this. Is this complex to do and will it take long ?

Regards Anthoni

Anthoni Gardner