tags:

views:

67

answers:

1

Hey everyone,

I'm pretty new to the SQL and need some help with a certain task. I have a query that is called by our Flex/Java code that joins multiple tables to get information. Upon running an Explain Analyze, I've seen that the query takes over 15 minutes which sometimes even times out if the site is under heavy traffic. What I am trying to do is to create a rule, or a trigger, that sees if the query is called and selects the data from the materialized view table instead. I tried implementing a rule for this, but was having a hard time when I received errors when trying to join multiple tables in the rule. Basically is there a way, whether it be a rule, trigger, or anything else I have not looked into, for me to see if a certain query gets executed that I can replace that query with one from another table? Thanks in advance.

Example of the query:

select
   player_id,
   player_names,
   player_level,
   current_location
from
   server_info
   join players using (player_id)
   join locations using (location_id)
where
   current_location = 'Central Hub'
+2  A: 

Why not just change the app to source the materialized view if that is what you want? There is no way to rewrite a complex join to source an unrelated table -- that I know of. And, if you could do you really want to have deep voodoo like that in your query planner/rule system? SQL VIEWs are implemented with a SELECT RULE, but that doesn't get you from complex-query to materialized view. It gets you from TABLE1, to TABLE2.

A better question would be what are the types on player_id, and location_id, are they both int and do they both have btree indexes? Is current_location indexed, and is it being used for equality, regex, or LIKE conditionals? What version of Postgresql are you using? Can you attach an EXPLAIN ANALYZE of the query that causes the massive wait time?

Evan Carroll
the player_id and location_id are both int, they do not have btree indexes, and the location_id is indexed. The query does use equality conditions. The server uses version 8.3 and I just tried running an explain analyze and it timed out.Pretty much we are trying to find a way to go from complex-query to materialized view, but we're trying to do it from the database itself since we can't rebuild the app due to corrupted files.
Seb
You have to have `btree` indexes in `player_id` and `location_id` **IN BOTH TABLES**, or your query will be slow as dirt. Go into psql and attach the output of `\d server_info`, `\d players`, and `\d locations` (cut out the table description above the word *Indexes:*), also USING is an quality test (shorthand), and so is `current_location = 'Central Hub'` from your example, so please re-answer, *Is current_location indexed, and is it being used for equality, regex, or LIKE conditionals?*
Evan Carroll
The locations table has the location_id, which is an indexed int, the current_location, which is just a text field that shows the town name, and a map, which is another text field used for the application to specify which map file to load. From what you've told me it sounds as if the application is running queries that use equality conditionals. I can't attach the output since I have no direct access to PostgreSQL. I only have access to the application and the query logs generated from the application. I do apologize for not being able to help out much.
Seb
Definitely need the indexes there. I bet that fixes 99% of the issue.
rfusca
@Seb `USING (a)` is just a shorthand, ie. `SELECT lhs.a FROM lhs JOIN rhs ON (lhs.a = rhs.a)` is the same as `SELECT a FROM lhs JOIN rhs USING (a)`. You're testing when two things are equal you need to have indexes on both columns in the equality to avoid the seq scan. Just add the index, `analzye` the tree tables, and try again.. Better yet, just `vacuum analyze` the three tables.
Evan Carroll
@Evan: I added in the indexes but the vacuum analyze won't happen until tonight 8pm EST. We don't want to cause any problems with the users on the server right now.
Seb
@Seb, Cool, if you're going to run it over night, edit `postgresql.conf`, set the `default_statistics_target` to `1000` run an unqualified `vacuum full analyze`. You can also probably raise `maintenance_work_mem` to something more reasonable like (256MB, and it will finish much faster tonight. -- I usually drastically raise maintenance, work, sharred, and max_stack_depth, for dedicated sql servers of any magnitude the default values are way to low, I bought mine at ofc. max 2 years ago for 600.00
Evan Carroll
@Evan: We ran the vacuum analyze last night after adding the indexes and changing the query to point at the IDs instead of the actual title and I must say that we saw a good increase in speed. We're still going to redo most of our database since we're transferring to newer servers. Thank you for all your help.
Seb
@Seb, Right, well that db schema, and query looks pretty good from this vantage point. I'd suggest you ask questions until a more experienced DB user inspires change. It wouldn't be any good if you moved to a different worse schema because of a misunderstanding with indexes or DB tuning. Personally, I'd be shocked if you didn't just pick up a 20 times increase in speed.
Evan Carroll