views:

129

answers:

5

Chasing down some DB performance issues in a fairly typical EclipseLink/JPA application.

I am seeing frequent queries that are taking 25-100ms. These are simple queries, just selecting all columns from a table where its primary key is equal to a value. They shouldn't be slow.

I'm looking at the query time in the postgres log, using the log_min_duration_statement so this should eliminate any network or application overhead.

This query is not slow, but it is used very often.

Why would selecting * by primary key be slow? Is this specific to postgres or is it a generic DB issue? How can I speed this up? In general? For postgres?

Sample query from the pg log:

2010-07-28 08:19:08 PDT - LOG:  duration: 61.405 ms  statement: EXECUTE <unnamed>  [PREPARE:  SELECT coded_ele
ment_key, code_system, code_system_label, description, label, code, concept_key, alternate_code_key FROM coded
_element WHERE (coded_element_key = $1)]

Table has around 3.5 million rows.

I have also run EXPLAIN and EXPLAIN ANALYZE on this query, its only doing an index scan.

+2  A: 

Could you be getting some kind of locking contention? What kind of locks are you taking when performing these queries?

djna
I'm not sure. Is there some way to get PG or JPA to tell me when a lock is issued?
Freiheit
+2  A: 

Well, I don't know much about postgres SQL, so I'll give you a tip for MS SQL Server which might be applicable.

MS SQL Server has the concept of a "cluster index" which is the physical layout of the data on the disk. It's good to use on field where you'll be seeking a range between to values (date fields mostly). It's not much use if you're looking for a exact value (like a primary key lookup). However, sometimes the primary key index is inadvertantly set as a clustered index. This makes an index lookup into a table scan.

James Curran
How does using a clustered index for a primary key turn an index lookup on primary key into a table scan?
Shannon Severance
Because the table is the index. Therefore "Index scan" == "Table Scan"
James Curran
@James Curran: But and index lookup is not going to be an index scan, unless I have my terminology screwed up.
Shannon Severance
"Index lookup" = "scan index, use info to jump directly to location in table"
James Curran
+3  A: 

Select * makes your database work harder, and as a general rule, is a bad practice. There are tons of questions/answers on stackoverflow talking about that.

have you tried replacing * with the field names?

dave
JPA generates queries as you describe with the field names. I described it as "Select *" for brevity.
Freiheit
Ah -- you didn't deserve that then :-) There's a million other things that can affect permrmance... compacting tables or rebuilding indexes can help, setting index padding to be more favorable to reading than writing, partitioning (is it on a slower drive), computed columns, changing the field-order in the SELECT (why??? I still don't know), etc.
dave
Accepted mostly for the followup comment rather than the initial answer. Gave me a list of things to understand and examine to speed the query up.
Freiheit
I'll be interested to know what solutions you eventually discover/implement that help the most.
dave
+1  A: 

select * is almost always a very very bad idea.

  1. If the order of the fields changes, it will break your code. According to comments, this isn't really important given the abstraction library you're using.
  2. You're probably returning more data from the table than you actually want. Selecting for the specific fields you want can save transfer time.

25ms is about the lower bound you're going to see on almost any kind of SQL query -- that's only two disk accesses! You might want to look into ways to reduce the number of times the query is run rather than trying to optimize the query.

Billy ONeal
I really wish people would freaking COMMENT when they downvote!
Billy ONeal
@Billy, you are wrong in two accounts: 1. Field order does not change unless you drop the table. 25ms for PK is very long time. That was not the reason for the downvote, however. You did not answer the question, which was how to solve the issue, not how many random disk accesses you get in 25ms.
jmz
@jmz: ALTER can also change the field order. As for the question, I believe it was `Why would selecting * by primary key be slow?`, followed by `How can I speed this up?`, in which case I fail to see how this does not answer the question.
Billy ONeal
@Pascal: I do not have experience with JPA; only with RDBMSs in general. If you can point me to how JPA changes the game here, I will delete the answer.
Billy ONeal
@Billy Apologies, I really wanted to removed my comment, it was just a useless one. Looks like something went wrong during deletion (probably me). Anyway, to answer your question, a JPQL query `SELECT * FROM MyEntity` will be translated into a `SELECT somealias.field1, ..., somealias.fieldN from MyEntityTable somealias` SQL query so the order of fields is actually not an issue at all (first bullet point). Regarding the second one, I wonder if it applies since the OP is using `log_min_duration_statement`. This is what I should have written as first comment.
Pascal Thivent
@Pascal: Ah. When I say transfer time, I mean "from the disk", not necessarily network transfer or anything like that. But I'm not positive on the mechanics of that specific setting.
Billy ONeal
All log_min_duration_statement does is that it only logs SQL statements that take longer than its value. So if I set it to 50 I only see statements that take more than 50ms . 0 shows all statements.
Freiheit
+1  A: 

The the row unusually large or contain BLOBs and large binary fields?

Is this directly through console or is this query being run through some data access API like jdbc or ADO.NET? You mention JPA that looks like a data access API. For short queries, data access API become a larger percent of execution time-- creating the command, creating objects to hold the rows and cells, etc.

MatthewMartin
These query speeds are being recorded from the postgres logs, so it is actual query time and not the data API (JPA) time.
Freiheit