tags:

views:

1503

answers:

5

MySQL's explain output is pretty straightforward. PostgreSQL's is a little more complicated. I haven't been able to find a good resource that explains it either.

Can you describe what exactly explain is saying or at least point me in the direction of a good resource?

+6  A: 

It executes from most indented to least indented, and I believe from the bottom of the plan to the top. (So if there are two indented sections, the one farther down the page executes first, then when they meet the other executes, then the rule joining them executes.)

The idea is that at each step there are 1 or 2 datasets that arrive and get processed by some rule. If just one dataset, that operation is done to that data set. (For instance scan an index to figure out what rows you want, filter a dataset, or sort it.) If two, the two datasets are the two things that are indented further, and they are joined by the rule you see. The meaning of most of the rules can be reasonably easily guessed (particularly if you have read a bunch of explain plans before), however you can try to verify individual items either by looking in the documentation or (easier) by just throwing the phrase into Google along with a few keywords like EXPLAIN.

This is obviously not a full explanation, but it provides enough context that you can usually figure out whatever you want. For example consider this plan from an actual database:

v3orders=> explain analyze select a.attributeid, a.attributevalue, b.productid from orderitemattribute a, orderitem b where a.orderid = b.orderid and a.attributeid = 'display-album' and b.productid = 'ModernBook';



------------------------------------------------------------------------------------------------------------------------------------------------------------

 Merge Join  (cost=125379.14..125775.12 rows=3311 width=29) (actual time=841.478..841.478 rows=0 loops=1)

   Merge Cond: (a.orderid = b.orderid)

   ->  Sort  (cost=109737.32..109881.89 rows=57828 width=23) (actual time=736.163..774.475 rows=16815 loops=1)

         Sort Key: a.orderid

         Sort Method:  quicksort  Memory: 1695kB

         ->  Bitmap Heap Scan on orderitemattribute a  (cost=1286.88..105163.27 rows=57828 width=23) (actual time=41.536..612.731 rows=16815 loops=1)

               Recheck Cond: ((attributeid)::text = 'display-album'::text)

               ->  Bitmap Index Scan on (cost=0.00..1272.43 rows=57828 width=0) (actual time=25.033..25.033 rows=16815 loops=1)

                     Index Cond: ((attributeid)::text = 'display-album'::text)

   ->  Sort  (cost=15641.81..15678.73 rows=14769 width=14) (actual time=14.471..16.898 rows=1109 loops=1)

         Sort Key: b.orderid

         Sort Method:  quicksort  Memory: 76kB

         ->  Bitmap Heap Scan on orderitem b  (cost=310.96..14619.03 rows=14769 width=14) (actual time=1.865..8.480 rows=1114 loops=1)

               Recheck Cond: ((productid)::text = 'ModernBook'::text)

               ->  Bitmap Index Scan on id_orderitem_productid  (cost=0.00..307.27 rows=14769 width=0) (actual time=1.431..1.431 rows=1114 loops=1)

                     Index Cond: ((productid)::text = 'ModernBook'::text)

 Total runtime: 842.134 ms

(17 rows)

Try reading it for yourself and see if it makes sense.

What I read is that the database first scans the id_orderitem_productid index, using that to find the rows it wants from orderitem, then sorts that dataset using a quicksort (the sort used will change if data doesn't fit in RAM), then sets that aside.

Next it scans orditematt_attributeid_idx to find the rows it wants from orderitemattribute and then sorts that dataset using a quicksort.

It then takes the two datasets and merges them. (A merge join is a sort of "zipping" operation where it walks the two sorted datasets in parallel, emitting the joined row when they match.)

As I said, you work through the plan inner part to outer part, bottom to top.

+4  A: 

This document could help too.

Milen A. Radev
I found this document very helpful.
KJG
+2  A: 

PgAdmin3 will show you a graphical representation of the explain plan. Switching back and forth between the two can really help you understand what the text representation means. However, if you just want to know what it is going todo, you may be able to just always use the GUI.

Grant Johnson
+2  A: 

There is an online helper tool available too, explain-analyze.info, which will highlight where the expensive parts of the analysis results are.

I copied and pasted the explain analyze results from bentilly's answer, and it highlighted all rows as having mismatches between estimated and actual rows, and recommended analyzing the database.

Depesz also has one, here's the same results, which to me make it clearer where the problem is.

Stephen Denne
+1  A: 

If you install pgadmin, there's an Explain button that as well as giving the text output draws diagrams of what's happening, showing the filters, sorts and sub-set merges that I find really useful to see what's happening.

Greg