ansaurus

Question

Answer 1

+2 A:

It means that it can't use any index (or similar method) to make the join efficient, so as a last resort is materializes the result from one of the tables to have a smaller set to work with when joining against the other table.

Guffa 2010-06-12 19:51:50

That's only partially correct. In many (most) cases it will *not* put a materialize node in just because it can't use an index - it will just work off a sequential scan on each table. Materializing the output of a seqscan doesn't make a huge difference - I find it more common to show up higher up in the tree than just above a scan node.

Magnus Hagander 2010-06-13 10:15:45

@Magnus: Yes, if there is any other method to join the tables, it will use it, but in this case there obviously isn't, and it will put in the materialize as the last option.

Guffa 2010-06-13 10:46:20

No, it won't do that as the last option. The last option is to just do the sequential scan and not materialize at all. It does it because it can drive a smarter plan that way, and doesn't have to fall back to the last-resort.

Magnus Hagander 2010-06-14 10:45:59

@Magnus: The materialize is always inserted as the last option: "If the inner input set of a Merge Join operator is not produced by a Seq Scan, an Index Scan, a Sort, or a Materialize operator, the planner/optimizer will insert a Materialize operator into the plan." http://www.iphelp.ru/faq/15/ch04lev1sec3.html

Guffa 2010-06-14 12:06:20

@Guffa: That's only true if the join is a mergejoin. Which is not likely to be chosen as a last-resort plan. It can happen as last-resort, but not with a seq-scan (as we have here) - which the text you're quoting actually tells you. BTW, The version of the book you're referring to is for ancient versions of PostgreSQL - I'd check up at least twice on most of the things you find in there.

Magnus Hagander 2010-06-14 17:48:13

@Guffa: My understanding is that a Materialise node is inserted only if (a) there is a cheap subplan that would need to scan its input rows more than once (such as the inner side of a plain ol' nested loop join) and (b) it would be expensive to regenerate those rows (such as if the input comes from a sequential scan of a large table but is expected to produce a small number of rows).

j_random_hacker 2010-06-15 04:48:18

Answer 2

+4 A:

A materialize node means the output of whatever is below it in the tree (which can be a scan, or a full set of joins or something like that) is materalized into memory before the upper node is executed. This is usually done when the outer node needs a source that it can re-scan for some reason or other.

So in your case, the planner is determining that the result of a scan on one of your tables will fit in memory, and it till make it possible to choose an upper join operation that requires rescans while still being cheaper.

Magnus Hagander 2010-06-13 10:14:16

I like this answer better - there's more reason to materializing than just not using an index.

rfusca 2010-06-14 04:36:37

An easy way to explain it is that it stores the intermediate result.

Grant Johnson 2010-06-17 18:25:25

ansaurus

tags:

views:

answers:

SQL explain plan: what is Materialize?

related questions