I recently ran into some odd behaviour with a large query in Oracle, where changing one thing made a query that used to take 10 minutes take about 3 hours.
To briefly summarise, I store a lot of coordinates in the database, with each coordinate having a probability. I then want to 'bin' these coordinates into 50 metre bins (basically round each coordinate down to the nearest 50 metres) and sum the probabilities in each bin.
To do this, part of the query is 'select x,y,sum(probability) from .... group by x,y'
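For illustration, a minimal sketch of what such a binning query could look like. The table name points and the assumption that x and y are numeric columns in metres are hypothetical, and the rest of the real query is left out:

```sql
-- Hypothetical table: points(x NUMBER, y NUMBER, probability NUMBER)
-- FLOOR(v / 50) * 50 rounds v down to the nearest multiple of 50,
-- so every coordinate within the same 50 m square gets the same key.
SELECT FLOOR(x / 50) * 50 AS bin_x,
       FLOOR(y / 50) * 50 AS bin_y,
       SUM(probability)   AS total_probability
FROM   points
GROUP  BY FLOOR(x / 50) * 50,
          FLOOR(y / 50) * 50;
```

With this form, a point at (1234, 5678) would fall into the bin (1200, 5650).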
Initially I was storing a large number of points, each with a probability of 0.1, and queries were running reasonably well, taking about 10 minutes each.
Then I had a request to change how the probabilities were calculated to adjust the distribution, so rather than all of them being 0.1, they were different values (e.g. 0.03, 0.06, 0.12, 0.3, 0.12, 0.06, 0.03). Running exactly the same query then took about 3 hours.
Changing back to all 0.1 brought the query time back down to 10 minutes.
Looking at the query plan and the performance of the system, it looked like the problem was with the HASH GROUP BY operation that Oracle uses to speed up grouping. I'm guessing that it was creating hash entries for each unique x,y,probability value and then summing the probability for each unique x,y value.
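For anyone wanting to check the same thing, the plan can be displayed roughly as follows. This is only a sketch against a hypothetical table called points, not the actual query; the operation to look for in the output is HASH GROUP BY (as opposed to SORT GROUP BY):

```sql
-- Ask the optimizer for the plan without running the query.
-- The table "points" and its columns are hypothetical stand-ins.
EXPLAIN PLAN FOR
SELECT FLOOR(x / 50) * 50 AS bin_x,
       FLOOR(y / 50) * 50 AS bin_y,
       SUM(probability)   AS total_probability
FROM   points
GROUP  BY FLOOR(x / 50) * 50,
          FLOOR(y / 50) * 50;

-- Show the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```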
Can anyone explain this behaviour any better?
Additional Info
Thanks for the answers. They allowed me to verify what was going on. I'm currently running a query and the tempseg_size from v$sql_workarea_active is currently at 7502561280 bytes (around 7 GB) and growing rapidly.
Given that the development server I'm running on only has 8 GB of RAM, it looks like the grouping can't be done in memory and the query is spilling to temporary segments on disk.
I've managed to work around this for now by changing the types of queries and precalculating some of the information.
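A sketch of the kind of query that shows this figure while the statement is still running. It assumes the session has SELECT access to the V$ views; the particular choice of columns here is just an illustration:

```sql
-- One row per work area currently in use (sorts, hash joins, group by, ...).
-- tempseg_size is the size in bytes of the temporary segment used by the
-- work area; it is NULL while the work area still fits in memory.
-- number_passes: 0 = done in memory, 1 = one pass over temp, >1 = multi-pass.
SELECT sid,
       sql_id,
       operation_type,
       actual_mem_used,
       max_mem_used,
       number_passes,
       tempseg_size
FROM   v$sql_workarea_active
ORDER  BY tempseg_size DESC NULLS LAST;
```

Re-running it every few seconds shows whether tempseg_size is still growing.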