views:

89

answers:

5

Hi - I have a query that works on MySQL but doesn't work on Oracle, and I'm trying to convert. This is my table:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
0              123456789        index.html  2010-01-20 15:00:00
1              123456789        info.html   2010-01-20 15:00:05
2              123456789        faq.html    2010-01-20 15:00:15
3              987654321        index.html  2010-01-20 16:00:00
4              987654321        faq.html    2010-01-20 16:00:05
5              987654321        info.html   2010-01-20 16:00:15
6              111111111        index.html  2010-01-20 16:01:00
7              111111111        faq.html    2010-01-20 16:01:05
8              111111111        info.html   2010-01-20 16:01:15

The SQL is

select http_session_id, unique_row_id, page_name, page_hit_timestamp 
from page_hits 
group by http_session_id;

On MySQL, this will return 3 rows (one for each unique http_session_id).

On Oracle, I get a "ORA-00979: not a GROUP BY expression" error. I've tried playing around with distinct too, but I can't get it to work.

Just to be clear - I would like a ResultSet that contains one row per unique http_session_id. It is preferable that the unique_row_id would be the max one (e.g. 2 for http_session_id==123456789), but this is not significant.

I'm on the verge of breaking this into multiple separate sql statements (one "select distinct http_session_id", and the other to iterate through all these and select the max(unique_row_id). Any pointers would be gratefully received - I would love to avoid this!

Rgds, Kevin.

A: 

I think that a GROUP BY requires a variable to have been used in a WHERE clause or aggregation function in the SQL standard?

Try using SELECT MAX(unique_row_id) GROUP BY http_session_id.

Borealid
+2  A: 

Would this work:

select max(unique_row_id), http_session_id
from page_hits
group by http_session_id

Incidentally; what does my sql return in your resultset for columsn that are included in the resultset but not in the group by clause? (page_name, page_hit_timestamp)

Dog Ears
Hi - yep, this does the trick. In full:select http_session_id, max(unique_row_id), max(page_name), max(page_hit_timestamp) from page_hits group by http_session_id;
Kevin
And to answer your question about what MySQL does :- it seems to return the columns from the row associated with the first matching http_session_id. I can't be certain because I haven't tested it too much - it could well return last under different conditions.
Kevin
selecting max(page_name) would give you the alphabetically highest page name is that what you want if your're just after the first page name you may get better performance with first(page_name), or even last(page_name) It just seems to me that the max may not be what you're after..?
Dog Ears
@Dog Ears - yep, I'll change to first(page_name).
Kevin
+8  A: 

The reason you encounter the ORA error is because MySQL supports non-standard GROUP BY clauses, calling it a "feature". It's documented here.

The standard SQL GROUP BY clause must include ALL columns specified in the SELECT clause, that are not wrapped in aggregate functions (LIKE COUNT, MAX/MIN, etc), to be specified in the GROUP BY clause.

If you want one, unique row per http_session_id value - look at using ROW_NUMBER:

SELECT x.*
  FROM (select http_session_id, unique_row_id, page_name, page_hit_timestamp,
               ROW_NUMBER() OVER (PARTITION BY http_session_id 
                                      ORDER BY http_session_id) AS rank
          FROM page_hits) x
 WHERE x.rank = 1
OMG Ponies
You can add an ORDER BY to the ROW_NUMBER if you want to tweak which row is returned. MySQL doesn't have analytical/windowing/rank functionality, so you can't port the query back to MySQL I'm afraid :(
OMG Ponies
Thanks for the comprehensive answer - useful to understand that MySQL's Group By is non-standard. BTW - I got a ORA-30485 error executing the select clause specified.
Kevin
I believe that select is SQL Server syntax.
ranomore
OMG Ponies
@ranomore: No, see my reply to Kevin. Oracle has supported analytic/ranking/windowing functions since 9i, while SQL Server started in v2005...
OMG Ponies
A: 

In standard SQL, if you have a GROUP BY clause, all columns that are not part of it have to be in aggregates. In MySQL, this rule was relaxed by design.

For instance, this is allowed in MySQL but not in standard SQL:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id

There's one caveat: MySQL assumes you know what you're doing. If the same customer has records in multiple countries, the query will simply grab the first country in the table, disregarding all others. Furthermore, since the order of rows is undefined, and there is no ORDER BY, you might get different results each time you run the query.

In Standard SQL, you have two choices:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id, country

or

SELECT customer_id, MIN(country), SUM(amount) FROM records GROUP BY customer_id
MapDot
A: 

Another option in Oracle, if you want:

select DISTINCT
       FIRST_VALUE(unique_row_id)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) unique_row_id,
       http_session_id,
       FIRST_VALUE(page_name)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_name,
       FIRST_VALUE(page_hit_timestamp)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_hit_timestamp
from page_hits;

This will get a distinct set of http_session_id's, and for each, returns the unique_row_id, page_name and page_hit_timestamp from the row with the greatest unique_row_id for that http_session_id, e.g.:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
2              123456789        faq.html    2010-01-20 15:00:15
5              987654321        info.html   2010-01-20 16:00:15
8              111111111        info.html   2010-01-20 16:01:15
Jeffrey Kemp