views:

403

answers:

17

My company has me working on finishing a back end for Oracle for a Python ORM. I'm amazed at how much differently RDBMSes do things even for the simple stuff. I've learned a lot about the differences between Oracle and other RDBMSes. Just out of sheer curiosity, I'd like to learn more.

What are some common "gotchas" in terms of porting SQL from one platform to another?

Please, only one gotcha per answer.

+3  A: 

Oracle does not allow a select statement without a FROM clause. Therefore, you cannot do queries like this:

SELECT 1

Instead, you have to say that the query is from the DUAL table:

SELECT 1 FROM DUAL
Jason Baker
Does Oracle allow WHERE 1 or does it require WHERE 1=1?
jmucchiello
@jmucchiello: no, it doesn't. Neither it has BOOLEAN datatype in SQL (but does have in PL/SQL).
Quassnoi
A: 

Different databases handle binary data a bit differently. So for instance, this will work under MySQL:

mysql> CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t SET c = 'z';
Query OK, 1 row affected (0.01 sec)

However, Oracle relies on these values being hex:

SQL> CREATE TABLE t (c RAW(3));

Table created.

SQL> INSERT INTO t VALUES ('z');
INSERT INTO t VALUES ('z')
                 *
ERROR at line 1:
ORA-01465: invalid hex number

Instead, we have to convert it to hex:

SQL> INSERT INTO t VALUES (rawtohex('z'));

1 row created.
Jason Baker
+2  A: 

Another example is generating unique (typically surrogate) primary keys.

Many databases, such as SQL Server and sqlite allow a column to be declared as an identity: typically, if the value for this column is missing on insert, the database will generate a unique value for the column.

Oracle, by contrast, has you create a sequence separate to the table and then use nextval on the sequence to generate the next value:

CREATE SEQUENCE test_seq;
SELECT test_seq.nextval FROM dual;

Or, more typically:

INSERT INTO foo(id, title) VALUES (test_seq.nextval, 'bar');
Dan Bennett
PostgreSQL supports the SERIAL data type. As far as I know, this is implemented through a sequence. MySQL has the AUTOINCREMENT attribute for a column as well as a compatibility shim for SERIAL (which becomes a BIGINT AUTOINCRIMENT column).
R. Bemrose
+2  A: 

What are some common "gotchas" in terms of porting SQL from one platform to another?

Same as attempts to translate from English to Russian by substituting the dictionary words right into the phrases.

Works for hello and goodbye, but fails for Mary had a little lamb to say nothing of Shakespeare.

Different RDBMS's have different cultures, despite having SQL in their name.

For instance, the row limiting.

In Oracle:

WHERE rownum = 1

In SQL Server:

SELECT TOP 1

In MySQL and PostgreSQL:

LIMIT 1

In DB2:

SELECT * ... FETCH FIRST 1 ROW ONLY

Four different clauses.

Quassnoi
Especially since MySQL's limit clause also lets you specify a start row (e.g. LIMIT 20, 10) which makes pagination really easy to do, but is a pain on SQL server.
Eric Petroelje
And don't forget ROWNUM doesn't work like you might think. WHERE ROWNUM > 1 will cause Oracle to return an empty set since it numbers the rows LAST so no row is ever number 1 so no row can ever be greater than 1.
jmucchiello
@jmucchiello: http://explainextended.com/2009/05/06/oracle-row_number-vs-rownum/
Quassnoi
That's what I said. That's for the link about using row_number().
jmucchiello
@jumucchiello: it was a link to an article in my blog, just to show that I don't think so :)
Quassnoi
+2  A: 

Multi-value IN clause queries. I used to use these all the time on Oracle and was surprised to find that you can't do this in SQL Server. For example, this query:

SELECT * FROM mytable WHERE (col1, col2) IN ( SELECT col1, col2 FROM othertable )
Eric Petroelje
+1  A: 

Performance issues is a big one. For instance, views in Oracle are, AFAIK, pretty much as fast as tables. This was not the case with SQLServer when I had to use them. The views effectively killed the performance slowing down the same selects by an order of magnitude or more (a query straight from the tables took, say 0.5s, while using a view could take a minute). There were also a lot of limitations on their use, for instance not all SQL functions could be used on views.

Note that this was true 5-6 six years ago, I don't know if Microsoft has improved this since.

Makis
I think indexed views in SQL Server 2005+ help quite a bit with the performance issues.
Eric Petroelje
+1  A: 

Obscure join syntax, like Oracle's (+) syntax for outer joins. At a company I used to work for this syntax was used all over the place rather than the standard LEFT OUTER JOIN / LEFT JOIN syntax, which made porting some stuff over to MySQL quite a pain.

Eric Petroelje
ANSI JOIN's got into Oracle only by 9i, and it's been but a year since we ceased 8i support and are able to use them for all updates.
Quassnoi
The Oracle syntax is standard old-style SQL. The LEFT|RIGHT OUTER|INNER JOIN syntax was added to SQL later. There is some discussion of it in the answers to this question: http://stackoverflow.com/questions/894490/sql-left-join-vs-multiple-tables-on-from-line
jmucchiello
Interesting, it still seems that even the old-style syntax was inconsistent - with Oracle using (+) and SQL Server using *=
Eric Petroelje
And boy you don't want to port that to SQL server even changing to the equivalent syntax *= or =* as it does not produce correct results all the time (sometimes it will interpret as a cross join).
HLGEM
+2  A: 

Oracle will not allow you to insert empty strings: they are silently converted to NULL.

RedFilter
Didn't realize this. Thanks!
Jason Baker
That's why we are having VARCHAR2's. Oracle advises against using VARCHAR for the current releases, since they are reserving it to be ANSI compatible in the future (i. e. some day VARCHAR will distinguish between NULL and an empty string)
Quassnoi
A: 

Set operator support.

Aside from UNION / UNION ALL, set operator support is pretty spotty across databases. Oracle and SQL server support most of them, but Oracle supports a MINUS operation as well as the equivalent standard EXCEPT DISTINCT operation. AFIK, MySQL only supports UNION (no INTERSECT or EXCEPT support).

Eric Petroelje
I believe that here again, as mentioned elsewhere, MINUS predates the inclusion of EXCEPT in standards. MINUS goes back a long way: I remember using it in Oracle 5.1.22 at least, some time around 1989...
Mike Woodhouse
+1  A: 

Oracle does not allow you to have multiple inserts in one query. MySQL allows this:

INSERT INTO test(id, name) VALUES (1, 'foo'),(2, 'bar');
Spidey
Well, it does - sorta. You can use a BEGIN...END: PL/SQL block or array processing. Neither is exactly a heap of fun.
Mike Woodhouse
Neither are as elegant ;), and with PL/SQL you'd want to use bind variables to avoid reparsing the cursor on each insert..
Spidey
A: 

I'm not sure if this is true in the drivers that come with Python, but in the version of "Horrable" I used, if a column in a resultset returns NULL for all values in the column, the column itself is not returned in the structure of the resultset. This could (and often did) result in production bugs that were impossible to duplicate in development.

Since you're moving away from Oracle, this shouldn't be a concern, but there is a keyword developers use to mitigate this "feature" that I doubt is supported in MySQL. I forget what it is, though, and Google is not helping.

Chris McCall
+2  A: 

Oracle has a different approach to quotes opposed to MySQL.

MySQL: `object_name`, 'string', "string"
Oracle: "object_name", 'string'

Furthermore, escaping is different.

MySQL: 'It\'s easy'
Oracle: 'It''s slightly confusing'

(Note that to escape anything other than quotes in Oracle, you can use the ESCAPE directive in your query; SELECT * FROM testTable WHERE percent = '50\%' ESCAPE '\')

Spidey
MySQL may support \' to insert a literal quote, but the SQL standard (since SQL89 I believe) has used ''
R. Bemrose
That's somewhat counter-intuitive. Still, a valid gotcha ;)
Spidey
And SQL Server uses square brackets to escape object names. Also note that Oracle doesn't allow you to escape a double quote in an object name.
Jason Baker
+4  A: 

Oracle does not seem to have a problem with cursors, they are a huge performance problem in SQL server.

Actually pretty much all performance tuning is database specific (which is why ANSII standard code often performs very poorly compared to the better methods designed into the specifc flavor of SQL that is database specific).

Dates are another thing that seem to be handled very differntly from database to database.

Datatypes are not equivalent either. One thing that tends to get newcomers to SQL Server is that timestamp data type has absolutely nothing to do with dates and times and cannot be converted to a datatime value.

HLGEM
The timestamp thing was confusing to me as well coming in to SQL Server. After seeing how they worked, my first thought was why would anyone ever use these things?
Eric Petroelje
@HLGEM: in SQL Server, it's not the cursors that cause performance problem, it's that they are misused. A STATIC cursor is in fact MORE efficient than a temp table (which is generally adviced as a replacement). http://explainextended.com/2009/06/28/the-island-of-misfit-cursors/
Quassnoi
SQL Server has a new name for TIMESTAMP, ROWVERSION. Makes more sense. Please use. The TIMESTAMP/ROWVERSION column is useful for optimistic "locking." Not sure on the correct term. Front end grabs a record but does not lock. User modifies record. When saved, the front end checks that the ROWVERSION is the same as what was read. If not, someone else has written to that rwo, and the front end can error out instead of overwriting.
Shannon Severance
+1  A: 

A list of discrepancies between SQL Server and Oracle that I have encountered when porting:

Think ANSI Standard SQL Is Fully Portable Between Databases? Think Again.

Writing ANSI Standard SQL is not practical

AlexKuznetsov
+1  A: 

Temporary Tables - Oracle vs SQL Server/MySQL. Transitioning from Oracle to MS/MySQL, no problem. The other way around, a little different.

Steve
+1  A: 

For the big picture on product specific gotchas, you need to learn the difference between logical database design and physical database design.

Logical database design mostly has to do with features of tables. Features of tables include columns and constraints. Although tables themselves are physical, table design is generally very portable from one database system to another. There are differences in the way some datatypes work, and a few differences in syntax, like whether the underscore can be used in a table name or not. But a good logical design should port from one system to another with only minor changes or no changes.

Physical database design mostly has to do with features of the infrastructure on which the table structure rests. Almost all systems support indexes, and the default index type is B-tree although it might be called something else. But from there on out, each system has its own physical features, which can be completely different from one system to another. A typical physical feature of Oracle is tablespaces. Closely associated with tablespaces is the mapping between tables and tablespaces. Physical design has to be done on a system specific basis.

In addition to which RDM system you are using, you need to take data volume, load, response time requirements, and system resources like disks into account in your design. The good news is that a lot of changes to physical design can be made with no changes to application code. This is known as physical data independence. This means that you are somewhat free to tweak and tune the physical desing after you've got some application code written and some data loaded.

You might want to look at some books on database design to get a deeper idea of logical and physical design and the difference between them. Some popular authors are C.J. Date and Joe Celko.

Walter Mitty
Any chance you could recommend some particular books? This is a subject I'd really be interested to learn more about.
Jason Baker
+1  A: 

I recall a peculiar Oracle problem that got me completely off gueard. I'm not sure it this was a configuration of the instance or a default settings, but we could not have more that 1000 elements in a IN statement. So we had to trick it into doing what wanted:

SELECT Col1,Col2 
FROM Table
WHERE Code IN (1,2,3,...,1000)
OR Code IN (1001,1002,1003,...,2000)

etc.

Ugly, but it worked.

(Before anyone points out the obvious solution of a sub-query or inline view, the query was generated on a completely different system)

feihtthief