views:

639

answers:

7

I'm getting increasingly frustrated with the limitations and verbosity required to actually commit some business logic to stored procedures, using languages such as Transact-SQL or PL/SQL. I would love to convert some current databases to Oracle and take advantage of its support for Java stored procedures, but that option is not available at the moment.

What alternatives would you recommend in the way of databases that support stored procedures in other languages?

+3  A: 

PostgreSQL has support for many scripting languages procedures. Officially Perl, Python, and Tcl. As addons, PHP, Ruby, Java and probably many others (just Google for pl<languagename>) which may or may not be in working condition as of now.

Oh, and also SQL Server 2005 onwards has support for CLR stored procedures, where you can use .NET languages.

Vinko Vrsalovic
+1  A: 

Oracle, HSQLDB and Derby allow to write stored procedures in Java.

Aaron Digulla
+19  A: 

There are some architectural obstacles to having more clever query languages in a database manager. The principal one is the query optimiser. One of the design constraints on SQL is that it can only use constructs that are accessible to the query optimiser. This means that the language and its capabilities are quite tightly coupled to the capabilities of the query execution engine and query plan optimiser.

The other major design constraint is the mechanical nature of the database system - database programming is almost unique in that it has a mechanical component. Query performance is limited by the mechanical constraints of disk head seeks and rotational latency (the wait time before the data you want arrives under the heads).

This effectively precludes many clever abstractions that might make SQL more powerful or easier to work with. Many database management systems supplement SQL with procedural alternatives that can be used for scripting. However, they interact with the DBMS by executing a series of SQL queries that are processed by the optimiser individually. Some languages of this type that are shipped with various DBMS platforms are:

  • Oracle's PL/SQL and embedded Java. PL/SQL is actually based on Ada - it is quite 'old school' by modern standards and has a legacy code base with which it must remain backwardly compatible. It's not necessarily the most pleasant programming environment but it does have constructs for facilities such as parallelism and a reasonably flexible type system. One of the major criticisms of Java stored procedures on Oracle is that you are paying for Oracle's capacity based licensing on the CPU you are running the JVM's on.

  • SQL Server CLR Integration. Somewhat similar to Oracle's Java Stored Procedures, this allows CLR modules compiled from C# (or any .net language) to be loaded into a SQL Server instance and executed in much the same way as stored procedures. SQL Server also has PostgreSQL-style API's for making custom aggregate functions through CLR integration and other hooks for mixed SQL/CLR code bases.

  • PostgreSQL is actually the system where back-end language integration was originally developed. The system exports a native C API with facilities for custom aggregate functions, storage engines, procedural extensions and other functionality. The language interfaces are based on this API and include: PL/pgSQL (a bespoke language similar to PL/SQL), Python, Perl and Tcl.

    This made it into the mainstream through Illustra, a commercialised version of Postgres, which was then bought out by Informix (which was subsequently bought out by IBM). The key features were incorporated into Informix On-Line, which is still sold by IBM.

One key limitation of these languages is their limited interaction with the query optimiser (although the C API for PostgreSQL does have support for this). Participation in a query plan as first-class citizen requires that the query optimiser can work out a sensible view of the resources your action will take. In practice, this type of interaction with the query optimiser is mainly useful for implementing storage engines.

This level of digging into the storage engine is (a) somewhat esoteric if the functionality is available at all (so most people won't have the skill to do this) and (b) probably considrably more trouble than just writing the query in SQL. The limitations of the query optimiser mean that you will probably never get the level of abstration out of SQL that you might get from (say) Python or even C# or Java.

The path of least resistance for efficient queries is likely to be writing the query in SQL with some procedural glue in one of the other languages. In some cases a computation really does lend itself to a procedural approach.

This can become a hassle and lead to large bodies of boilerplate SQL code. The only real options for this are hand coded SQL or code generation systems. A trivial example of code generation is the CRUD functionality provided by frameworks where this SQL is generated from metadata. A more complex example can be seen in ETL tools such as Oracle Warehouse Builder or Wherescape Red which work by generating great screeds of stored procedure code from the model.

I find myself building code generation systems of one sort or another on a semi-regular basis for precisely this reason. Any templating system will do for this - I've had fairly good mileage from CherryTemplate but there are many such items around. Code Generation in Action is quite a good book on this subject - the author uses a ruby-based system who'se name escapes me.

Edit: If you look at a 'Show Estimated Execution Plan' for a block of procedural code you will notice that each statement has its own query plan. The query optimisation algorithm can only work on a single SQL statement, so a procedure will have a forest of query plans. Because procedural code can have 'side-effects' you cannot use the type of algorithms used in query optimisation to reason about the code. This means that a query optimiser cannot globally optimise a block of procedural code. It can only optimise individual SQL statements.

ConcernedOfTunbridgeWells
Wow for content! Thanks for all the links/pointers.
Learning
I agree - +1 for great content. I'm still confused on one point. Most of my stored procedures have many SELECT, INSERT, UPDATE and/or DELETE statements. I would think that query optimization would already be difficult at best, and not a significant concern in language selection.
Ken Paul
See the edit - the optimiser generates a series of plans for the individual select, insert, update and delete statements.
ConcernedOfTunbridgeWells
+1  A: 

Oracle does support CLR stored procedures so you can write stored procs in any .NET language like C#, VB.NET or IronPython. This only works when the database server runs on a Windows machine. You can't do it when the database runs on Linux or Unix.

A: 

DB2 for Z/OS is the database that support most languages as I know. It support COBOL,C/C++,JAVA as the store procedure, It of course also support SQL procedure.

A: 

There is also some support for writing Oracle stored procedures in Perl.

runrig
A: 

Because Oracle has a built in JVM you can develop stored procs in Java but also in non-java-languages that use the JVM, that means languages like JACL, JYTHON, SCHEME and GROOVY. See here: http://db360.blogspot.com/2006/08/oracle-database-programming-using-java_01.html and http://en.wikipedia.org/wiki/List_of_JVM_languages .

tuinstoel