tags:

views:

285

answers:

6

In a C++ application that can use just about any relational database, what would be the best way of generating queries that can be easily extended to allow for a database engine's eccentricities?

In other words, the code may need to retrieve data in a way that is not consistent among the various database engines. What's the best way to design the code on the client side to generate queries in a way that will make supporting a new database engine a relatively painless affair.

For example, if I have (MFC)code that looks like this:

CString query = "SELECT id FROM table"
results = dbConnection->Query(query);

and we decide to support some database that uses, um, "AVEC" instead of "FROM". Now whenever the user uses that database engine, this query will fail.

Options so far:

  • Worst option: have the code making the query check the database type.
  • Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
  • Betterer option: Create a query builder class that allows the caller to construct queries without using any SQL directly. Once the query is completed, caller can invoke a "Generate" method which returns a query string approrpriate for the active database engine
  • Best option: ??

Note: The database engine itself is abstracted away through some thin layers of our own creation. It's the queries themselves are the only remaining problem.

Solution:
I've decided to go with the "better" option (query "selector") for two reasons.

  1. Debugging: As mentioned below, debugging is going to be slightly easier with the selector approach since the queries are pre-built and listed out in a readable form in code.
  2. Flexibility: It occurred to me that there are some databases which might have vastly better and completely different ways of solving a particular query. For example, with Access I perform a complicated query on multiple tables each time because I have to, but on Sql Server I'd like to setup a view. Selecting from the view and from several tables are completely different queries (i think) and this query selector would handle it easily.
+1  A: 

Best option: Pick a database, and code to it.

How often are you going to up and swap out the database on the back end of a production system? And even if you did, you'd have a lot more to worry about than just minor syntax issues. (Major stuff like join syntax, even datatypes can differ widely between databases.)

Now, if you are designing a commercial application where you want the customer to be able to use one of several back-end options when they implement it, then you may have to specify "we support Oracle, MS SQl, or MYSQL" and code to those specific options.

BradC
Understood but this isn't something where someone can just up and replace their backend database on a whim. This is purely to make it easier for us to support new database engines when the need arises (client demand or something)
Karim
+1 Karim, what if the client demands a Java application instead of C++ - will you swap that out too? No, didn't think so!
Tony Andrews
@Tony -- client's mostly don't care about implementation languages but they may already support a particular DB engine and not want to develop the capability in-house to manage another one.
tvanfosson
@Tony: It's much more likely that a client will have a favorite database vendor than a favorite implementation language.
Bill the Lizard
I'm not sure what sort of environment you sell your software in, but in mine, an established company could very well only allow DB2 on their network. We can either say no to a multi-million dollar contract or work with DB2.
Karim
+3  A: 

You need your own query-writing object, which can be inherited from by database-specific implementations.

So you would do something like:

DbAgnosticQueryObject query = new PostgresSQLQuery();
query.setFrom('foo');
query.setSelect('id');
// and so on
CString queryString = query.toString();

It can get pretty complicated in there once you go past simple selects from a single table. There are already ORM packages out there that deal with a lot of these nuances; it may be worth at looking at them instead of writing your own.

SquareCog
A: 

I would think that what you would want to do, if you needed the ability to support multiple databases, would be to create a data provider interface (or abstract class) and associated concrete implementations. The data provider would need to support your standard query operators and other common, supported functionality required support your query operations (have a look at IEnumerable extension methods in .NET 3.5). Each concrete provider would then translate these into specific queries based on the target database engine.

Essentially, what you do is create a database abstraction layer and have your code interact with it. If you can find one of these for C++, it would probably be worth buying instead of writing. You may also want to look for Inversion of Control (IoC) containers for C++ that would basically do this and more. I know of several for Java and C#, but I'm not familiar with any for C++.

tvanfosson
Agreed and we have a version of this in our code but not so extensive that it would support ad-hoc query-like functionality. Unfortunately, time to implement something like that in C++ would be prohibitive.
Karim
+1  A: 

All of your options can be reduced to

Worst option: have the code making the query check the database type.

It's just a matter of where you're putting the logic to check the database type.

The option that I've seen work best in practice is

Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.

In my experience it is much easier to test queries independently from the rest of your code. It gets a lot harder if you have objects that are piecing together queries from bits of syntax, because then you have to test the query-creation code and the query itself.

If you pull all of your SQL out into separate files that are written and maintained by hand, you can have someone who is an expert in SQL write them (you can still automate the testing of these queries). If you try to write query-generating functions you'll essentially have a C++ expert writing SQL.

Bill the Lizard
Agreed. In the end, the query has to come from somewhere. And your point about the difficulty of debugging run-time generated query strings is a very good one.
Karim
That's a good point -- auto-generating sql can lead to poorly optimized queries. ORMs suffer from that. But you don't want to write a special thing every little select a from b, either. There must be a middle ground - perhaps both approaches, auto- being the default, with ability to override.
SquareCog
@Dmitriy: I would definitely use an ORM until it proved to be poorly optimized. For when I roll my own, I usually try to write the most generic SQL possible, then customize it for different DB vendors.
Bill the Lizard
+1  A: 

Choose an ORM, and start mapping.

If you are to support more than one DB, your problem is only going to get worse.
And just think of DB that are comming - cloud dbs with no (or close to no) SQL, and Object databases.

dmajkic
An ORM would be good except for two problems 1) The only one I know of for C++ is DTL which hasn't been updated in 3 years and 2) We have potential for very heavy load and optimizations are absolutely critical. I'm concerned that an ORM would be restrictive in the optimization department.
Karim
+1  A: 

Take your queries outside the code - put them in the DB or in a resource file and allow overrides for different database engines.

If you use SPs it's potentially even easier, since the SPs abstract away your database differences.

Cade Roux