The good old Relational Database Management System (RDBMS) has been around for quite some time now and is still, certainly in my opinion, the mainstay of the majority of production platforms/software applications.

Recently there has been a great deal of hype in the community about relatively young database technologies such as cloud services (SQL Azure, Amazon S3, etc.), and about how virtualization is changing the way we view and work with database technology.

We are clearly at a time of pushing forward, looking for ways to innovate and improve our use of database technology, in a big way.

What do YOU think the future holds for database technology in general and what do you see as some of the potential obstacles we face in the coming years?

EDIT: Granted, there may be no “right” answer to this question, however I will be providing a bounty and selecting a “best” answer based on the quality of the arguments/views/thoughts presented.

Looking forward to your answers!

+11  A: 

Reputation bait edit:

Another developer at my work and I had a whiteboard with a 'Technologies that should die' section we'd add to. MIME email was on there, as was SQL. The problem with entrenched technologies such as these is that it's easy to imagine a world without them, but hard to visualise a world with something genuinely better.

SQL is one such relic. It was originally designed to be hand-typed into a database, with queries composed by a human, hence its pseudo-English structure. Machines were not meant to generate SQL commands; it's just that programming, especially for the web, now relies heavily on databases, and SQL is the standard interface to them.

One thing I am sure of is that having a programming language build a human-language string, pass it to a database, and have the database decode it back into a machine format is both inefficient and hazardous. Mixing data and commands is asking for trouble. Even modern CPUs have an NX bit to flag areas of memory as data so that they will not be accidentally executed, causing either a crash or an exploit. SQL injection issues are rife, and are a symptom not (only) of bad programmers but of a bad technology. We have many high-level languages (above C) in which buffer overflows are no longer an issue, but we are stuck with a C-level database language designed in the early 1970s.
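
To make the data/command mixing concrete, here is a minimal sketch in Python using the standard-library sqlite3 module rather than PHP and MySQL. The table and column names are borrowed from the examples in this answer; the hostile input string is made up for illustration.

```python
import sqlite3

# In-memory database with a tiny pets table (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets_table (name TEXT, colour TEXT)")
conn.executemany(
    "INSERT INTO pets_table VALUES (?, ?)",
    [("bob", "black"), ("rex", "brown")],
)

hostile = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the command text, so it can rewrite the query.
unsafe_sql = "SELECT name FROM pets_table WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the placeholder keeps the input as data; it is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT name FROM pets_table WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query matches every row because the input changed the WHERE clause, while the parameterized query matches none. Placeholders separate data from commands at the API level, although the command itself is still SQL text.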

You just need to look at any semi-experienced PHP (or any language) programmer's code. No doubt they all use some form of SQL abstraction library to take the perils and complication away from using it. For example, if I want to pull the contents of a row I'll use something like the following:

$row = CMS_SQL::get_row ('pets_table', 'bob', 'name');

Which will pull the row from pets_table with the name of 'bob' - assuming name is unique in the schema. Using true SQL you'd have to do...

$name = mysql_real_escape_string ($name);
$result = mysql_query ("SELECT * FROM `pets_table` WHERE `name`='{$name}' LIMIT 1");
$row = mysql_fetch_assoc ($result);

Which is pretty horrible, especially if you have to repeat the same thing a lot. Also let's take updating rows. The code below uses my abstracted SQL class to change the cat name from bob to bobby.

$row['name'] = 'bobby'; # Change pet name to bobby
CMS_SQL::update_array ($row, 'pets_table');

Note: update_array uses the primary key id if it's available to track the row, which is why I don't seem to specify it. The SQL for the above is much worse.

And finally, get a single value from a database:

$cat_colour = CMS_SQL::get_element ('pets_table', 'colour', 'bob', 'name');

And in (unsafe) SQL:

$sql = "SELECT `colour` FROM `pets_table` WHERE `name`='bob' LIMIT 1";
$result = mysql_query ($sql);
if (mysql_num_rows ($result) > 0)
{
    list ($cat_colour) = mysql_fetch_array ($result);
}
else
{
    $cat_colour = false;
}
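
The CMS_SQL class itself is not shown in this answer, so the following is only a guess at the general shape of such helpers, sketched in Python with sqlite3. The function names mirror get_row and get_element above; the identifier check and everything else are assumptions, not the original implementation.

```python
import re
import sqlite3

# Table and column names cannot be bound as parameters, so this sketch
# only accepts plain identifiers (an assumption, not the original's rule).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name):
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return '"' + name + '"'


def get_row(conn, table, value, column):
    """Return the first row where column == value as a dict, or None."""
    sql = f"SELECT * FROM {_ident(table)} WHERE {_ident(column)} = ? LIMIT 1"
    cursor = conn.execute(sql, (value,))
    row = cursor.fetchone()
    if row is None:
        return None
    return dict(zip([d[0] for d in cursor.description], row))


def get_element(conn, table, wanted, value, column):
    """Return one field of the matching row, or False if there is no match."""
    row = get_row(conn, table, value, column)
    return row[wanted] if row else False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pets_table (id INTEGER PRIMARY KEY, name TEXT, colour TEXT)"
)
conn.execute("INSERT INTO pets_table (name, colour) VALUES ('bob', 'black')")

row = get_row(conn, "pets_table", "bob", "name")
cat_colour = get_element(conn, "pets_table", "colour", "bob", "name")
missing = get_element(conn, "pets_table", "colour", "nobody", "name")
```

Values go through placeholders and identifiers go through a whitelist pattern, so the caller never assembles SQL text by hand.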

As I said, it's very hard to envisage a replacement technology for anything: anything new usually has some good ideas that just don't work, and obvious ideas that are somehow missed. But I really do see an eventual abandonment of SQL and a move towards treating databases as if they were actually physical resources, with PHP for example possibly treating the whole thing as a native associative array. I also see references being treated as pointers transparently: e.g. if you're making a comment system with tables called 'posts' and 'users', and the posts table has a poster column which references a unique entry in users, then the column posts.poster would be the correct row in users. Updating it would update the row in users. No joins, sub-selects or any nonsense.

With the whole thing being treated as a native data structure, you could use either a special language run on the database server, or even native code if the compiler was especially smart. That would let you use native conditionals rather than WHERE, as the whole WHERE var='val' AND var2='val2' is just programming conditionals (such as ifs) represented in human-readable SQL.
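
As a rough illustration of that programming model (and only the model: this is plain in-memory Python, not a database, and all names are made up), references can stand in for foreign keys and an ordinary conditional can stand in for WHERE:

```python
# Users keyed by id; each post holds a direct reference to its poster's record.
users = {
    1: {"name": "alice", "karma": 10},
    2: {"name": "bob", "karma": 3},
}
posts = [
    {"title": "First!", "poster": users[1]},
    {"title": "SQL must die", "poster": users[2]},
    {"title": "Re: First!", "poster": users[2]},
]

# A native conditional plays the role of
# WHERE users.name = 'bob' AND users.karma < 5 (with the join implied).
bobs_posts = [
    post
    for post in posts
    if post["poster"]["name"] == "bob" and post["poster"]["karma"] < 5
]

# Updating through the reference updates the single user record; no join needed.
bobs_posts[0]["poster"]["karma"] += 1
```

After the update, every post by bob sees the new karma value because they all point at the same record. What the sketch leaves out is exactly what a database adds: persistence, concurrency control, and indexes.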

It'll probably be an open source project such as PostgreSQL or MySQL or some other database which will provide a native class for dealing with data in a non-SQL format for a web language such as Ruby, Python, PHP etc., bringing the ease of dealing with native data and the power of a dedicated SQL server into one. Once there is a standardized, stable, native alternative I can foresee it replacing SQL for 99% of the low-end data storage platforms, with SQL relegated to high-performance ultra-huge databases and legacy systems (before eventually being entirely supplanted).

The reason SQL is so pervasive is that people think in terms of SQL when doing things. As soon as something comes along which allows people to mentally visualise data storage and queries in other terms, I think its days are numbered.

Edit: wow, I didn't realise how much I hated SQL until this point :)

Meep3D
'Having to make queries in a language designed to be typed is just archaic' not sure i get why this is the case.
John Nicholas
Do you mean "typed" as in "strongly-typed language" or "typed" as in "you have to type on a keyboard"?
MusiGenesis
Typed as in a keyboard. SQL injection should not be an issue; no other system I am aware of fails to make the distinction between user input and commands.
Meep3D
Um, is there a database out there that can be spoken to?
MusiGenesis
I think I get what you mean - SQL was originally written with the intent of business professionals typing natural(ish) language commands into the computer and getting results back. Admittedly, this is hard to program with in many cases, considering the proliferation of SQL exploits.
sheepsimulator
I think a different query mechanism is just not in the cards - it would have happened in the last 10 years. None of the big RDBMS vendors even offered a proprietary one. But the answer is decent enough, +1 to offset the downvotes.
Yishai
I get your point now, too, and it's worth an upvote (I didn't downvote you). However, SQL allows not just humans but also any programming language/platform to interact with the data in the same fundamental way. In a scenario where you only have a program and some data, the SQL layer seems (and really is) superfluous, but in most of the environments I've worked in, it's an essential compatibility layer.
MusiGenesis
1 the non open-source version of that database is already there, Gemstone (see GLASS)
Stephan Eggermont
-1: SQL has many problems, but nothing mentioned here is an actual problem with SQL.
RBarryYoung
RBarry - so SQL injection exploits are not a problem with SQL, and neither is having to generate unnecessary human-language strings, which leads to a myriad of confusing, incompatible workaround classes to tackle the problems with SQL?
Meep3D
+3  A: 

databases aren't going anywhere. just look at the so-called object databases. they try to be the hype and all that, but in the end they end up being quasi-relational, since people see that they need key constraints and other db constructs. so they reinvent the wheel.

as for the cloud the database will still be there. just abstracted. for some people it will make sense to migrate for some it won't. only time will tell.

most advances are going to be in underlying storage and access technology like disks and ram. just look at the new Fusion-IO system that uses SSD drives for databases. it flies.

another useful thing might be column-oriented DBMSs, which outperform current databases in some areas. my guess is that in the future the sql server db engine will be able to use both row- and column-oriented storage, depending on what you are doing.
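
a toy sketch of the difference between the two layouts, in plain Python with made-up data. real column stores add compression, vectorized execution and much more, so this only shows why some access patterns favour one layout over the other.

```python
# Row-oriented layout: one record per entry, all fields stored together.
rows = [
    {"id": 1, "name": "bob", "age": 4},
    {"id": 2, "name": "rex", "age": 7},
    {"id": 3, "name": "tom", "age": 2},
]

# Column-oriented layout: one contiguous list per column.
columns = {key: [record[key] for record in rows] for key in rows[0]}

# An aggregate over one column touches only that column's values...
average_age = sum(columns["age"]) / len(columns["age"])

# ...while fetching a whole record is the natural operation on the row layout.
second_record = rows[1]

# Rebuilding a record from the column layout means visiting every column.
rebuilt = {key: values[1] for key, values in columns.items()}
```

analytics-style queries (sums, averages over a few columns) read less data in the column layout, while transactional lookups of whole records favour the row layout.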

as for the developer mindset that db's are only good for storage and nothing else: programming languages come and go. databases and sql stay.

then there's the issue of security (sql injection), which you can't really blame on the db. every tech has its flaws and security risks. you just have to know about them.

Mladen Prajdic
-1 Clueless about object databases. Take a look at Gemstone
Stephan Eggermont
you do know that "clueless" isn't really an argument, right? what about gemstone? what's your point?
Mladen Prajdic
There is nothing quasi-relational about Gemstone, and nobody needs key constraints or other db constructs. Just smalltalk, a persistent image, versioned objects and transaction boundaries.
Stephan Eggermont
duffymo
I visited the Gemstone site, doesn't look bad but it has no search box.
tuinstoel
+15  A: 

Relational databases and SQL will be around for a very long time. If some of you are as old as me, you may remember that it took many institutions such as banks years even to move to newfangled relational databases from their mainframe hierarchical databases.

What I think is improving all the time is the way in which these databases are accessed, and the way it is getting easier to code against them.

There is no doubt that there is a place for all sorts of alternatives, but I would be amazed if the likes of SQL Server, Oracle, MySQL and DB2 lose much market share to other services in the next 10 years.

Miles D
Many institutions such as banks still have not completely moved to relational databases and continue to use 1960s style hierarchical DBs for some applications.
Michael Borgwardt
-1 Their market share for new development should drop to near zero in 10 years unless they introduce much better products. The OR gap is such a productivity killer that they can no longer compete for new development
Stephan Eggermont
AHEM - AS/400 from the very start had a relational database as its ONLY file system. Everything was stored in their own implementation of DB2.
James Anderson
I've removed the mention of AS/400 - sorry for the incorrect info.
Miles D
+4  A: 

I think the problem we will have is nothing to do with technology. We have become pretty good at storing and retrieving information. The problem lies in the way we interpret the data. There is simply too much of it. Many, many people (myself included) just have no idea what to do with all this wonderful information we are collecting, or how to harness it to its true potential. The mindset needs to change from "We need a database" to "We need to do something worthwhile with our database".

DanDan
More specifically, it becomes a question of "data pollution" - too much of it, and it's too late to go through it all and get rid of most of what we don't need...
AviD
+5  A: 

SQL will be around for a long time but the next generation of databases will do away with pre-imposed structure and indexes and things like that. You will be able to pour any kind of data into it and then retrofit a structure (or the other way around if you insist).

The database will be more like a giant file system which learns how it is being used. It will sport full-text search and smart handling of binary data (so you can insert JPG images and they will be scanned for EXIF info without any manual intervention).

New programming languages will blur the line between the internal data model of the application and the database.

Things like clustering, partitioning, etc. will still happen, but without manual intervention. Creating a cluster will happen by installing a shell on some computer and then registering that computer on the main system.

An alpha version of this already exists: It's called "Internet". So far, installation and handling is well below par but in a few years, things will have improved greatly.

Aaron Digulla
I totally agree with this comment! I think databases should become more automagical. I actually posted a question about why do we have to care about data types because they seem to be part implementation detail cum part semantic definition. I'm heavily in favor of abolishing (or at least adding support for implicit versions of) things like indexes and data types.
Mark Canlas
+3  A: 

Take a look at Drizzle. It's a rethink of the traditional relational database that throws away the old stuff we no longer need and embraces the changing environment, such as multi-core CPUs and replication for cloud computing.

There is an excellent presentation http://blip.tv/file/2296093 by Brian Aker

FlappySocks
Interesting, I was not aware of this project until now.
John Sansom
+2  A: 

I think RDBMS will be around for a long time. SQL Azure is still an RDBMS; it is just on a remote computer instead. Two benefits of this are that general maintenance is external to your company and that utilization of resources is increased, in a similar way to how terminal services maximizes the use of resources.

There are, however, alternatives such as object-oriented databases, which are popular in some circumstances. In the end it really depends on what information you want to get back out of the system. Object-oriented databases are generally easy for programmers to work with, as they are integrated with the programming language, including its basic types, so you get type casting and validation as part of your system.

PeteT
+1  A: 

I think the next big problem for RDBMS to solve is the cloud "scale-on-demand" model. Right now that doesn't really work well in the RDBMS world, so we are seeing a resurgence of hashmap and hierarchical database structures (these were around a long time ago, but they are more suited to the "scale on demand" model).

The object/relational disconnect was never solved, and RDBMS won (as in it didn't change and it wasn't replaced). I don't think the dynamics that made this work (inertia combined with the fact that many projects just aren't that object oriented and RDBMS represented a lowest common denominator among many different technologies accessing the same data) will be repeated here.

The "scale-on-demand" requirement, however, is not something that RDBMS can survive without adapting to. My prediction is that the core of RDBMS will live on, and developers will optimize queries and data structures so that the RDBMS can efficiently scale across many different machines on demand as the data grows. In other words, the RDBMS will change to make scaling simpler, but queries (and perhaps the equivalent of indexes across the federated database, to localize relevant data in one memory location) will have to be written and tweaked to allow fast execution in such an environment, much like SQL queries today are optimized to take advantage of indexes and other performance considerations.
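
One common way to spread data across machines is hash partitioning on a key, which is what makes "localizing relevant data" a design decision for the developer. Below is a minimal sketch in Python, with in-process dictionaries standing in for machines; the class and function names are made up for illustration.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy key/value store whose shards are plain dicts standing in for servers."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(4)
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})

found = store.get("user:42")
```

A lookup by key touches exactly one shard, but a query that filters on anything else has to visit all of them, which is why queries and data layout need tuning for such an environment. Note also that with simple modulo placement, changing the shard count remaps most keys; techniques such as consistent hashing exist to reduce that.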

Yishai
A: 

The RDBMS is going away or transforming to something useful, but it will take a lot of time to disappear (like all legacy technologies). It is getting squeezed between prevayler-like solutions, object databases, and the cloud.

Prevayler provides ACID persistence for main-memory systems with flash backup, meaning no object-relational gap for systems with up to 16 GB of data and a few hundred transactions/second. [edit] I just checked the Dell site. Make that 96 GB of data. A Prevayler-like system can support a lot more users than an RDBMS-based system because it can be zero-copy.
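
The idea behind such systems can be sketched as follows: keep the whole state in memory, append every change as a command to a journal on disk before applying it, and rebuild the state after a restart by replaying the journal. This is a toy version in Python and not Prevayler's actual API (Prevayler is a Java library); the class name and the command format are invented, and snapshots are omitted.

```python
import json
import os
import tempfile


class PrevalentStore:
    """In-memory dict made durable by journaling every command to disk."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.state = {}
        # Recovery: replay every journaled command in order.
        if os.path.exists(journal_path):
            with open(journal_path, "r", encoding="utf-8") as journal:
                for line in journal:
                    self._apply(json.loads(line))

    def _apply(self, command):
        if command["op"] == "set":
            self.state[command["key"]] = command["value"]
        elif command["op"] == "delete":
            self.state.pop(command["key"], None)

    def execute(self, command):
        # Write to the journal first, then change the in-memory state.
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps(command) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self._apply(command)


journal_file = os.path.join(tempfile.mkdtemp(), "journal.log")
store = PrevalentStore(journal_file)
store.execute({"op": "set", "key": "bob", "value": "black"})
store.execute({"op": "set", "key": "rex", "value": "brown"})
store.execute({"op": "delete", "key": "rex"})

# Simulate a restart: a fresh instance rebuilds its state from the journal.
recovered = PrevalentStore(journal_file)
```

Reads never touch the disk and work directly on native objects, which is where the "no object-relational gap" claim comes from. The price is that the data set has to fit in memory.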

Object databases provide complex constraints, object (including behaviour) versioning and therefore much better data quality.

In the cloud the relational model doesn't scale. And remember, scaling only starts at 96 GB of data.

There is an awful lot of legacy using SQL, and a lot of people who won't be educated and will retire before the RDBMS is gone, but developing without a RDBMS is simply faster.

Another thing SQL databases are very bad at is handling time/historical data. Data warehouses use column storage, an object database or a cloud.

Stephan Eggermont
A column store can still be an RDBMS. Vertica is a column store, and it is SQL-based and relational. It just has a different storage model than, for example, MS SQL.
Theo
Same for Sybase IQ, read the answer of hythlodayr.
Theo
They both still have the same data quality problems. We need a better language than SQL.
Stephan Eggermont
+23  A: 

The question is a bit general. Database technology is being used in quite disparate applications, from handling production work flows and banking, to web apps and online poker hand tracking software.

That said, a lot of the recent innovation in databases seems to be centered around the web. As the web consumes more and more of the world, there arises a need to store and retrieve greater amounts of data for web applications. There's tremendous pressure to improve the throughput and scalability of database technologies that are powering popular web applications.

The pressure is greater given that developers demand databases that are "Harder, Better, Faster, Stronger", to quote Daft Punk, and they want them cheap and easy to work with. Luck just has it that some of these developers are exactly the types of people who can develop these technologies, and they're doing so.

This is the reason we're seeing a lot of interest in key/value stores now, like the blazingly fast Tokyo Cabinet. But those aren't exactly so innovative, and are quite a pain to work with. Usually they're not even suitable for the task you might have in mind. But there has been recent advancement around these types of databases.

Redis, for example, adds a lot of functionality that isn't available in traditional key/value stores, making it much easier to work with. MongoDB, a document-oriented database somewhat similar to CouchDB, has even more functionality while retaining the benefits of speed and scalability, and looks like an extremely promising project.

At the extreme end, we're also seeing very scalable and fault-tolerant distributed systems being created, like Voldemort which are able to function even if some parts of the system are disrupted. There are more like that which I can't remember off the top of my head.

But the general idea is that BASE is taking the place of ACID. Many applications, especially on the web, don't really need their database to strictly adhere to ACID principles (unlike early database-backed software, which was often "mission critical"), and allowing a little harmless data inconsistency can go a long way toward making databases more flexible.
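
To show what "a little harmless data inconsistency" can look like, here is a toy sketch in Python of two replicas that accept writes independently and converge later. It assumes last-write-wins with caller-supplied timestamps, which is only one of several reconciliation strategies; the class is made up and does not describe how any of the systems named above work internally.

```python
class Replica:
    """Toy replica: stores (timestamp, value) per key, newest write wins."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        # Merge the other replica's entries; older writes are ignored.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


replica_a, replica_b = Replica(), Replica()
replica_a.write("status", "online", timestamp=1)
replica_b.write("status", "away", timestamp=2)

# Before synchronization, a reader of replica A sees a stale value.
stale = replica_a.read("status")

replica_a.sync_from(replica_b)
replica_b.sync_from(replica_a)
fresh = replica_a.read("status")
```

Both replicas stay available for reads and writes the whole time; the cost is the window in which they disagree. For a status message that is fine, for an account balance it usually is not.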

Now don't get me wrong: RDBMS is definitely going to stick around a long time. It's almost always the right choice for the average website back-end, and it works. And there will always be a need for ACID. But relational databases will no longer be the one-size-fits-all solution they have always been.

In the future, I think what we'll see is the emergence of a great amount of specialization in databases. There will be diversity in the types of databases available, but each will be specialized in its own way, so you can find one that really suits your needs. It's quite exciting to see how the field is progressing; who would've thought it would be such an active area of development? Again, I believe this has been brought about by the needs created by the world wide web.

You may also find these two links interesting:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
http://delicious.com/ehsanul_g3/database

ehsanul
Excellent answer, including good references, well laid out points and a nice conclusion.
John Sansom
+3  A: 

The future of databases will go wherever Google takes us :^)

Seriously, the core of database systems probably won't change much, but what will change is how we interact and interface with them. Database systems might appear to be different or new, but deep down they'll remain pretty much the same.

Database systems exist on top of another database themselves...the file system. File systems have slowly evolved, but are still relatively unchanged deep down inside. And the file system itself sits on another "database" - the sectors and the read/write/seek methods of the hard disk.

I think the real question is when will a new layer of abstraction usurp the current (and I'd agree - archaic) standards in Database Management Systems. Whatever it is probably won't replace the current technologies, but instead sit on top of them. People will less and less work directly with [insert your favorite DB here], but instead work in the layer of abstraction that sits on top of that system. Hibernate, LINQ, GQL, and other technologies are coming around as new standards with how we interface with databases, mostly irregardless of what the underlying DB technology is. But what about the next layer of abstraction in database management?

The real obstacles in this question are that businesses hold database systems as sacred cows that cannot be changed/touched. This in turn prevents the companies that hold the most market share in DB systems from really evolving (MS, Oracle, etc.). So...Google...I'm waiting for your management layer to sit on top of these sacred cows!

tyriker
http://en.wikipedia.org/wiki/Irregardless
Rich Seller
+4  A: 

I think the future will bring us the very first lasting relational DBMS in IT history.

+2  A: 
  1. Look at this site. It works quite well! It doesn't use something like a distributed key-value store such as Tokyo Cabinet, but an RDBMS (MS SQL), and it scales up instead of out. It is still possible! NoSQL is quite hard. See also: http://highscalability.com/stack-overflow-architecture

  2. People say that specialized databases are so much better. Well Team System is better than Visual Source Safe (no transactions mean corrupt data) and Team System uses MS SQL.

  3. Mobile phone devs now learn SQL because they want to use SQLite. No more flat files: SQLite is so much more feature rich than flat files that I can't understand why people want to use flat files. SQLite is also very easy to deploy.

  4. Using one database for all your data has big advantages. In my previous job we used to store our relational data in Oracle and our spatial data in ESRI. After Oracle improved its spatial abilities we moved our spatial data to Oracle. No more ESRI. ESRI has probably much better spatial abilities than Oracle but having the possibility to store relational data and spatial data in one table makes life so much easier. You can query, index, join, modify and back up spatial and relational data together with SQL statements.

  5. Look at the popularity of Wikipedia; Wikipedia uses MySQL. Companies download the wiki software to start their own companypedia. SharePoint is also becoming more popular, and it uses MS SQL.

  6. There are certainly cases where an RDBMS can't deliver because of scalability issues. There is however a new approach for data warehousing. HadoopDB combines the pros of Hadoop and the pros of an RDBMS. Read here: http://dbmsmusings.blogspot.com/2009/07/announcing-release-of-hadoopdb-longer.html .
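
Points 2 and 3 (transactions protecting against corrupt data, and SQLite as an easy-to-deploy replacement for flat files) can be sketched with Python's built-in sqlite3 module. The accounts table and the transfer function are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer credits bob first and then fails the CHECK constraint on alice, so the whole transaction is rolled back and bob's credit disappears with it. Getting the same guarantee with a hand-written flat file format takes considerably more code.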

tuinstoel
-1 "The bottleneck is the database 90% of the time." Version management is one of the problems RDBMS's are really bad at, but Team System shows you can brute force a lot. 20 years ago nobody would have used a RDBMS for that.
Stephan Eggermont
@Stephan Eggermont, A. So visual source safe is better than Team System? My code has often been corrupted by VSS. B. 20 years ago nobody would use a rdbms to store and index spatial data, now people do it and it works quite well. C. How do you know that the database is the bottleneck 90% of the time? Not on my projects.
tuinstoel
"No more flat files, Sqlite is so much more feature rich that I can't understand why people want to use them" I assume you meant don't want to use them.
the_drow
@the_drow, Your assumption is wrong. "Sqlite is so much more feature rich that I can't understand why people want to use flat files". Them refers to flat files.
Theo
Nobody should be using VSS. There have always been better solutions. Stackoverflow claims the database is the bottleneck
Stephan Eggermont
@Stephan Eggermont. This site works fine, doesn't it? VSS is just an example that specialized databases are not always great.
Theo
So what? Everyone can make a database that loses code. The point is not that the site works fine, it is that it could be better (for the developers and maintainers)
Stephan Eggermont
@Stephan Eggermont. Please build your own stackoverflow site.
tuinstoel
+2  A: 

An expletive cartoon on this issue: http://browsertoolkit.com/fault-tolerance.png

This is not my point of view but it shows a little bit of the "Zeitgeist" in the dev world.

tuinstoel
+2  A: 

There is no argument that MapReduce/Hadoop has already proven itself to be a highly scalable and fault-tolerant mechanism for data-intensive operations in the cloud, both for computing and for aggregating. At the end of the day, it’s the same old distributed computing via grid jobs that’s working its magic.
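
For readers who have not seen the model, here is a single-process sketch of the map, shuffle and reduce phases in Python, using the classic word count. It does not use Hadoop's API; the function names are made up, and a real framework would run each phase in parallel across many machines and handle failures.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


documents = ["SQL is old", "SQL is everywhere", "NoSQL is new"]

mapped = [pair for document in documents for pair in map_phase(document)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each map call sees only one document and each reduce call sees only one key, the work can be split across machines without the pieces needing to talk to each other, which is where the scalability comes from.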

The whole NoSQL movement and everything related to it isn't about shedding everything that exists and starting over. It's about making people aware, helping them out of a local maximum so they can see the world beyond it and realize that "There's more than one way to do it" (the Perl mantra); there always is.

Always sticking to traditional approaches or systems for a next-generation or different sort of job surely isn't the way to go; we have to change the solution space when the problem space changes.

I think the future is about hybrid technology, to the point where it won't even need to be called hybrid. We already see combinations of technologies complementing each other: the [Hadoop + HBase + Hive] stack Bradford mentioned, Facebook’s [Hadoop + Cassandra + Hive], and LinkedIn’s [Hadoop + Voldemort + RDBMS (Oracle, MySQL)].

The line between traditional RDBMS and NoSQL systems is already blurring, as we see HadoopDB combining the power of [Hadoop + RDBMS (PostgreSQL) + Hive] to do the job. MongoDB and Yahoo Sherpa are both working to provide a scalable data storage system with as many friendly querying capabilities as possible. (Reference: http://developer.yahoo.net/blog/archives/2009/06/nosql_meetup.html)

Very soon, I believe, big vendors like Oracle are also going to introduce such parallel DBMSs, with a hybrid combination of some of these NoSQL approaches in the backend, as the closed-source data warehousing vendors GreenPlum and Aster Data did.

The future belongs to parallel DBMSs that function as one package, where we don’t need to worry about integrating the components that make it work. Even when that arrives, one solution of course won’t cater to all problems, and the problems will evolve along with our solutions; we will adapt, and should keep picking the hat that (most closely) fits the head.

+12  A: 

The success of relational databases (and they are everywhere today; it's hard to find a business without one, or even tens or hundreds of them) is based in the original insight that key early thinkers had WAY back:

  1. Businesses need their data to be easily accessible (duh) not buried inside a proprietary, hierarchical data store where it requires a full-on program and developer just to interrogate the data. Those early engineers knew this from hard experience. The first insight was "flattening" the data into tables. It seems old fashioned now, but one component to the success of RDBMS was the fact that it's NOT hierarchical, and it's simple to query and recombine your data any way you need to. In fact, those early RDBMS guys worked incredibly hard, with intention, to remove hierarchy/nesting from data storage. Object graphs tend (tend) to be fairly fixed in a typical OO program, and if that fixity carries through to the storage layout you might add convenience for the developer, but you just create hassle and expense in the ad-hoc use of that data.
  2. Following on that notion, businesses needed a straightforward and standardized way to access (query!) their data. When none existed, SQL was a revelation. It's hard to imagine how revolutionary that is when we take it for granted now. Sure SQL is rusty and tired, but stop and really imagine what your life would be like coding against thousands of different nested, maybe binary file storage systems in a world with no SQL.

So, data freed from hierarchy plus a common access method/query language = killer business app. Those two tenets still hold. I agree with others' posts about some of the shortcomings of SQL and RDBMS today -- coding C/++/#/java with text-based queries to hand to the database engine is silly and anachronistic. I would like finally to have a means to directly pass a query "tree" as an object-friendly data structure to/from the DBMS instead of query text. (MS had this for a while but abandoned it.) But in IT we are, ironically, the prisoners of whatever was successful in the recent past.
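
A small sketch of what handing over a query "tree" could look like, in Python. The node classes, the whitelist, and the compiler are all invented for illustration. SQLite only accepts SQL text, so the tree is still compiled to a parameterized statement at the last step; the point is that application code builds a data structure and never assembles query text itself.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Select:
    table: str
    where: object


# Known tables and columns; anything else is rejected (illustrative schema).
ALLOWED = {"pets_table": {"name", "colour"}}


def compile_condition(node, table, params):
    """Turn a condition tree into SQL text, collecting values as parameters."""
    if isinstance(node, Eq):
        if node.column not in ALLOWED[table]:
            raise ValueError(f"unknown column: {node.column!r}")
        params.append(node.value)
        return f'"{node.column}" = ?'
    if isinstance(node, And):
        left = compile_condition(node.left, table, params)
        right = compile_condition(node.right, table, params)
        return f"({left} AND {right})"
    raise TypeError(f"unsupported node: {node!r}")


def compile_select(query):
    if query.table not in ALLOWED:
        raise ValueError(f"unknown table: {query.table!r}")
    params = []
    where = compile_condition(query.where, query.table, params)
    return f'SELECT * FROM "{query.table}" WHERE {where}', params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets_table (name TEXT, colour TEXT)")
conn.execute("INSERT INTO pets_table VALUES ('bob', 'black')")

query = Select("pets_table", And(Eq("name", "bob"), Eq("colour", "black")))
sql, params = compile_select(query)
rows = conn.execute(sql, params).fetchall()
```

A DBMS that accepted the tree directly could skip the text round trip altogether; ORMs and LINQ-style providers do roughly this translation today, on the client side.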

I firmly believe that nothing will supplant the RDBMS until another system fulfills these requirements AND gets a large part of the market: 1. A data store that you can freely ad-hoc query and modify without regard to or need for an end application. Businesses want to see, use, and interrogate their data irrespective of app. 2. A standardized interface for that activity that is straightforward to learn and use for someone like a business analyst. 3. ACID.

If we got those things from an object database, and it was really faster, then it would certainly compete. Nobody cares too much, outside this circle of geeks :-), how the bits get moved around under the hood.

I think the past will influence the future of RDBMSs. They will soldier on, even if they are not glamorous, because they are still the only common solution that has these three properties. I do think we'll see innovation in the interface between the RDBMS and apps; SQL might even get retired from the realm of programs communicating with data engines, and good riddance. ORM, I hope, will become less of a band-aid to automate writing SQL and more of a replacement for SQL. Techniques for scaling out instead of up will probably mature, and they seem promising.

onupdatecascade
-1 The consequence of 1 is terrible data quality. It keeps millions of programmers doing useless work.
Stephan Eggermont
+1 I agree (by the way db constraints enhance the data quality). Data should be open for the customer, the customer is the owner of his/her data. Not the application or the developers of the application. It is very weird that amazon can remove a book from a kindle. The data on the kindle should not be owned by amazon.
tuinstoel
@Stephan - it's true that opening access to the data allows people to make a mess, if they don't know what they are doing. Constraints help, some, but are not utilized enough. Still, I think the benefits of providing access to the data by qualified people far outweigh the risk of granting access to ... erm ... others. Plus, a ham-fisted person can make a mess in any environment.
onupdatecascade
There is no data without context, i.e. application. We learned twenty years ago that you need to keep data and behaviour together. There is no such thing as access by qualified people in a rdbms in real life. Someone has to fix the data problems. Not that management is aware of that, b.t.w.
Stephan Eggermont
Stephan, your argument ignores the daily reality of numerous businesses: sometimes you need to build completely new behaviour on top of existing data. Doing so is much harder if you are forced to restructure the data, or build increasingly complex logic around the existing structure. And you seem to be completely misreading point 1: providing a standardized interface to a very simple structure allows business experts to *interrogate* (note I did not say alter) the data without needing to involve the programmers at all. This *frees* millions of programmers from doing useless work.
Zac Thompson
Stephan, I'd recommend reading The Third Manifesto, which talks about using database technology and logic programming to keep data and behavior together. Or else, put your comments on a blog somewhere and fully develop them.
+7  A: 

I think the future is native object storage. If you look at the work Microsoft is doing with the Entity Framework, the logical evolution is to move the entity mapping into the database. By storing objects in their native format, the need for Object/Relational mapping would be eliminated.

Entity SQL (eSQL: http://msdn.microsoft.com/en-us/library/bb387118.aspx) is already pointing the way.

Shane Cusson
Take a look at Gemstone to see where SQL Server is headed. 30 years later...
Stephan Eggermont
I disagree with this - read my post below. There are no "pure object storages" - they differ only by their high-level API; their low-level components are almost completely the same as in relational databases: there are indexes, query translator, query plan optimizer, evaluator and so on.
Alex Yakunin
And about Entity SQL: it's nothing more than a legacy API. Likely, it appeared when they were developing the famous long-awaited ObjectSpaces, but became obsolete before the supposed release - because another (btw, much more competent) team there brought LINQ. So likely they decided to postpone the release to implement LINQ at least. And as you see now, this is not a clueless implementation: they still use eSQL as the backbone there ;) But I bet it will finally share the destiny of LINQ to SQL, i.e. it will become obsolete in a few years. Who uses it? Why should I use it, if there is LINQ?
Alex Yakunin
@Stephan - Wild stuff. Found it here (gemstone.com/products/smalltalk)
Shane Cusson
@Alex - I was implying the future is a move towards pure object databases, away from relational. People are trying to drop relational in one jump, like db4o (db4o.com, great product by the way) but I think it'll be evolution not revolution. So the Entity Framework, MS's OR/M that uses EntitySQL, is an evolutionary step towards pure OO databases.
Shane Cusson
So what is a "pure" OODB? :) A DB with an OO API (like LINQ) as the primary one? Then it's just an API requirement - i.e. exactly what I'm talking about. Or does "OO" imply something else, like "storing the objects natively"? If you don't know, this is pure marketing instead of pure OODB ;)
Alex Yakunin
+1  A: 

Relational databases aren't going away, barring some brilliant innovation (quantum computing?) or a fundamental shift in IT architecture. Calling it a legacy technology misses the picture, given that the design & principles are very solid and are STILL extremely relevant.

Right now, if there's a flaw with the modern RDBMS it's that most provide "everything".

  • Full set of ACID properties, yet tunable.
  • Reasonably good reporting capabilities.
  • General-purpose.
  • Third generation programming language, which is auxiliary in my view (triggers notwithstanding).

The cost? Quite decent at "everything" but (sometimes) not good enough for certain, very important needs. Especially when you're pushing boundaries. In this vein, I think database vendors will have to offer specialized engines for different tasks.

Sybase, for instance, offers two different RDBMSs: Sybase ASE, which is their normal transactional database, and Sybase IQ, which is their data warehouse.

Sybase IQ has created a name for itself as being a very, very good data warehouse. The transaction speed is sub-par but SELECT queries with huge #s of joins are blazingly quick, because of a very different approach to how records are stored.
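The post does not say what that different approach is. Sybase IQ is usually described as a column-oriented store, so the toy sketch below assumes that is what is meant. The data and names are invented, and a real engine adds compression and indexing on top of this idea.

```python
# Row-oriented layout: each record is stored together (good for transactions).
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 45},
]

# Column-oriented layout: each attribute is stored together (good for analytics).
columns = {name: [row[name] for row in rows] for name in rows[0]}

# A row store has to walk over every whole record to total one attribute...
total_from_rows = sum(row["amount"] for row in rows)

# ...while a column store only reads the one column the query asks for.
total_from_columns = sum(columns["amount"])
```

The flip side matches the sub-par transaction speed mentioned above: inserting one record into the column layout means touching every column list.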

I've also seen proprietary databases which are specialized for high-speed transactions. They can handle tens of thousands of transactions without a hitch and still provide the full set of ACID properties by sacrificing generality.

hythlodayr
You are missing the fundamental shift in architecture. Lots of RAM and flash storage means no need for an RDBMS.
Stephan Eggermont
No way. Applications where that's true probably shouldn't have been using a general-purpose RDBMS in the first place. Things like transactions and relational integrity are extremely important for many applications.
hythlodayr
+5  A: 

Perhaps the best answer to your question would be to look at the spread of SQLite.

SQLite has enabled relational technology to scale down. It is now running on a remarkable number of small devices (Symbian mobile phones, any computer with Firefox installed, etc.).

There are probably more instances of SQLite deployed on a single obscure platform (Android, anyone?) than all the object database deployments ever.

Not bad for dead technology.

James Anderson
+1  A: 

RAM shared between machines (so you don't need a DB as long as some of your cluster stays up) or persistent RAM could make the disk-based DB obsolete. It would need to be cheap though...

mcintyre321
Like, $60 a month or so? Just take out a subscription to three different virtual private servers.
Stephan Eggermont
+1  A: 

Probably the data will become more distributed and the SQL language will be superseded by a more powerful, more natural language.

The focus will be on major parallelism and data warehousing with capacity for storage like we've never seen before.

We may see storage technology and database engines merge for increased performance. We're on the verge of a breakthrough in storage technology that will bring a jump in capacity like we haven't seen since the move from the floppy disk to the hard drive.

In the meantime, current database technology and SQL will remain for several more years yet.

Matt H
+1  A: 

I'm not sure if my point of view applies directly to "database technology", but I would like to see the departure of specialized database languages such as T-SQL, PL/SQL, stored procs, etc. etc. as I purely see this technology as the major reason why most companies don't move from RDBMS 'A' to RDBMS 'B'. While from a sales point, these types of technologies are meant to be a positive benefit for the customer, they ultimately become a lock-in to the vendor.

If all database work had no stored procedures or specialized code for that specific product, then moving from database product 'A' to product 'B' would be much more affordable and feasible. That's not to say T-SQL, PL/SQL and others aren't technically good or bad. They all have their own merits. But I see these database languages more often as a wedge to keep the vendor in business and the customer paying rather than really innovating newer and possibly better solutions than what's been done in the past.

In addition, most employees who're skilled in these specialized languages also see no problem as such because their marketability is in parallel with the vendor's offerings. So, as long as companies want PL/SQL developers because they're forced to keep using the same product, people will flock to work for them and keep working on these locked-in technologies.

Of course, no one can predict the future and maybe 10-15 years from now, PL/SQL will possibly be gone in favor of some other favorite flavored language of the decade. It's hard to say. Will these procedural database languages become the new-age COBOL of our generation? I don't know. Maybe I just feel a little strange every time I write a T-SQL proc and wonder "will this procedure still be worked on after my lifetime?".

osij2is
C# is very good but it binds you to the Microsoft OS (who uses mono?), there is almost always some kind of lock in.
tuinstoel
C# isn't 100% locked in though. It's approved by Ecma (ECMA-334) and ISO (ISO/IEC 23270). But to your point, the BCL of the .NET framework IS proprietary but the language is not. So in terms of languages, we (as developers) have a lot more freedom because interoperability between languages and platforms is crucial to customers and thus vendors bend on this issue. Hell, Microsoft now supports PHP on IIS7. Languages are no longer the barrier they *used* to be. But databases are a different matter. Vendors (I think) realized that data is more important than the means by which you deliver it.
osij2is
I should have written .net framework instead of C#.
tuinstoel
+6  A: 

Current trends:

  • LINQ becomes more and more popular. It makes low-level query APIs (e.g. SQL) much less important.
  • LINQ queries operate on entities rather than table rows, so supporting LINQ implies that there is either some kind of OR/M on the client, or that we are dealing with a pure object database (OODB).
  • BLL tends to migrate out of the database to a dedicated part of the application. Stored procedures & in-database logic are used less frequently. And LINQ plays a significant part here.

Consequences:

  1. One of the strongest requirements for future databases is LINQ support in their client-side software.
  2. As a consequence of 1, OR/M frameworks will become much more popular (as I wrote earlier, LINQ implies there is some OR/M).
  3. The low-level query API / language of future data storages may differ from SQL - again, because there is LINQ, which is much more convenient.
  4. Storages become less and less complex. Industry leaders moved away from complex monsters supporting "everything" (e.g. Oracle) to their own implementations, which are relatively simple, both internally and in terms of API. I mean Google BigTable and Amazon SimpleDB. Azure Table Services show that Microsoft also tested this direction first, but now they don't offer just this option: they added SQL Data Services on the Azure platform. An obvious decision, since their clients are used to T-SQL.
  5. As a consequence of 4, the low-level query API may not support execution of any complex logic. Earlier that was attractive, since (T)SQL played the role of a BLL language as well. Now implementing this logic in LINQ & .NET is the mainstream direction.

So I feel that all the storage platform components will be less and less coupled. There are:

  1. Low-level storage engines. Likely, w/o SQL & stored procs, but with their simple query language focused purely on queries.
  2. OR/M frameworks acting much more like clients+DALs for them. These OR/Ms must support LINQ as their primary query API. Support for a large number of storage platforms is one more aspect they must address.
  3. Business logic layer using a particular OR/M. I feel targeting multiple OR/Ms here is a much less attractive option, since what people want is storage platform transparency, but this is already achievable because of 2.
  4. At this point all further layers are fully abstracted from the storage, so discussing them here is out of the subject of this question ;)

If you're interested, there is a more detailed article explaining this vision. Here it is.

Alex Yakunin
One more proof that BLL is moving to high-level languages out of databases is the availability of an experimental Software Transactional Memory (STM) implementation for .NET 4.0. Note that this is really a rather complex component requiring JIT compiler-level integration (or modification).
Alex Yakunin
+1  A: 

I believe that future databases will implement some kind of Reasoning Engine, like Prolog does.

Here is a good and interesting (IMO) article, Databases - A New Frontier

Two fragments from the above link,

[...]

Prolog is traditionally described as "a logic language". This definition has a plethora of interpretations and if you've never programmed in Prolog it'll likely confuse you more than help you. So, for a quick introduction to Prolog, it helps to think of it as a combination of three things:

  1. A database defined with a set of relations
  2. A language to query this database
  3. A curious mathematical property that makes Prolog a general purpose programming language

[...]

Note that the reasoning engine behind Prolog gives us more expressive power than SQL. In SQL we can specify the kind of data we want, and let the RDBMS figure out how to obtain it. However, if we want to obtain any information from our data that isn't explicit in its representation, we must code it up in a more traditional, procedural (or functional, if the flavor of SQL is any good) style. With Prolog, we can request complex reasoning to be performed in an entirely declarative style, and let Prolog figure out how to obtain the results that we need.
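As a rough illustration of obtaining information that "isn't explicit in its representation", the sketch below derives ancestor pairs from stored parent facts. It is written in Python rather than Prolog, and it uses naive bottom-up evaluation (closer to Datalog) instead of Prolog's actual resolution strategy; the facts and names are invented.

```python
def ancestors(parent_facts):
    """Derive every (ancestor, descendant) pair from (parent, child) facts.

    Encodes two rules, written here in Prolog notation:
        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
    """
    derived = set(parent_facts)  # rule 1: every parent is an ancestor
    while True:
        # rule 2: join the parent facts with the ancestors found so far
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in derived
            if y == y2
        } - derived
        if not new:  # fixed point reached, nothing more can be inferred
            return derived
        derived |= new


facts = {("ann", "bob"), ("bob", "cid"), ("cid", "dee")}
result = ancestors(facts)
```

Only three facts are stored, yet the pair ("ann", "dee") is in the result, which is the kind of derived knowledge the quoted article has in mind.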

Nick D
Prolog, the semantic web, or something else? What will the future be?
tuinstoel
+2  A: 

The future will be RDBMS but with more features such as OO, column-oriented, whatever. Consider RDBMS as cockroaches: you'll never get rid of them and they'll keep coming back stronger.

One point often forgotten is that RDBMSs are based on proven mathematical techniques: set theory etc. You can express most simple SQL statements as Venn diagrams, which I was learning at school before my voice broke.
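The Venn-diagram view maps directly onto set operations. A small Python sketch, with invented customer names, pairing each operation with the SQL keyword it loosely corresponds to:

```python
customers_2008 = {"acme", "globex", "initech"}
customers_2009 = {"globex", "initech", "umbrella"}

both_years = customers_2008 & customers_2009    # INTERSECT: the overlap of the two circles
either_year = customers_2008 | customers_2009   # UNION: everything inside either circle
lost_in_2009 = customers_2008 - customers_2009  # EXCEPT: the left circle minus the overlap
```

An inner join on the customer name would likewise keep only the overlapping region.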

What is your client language based on?

gbn
A: 

I would check out this video, it has some interesting comments, and is all about thinking what you want from your persistence layer.

http://blip.tv/file/1949416

I personally don't think the RDBMS is going to go away, but its use is going to change, and I think we'll see a lot more support for concepts like "eventually consistent" in attempts to increase performance.
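For anyone unfamiliar with the term, here is a toy sketch of "eventually consistent" in Python. The Replica class, the last-write-wins rule and the caller-supplied timestamps are all simplifying assumptions for illustration; real systems use more careful conflict resolution.

```python
class Replica:
    """Hypothetical replica that keeps, per key, the value with the newest timestamp."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)  # last write wins

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        """Pull the other replica's entries; stale ones are ignored by write()."""
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


a, b = Replica(), Replica()
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["book", "pen"], timestamp=2)  # accepted locally, without waiting for a

stale = a.read("cart")  # a has not heard about the newer write yet

a.sync_from(b)
b.sync_from(a)          # after exchanging state, both replicas agree
```

Writes are fast because no replica waits for the others, and the price is that a reader may briefly see the stale value.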

Codek