views:

303

answers:

5

I know, relational database is a database where fields in one table are linking to rows in the others, something like this.

But I can't understand what does it mean for me as a web developer!

As I know, a query with joins and nested select can reduce perfomance (especially drupal-style queries with dozens of joins). Even more, any queries to DB are bottlenecks, and then you have lots of users you have to cache every select request.

If you cache every select request, it's better to cache simple requests rather than complicated. You can either cache "select * from tbl1 where id = 123" and "select * from tbl2 where id = 456" or "select * from tbl1, tbl2 where ...", but if you choose the second way, you'll need to cache every combination of objects - it isn't cool.

Ok, now we use only very simple queries like "select * from tbl1 where id = 123" of "select id from tbl1 order by id limit 0, 30" and cache them (or we can cache only the first type of queries, whatever). There queries and not less simple INSERT, DELETE and UPDATE are all what we need and all what we use!

As we can see, all the relational logic are in the main language of the application, not in SQL. So, why do we need all this relational things? What do they mean? What do "relational" type has what another types hasn't but it is needed? If we don't use relational features, why do everyone still use MySQL or whatever relational databases, even if he care about the perfomance?

This type of databases has become a standard. Why? I have no clue. I've hardly ever heard about somebody using non-relational database, except for the on in GAE.

Am I missing something?

A: 

It allows you to normalize your data and remove redanancy. Rather than store all data in a flat table (like an excel spreadsheet), you store disparate data in separate tables and then relate them to one another.

For example, you can store users in a Users table and products in a Products table and then use a relationship table to relate which user ordered which products.

UserA -> ProductA

UserA -> ProductB

UserB -> ProductA

UserC -> ProductB

With normalized data, it means if data changes, it only needs to be updated in one place. If a user changes their name, only that user record changes. If a product price needs to be raised, only that product record changes. You don't have to scour your flat table looking for duplicate data to replace.

Soviut
what do you mean, the relational type of database is the only type that can has different tables?
valya
it's the only type where you would need to connect two different tables together somehow. I'm sure there are databases that have 0 relations, but I have a hard time thinking about a practical example.
GSto
as I know, BigTable is non-relational, but in GAE we can use tables and there is no problem to use a relations like these. except for joins, but joins are.. are not cool :)
valya
A: 

I'm confused by your question. How else to you propose you keep track of how various tables relate to each other?

for example, I have a list of cars, and a list of people, and I need to connect which person owns each car, so I have a car_ID column in the person database. How would propose keeping track of these relations

Also, you say that you want to cache 'all queries are bottlenecks' and you only want to cache 'simple' queries. However, I'm 90% sure that making multiple small queries will be more resource intensive than making several smaller queries. you also don't have to cache every combination, only the ones that actually exist. in my example, what's wrong with a query like this?

SELECT person.*, car.* from person left join on car where person.car_ID = car.ID
GSto
nothing wrong while database requests don't slow down your application. then they do, you'll have to cache, and it's simpler to cache simple requests
valya
A: 

Relational Databases have become the de-facto database for a number of reasons.

  1. Setting up primary, foreign and unique constraints enforces certain business rules at the lowest levels, helps ensure data integrety, and makes the database relationships easily understandable to just about any level of IT professional.

  2. A properly designed relational database is in fact faster behind the scenes for many process (not all).

  3. The querying of relational database is fairly quick to learn, and easy to do.

  4. Relational databases help limit data duplication, and from a data engineering standpoint, that is a wonderful thing.

and many others, but these are a few.

Jay
A: 

If you don't use relations, you need to store everything in a giant table with numerous number of columns. Or you can use a datacube (I think?)

David
no, no. I don't recommend to forget about tables or IDs. I'm asking, why do every web app, either simple or enormous, use a DB with lots of unnecessary slow features like seven joins in a query and complain about database speed?
valya
When you have several tables in your database, you need JOINs to relate the records in different tables. I guess we just haven't been able to figure a non RDBMS that is just as easy to design, maintain, add/modify/remove tables and columns as RDBMS?
David
+4  A: 

If you want to learn about what relational means, I recommend the book "SQL and Relational Theory" by C. J. Date.

Relational in this context doesn't refer to relationships. It refers to relations which are basically what tables are called in the mathematical theories that led to the relational model.

The reason that relational databases have become ubiquitous is that they are the most general-purpose solution for organizing data with minimum redundancy.

There are valid reasons to use non-relational solutions. They often solve specific tasks of data-management extremely well, but are weak in other areas. Whereas SQL and relational databases strike a compromise, solving a larger set of problems adequately, with fewer areas of weakness.

Other technologies currently available that are not based on the relational model are listed in "The Next-Gen Databases."

Bill Karwin