tags:

views:

26

answers:

2

I'm reading a book, where the author talks about fetching an row + all linked parent rows in one step. Like fetching an order + all it's items all at once. Okay, sounds nice, but really: I've never seen an possibility in SQL to ask for - lets say - one order + 100 items? How would this record set look like? Would I get 101 rows with merged fields of both the order and the item table, where 100 rows have a lot of NULL values for the order fields, while one row has a lot of NULL values for the item fields? Is that the way to go? Or is there something much cooler? I mean... I never heard of fetching arrays onto a field?

+1  A: 

This is normally done through a JOIN clause. This will not result in many NULL values, but many repeated values for the parent row.

Another option, if your database and programming language support it, it to return both result sets in one connection - one select for the parent row another for the related rows.

Oded
+2  A: 

A simple JOIN would do the trick:

SELECT     o.*
,          i.*
FROM       orders o
INNER JOIN order_items i
ON         o.id = i.order_id

The will return one row for each row in order_items. The returned rows consist of all fields from the orders table, and concatenated to that, all fields from the order_items table (quite literally, the records from the tables are joined, that is, they are combined by record concatenation)

So if orders has (id, order_date, customer_id) and order_items has (order_id, product_id, price) the result of the statement above will consist of records with (id, order_date, customer_id, order_id, product_id, price)

One thing you need to be aware of is that this approach breaks down whenever there are two distinct 'detail' tables for one 'master'. Let me explain.

In the orders/order_items example, orders is the master and order_items is the detail: each row in order_items belongs to, or is dependent on exactly one row in orders. The reverse is not true: one row in the orders table can have zero or more related rows in the order_items table. The join condition

ON o.id = i.order_id 

ensures that only related rows are combined and returned (leaving out the condition would retturn all possible combinations of rows from the two tables, assuming the database would allow you to omit the join condition)

Now, suppose you have one master with two details, for example, customers as master and customer_orders as detail1 and customer_phone_numbers. Suppose you want to retrieve a particular customer along with all is orders and all its phone numbers. You might be tempted to write:

SELECT     c.*, o.*, p.*
FROM       customers                c
INNER JOIN customer_orders          o
ON         c.id                   = o.customer_id
INNER JOIN customer_phone_numbers   p
ON         c.id                   = p.customer_id

This is valid SQL, and it will execute (asuming the tables and column names are in place) But the problem is, is that it will give you a rubbish result. Assuming you have on customer with two orders (1,2) and two phone numbers (A, B) you get these records:

customer-data | order 1 | phone A
customer-data | order 2 | phone A
customer-data | order 1 | phone B
customer-data | order 2 | phone B

This is rubbish, as it suggests there is some relationship between order 1 and phone numbers A and B and order 2 and phone numbers A and B.

What's worse is that these results can completely explode in numbers of records, much to the detriment of database performance.

So, JOIN is excellent to "flatten" a hierarchy of items of known depth (customer -> orders -> order_items) into one big table which only duplicates the master items for each detail item. But it is awful to extract a true graph of related items. This is a direct consequence of the way SQL is designed - it can only output normalized tables without repeating groups. This is way object relational mappers exist, to allow object definitions that can have multiple dependent collections of subordinate objects to be stored and retrieved from a relational database without losing your sanity as a programmer.

Roland Bouman