views:

308

answers:

6

Hypothetical example:

I have Cars and Owners. Each Car belongs to one (and only one) Owner at a given time, but ownership may be transferred. Owners may, at any time, own zero or more cars. What I want is to store the historical relationships in a MySQL database such that, given an arbitrary time, I can look up the current assignment of Cars to Owners.

I.e. At time X (where X can be now or anytime in the past):

  • Who owns car Y?
  • Which cars (if any) does owner Z own?

Creating an M:N table in SQL (with a timestamp) is simple enough, but I'd like to avoid a correlated sub-query as this table will get large (and, hence, performance will suffer). Any ideas? I have a feeling that there's a way to do this by JOINing such a table with itself, but I'm not terribly experienced with databases.

UPDATE: I would like to avoid using both a "start_date" and "end_date" field per row as this would necessitate a (potentially) expensive look-up each time a new row is inserted. (Also, it's redundant).

A: 

cars table can have an id called ownerID, YOu can then simply

1.select car from cars inner join owners on car.ownerid=owner.ownerid where ownerid=y

2.select car from cars where owner=z

Not the exact syntax but simple pseudo code.

CodeToGlory
That's fine for the state as of now, but what if I want to see the assignment, say, a week ago?
D Carney
+7  A: 

Make a third table called CarOwners with a field for carid, ownerid and start_date and end_date. When a car is bought fill in the first three and check the table to make sure no one else is listed as the owner. If there is then update the record with that data as the end_date.

To find current owner:

select carid, ownerid from CarOwner where end_date is null

To find owner at a point in time:

select carid, ownerid from CarOwner where start_date < getdate()
and end_date > getdate()

getdate() is MS SQL Server specific, but every database has some function that returns the current date - just substitute.

Of course if you also want additional info from the other tables, you would join to them as well.

select co.carid, co.ownerid, o.owner_name, c.make, c.Model, c.year  
from  CarOwner co
JOIN Car c on co.carid = c.carid
JOIN Owner o on o.ownerid = co.ownerid
where co.end_date is null
HLGEM
The problem with this is that it requires a (potentially expensive) lookup on insertions. I'd rather not use an "end_date" field for this reason (also, it's redundant).
D Carney
lose end_date, to get current owner: `select ownerid from carowner where carid=XX order by start_date desc limit 1` to get owner at a given date: `select ownerid from carowner where carid=XX and start_date<=YY order by start_date desc limit 1`
Javier
what's the problem on insertions? i dont' get it
Javier
@Javier: I like the idea in your comment, an answer of your own would have earned you at least my upvote.
mghie
ifd you properly use a datetime datatype and put an index on start and send date it is not an expensive lookup at all. I don't see where it is redundant either.
HLGEM
The problem with Javier's idea is that you can only search for 1 owner of one car at a time, my solution is much more flexible.
HLGEM
A: 

Given your business rule that each car belongs to at least one owner (ie. owners exist before they are assigned to a a car) and your operational constraint that the table may grow large, I'd design the schema as follows:

(generic sql 92 syntax:)

CREATE TABLE Cars
(
CarID integer not null default autoincrement,
OwnerID integer not null, 
CarDescription varchar(100) not null,
CreatedOn timestamp not null default current timestamp,
Primary key (CarID),
FOREIGN KEY (OwnerID ) REFERENCES Owners(OwnerID )

)

CREATE TABLE Owners
(
OwnerID integer not null default autoincrement,
OwnerName varchar(100) not null,
Primary key(OwnerID )
)

CREATE TABLE HistoricalCarOwners
(
CarID integer not null,
OwnerID integer not null,
OwnedFrom timestamp null,
Owneduntil timestamp null,
primary key (cardid, ownerid),
FOREIGN KEY (OwnerID ) REFERENCES Owners(OwnerID ),
FOREIGN KEY (CarID ) REFERENCES Cars(CarID )
)

I personally would not touch the third table from my client application but would simply let the database do the work - and maintain data integrity - with ON UPDATE AND ON DELETE triggers on the Cars table to populate the HistoricalCarOwners table whenever a car changes owners (i.e whenever an UPDATE is committed on the OwnerId column) or a car is deleted.

With the above schema, selecting the current car owner is trivial and selecting historical car owners is a simple as

select ownerid, ownername from owners o inner join historicalcarowners hco 
                                        on hco.ownerid = o.ownerid
                                        where hco.carid = :arg_id and
                                              :arg_timestamp between ownedfrom and owneduntil
order by ...

HTH, Vince

Vincent Buck
A: 

If you really do not want to have a start and end date you can use just a single date and do a query like the following.

SELECT * FROM  CarOwner co 
WHERE  co.CarId = @CarId 
      AND co.TransferDate <= @AsOfDate 
      AND NOT EXISTS (SELECT * FROM CarOwner co2 
       WHERE  co2.CarId = @CarId 
          AND co2.TransferDate <= @AsOfDate 
         AND co2.TransferDate > co.Transferdate)

or a slight variation

SELECT * FROM  Car ca 
    JOIN CarOwner co ON ca.Id = co.CarId 
  AND co.TransferDate = (SELECT MAX(TransferDate) 
         FROM CarOwner WHERE CarId = @CarId 
                                             AND TransferDate < @AsOfDate)
WHERE co.CarId = @CarId

These solution are functionally equivalent to Javier's suggestion but depending on the database you are using one solution may be faster than the other.

However, depending on your read versus write ratio you may find the performance better if you redundantly update the end date in the associative entity.

Darrel Miller
+2  A: 

I've found that the best way to handle this sort of requirement is to just maintain a log of VehicleEvents, one of which would be ChangeOwner. In practice, you can derive the answers to all the questions posed here - at least as accurately as you are collecting the events.

Each record would have a timestamp indicating when the event occurred.

One benefit of doing it this way is that the minimum amount of data can be added in each event, but the information about the Vehicle can accumulate and evolve.

Also, with the timestamp, events can be added after the fact (as long as the timestamp accurately reflects when the event occurred.

Trying to maintain historical state for something like this in any other way I've tried leads to madness. (Maybe I'm still recovering. :D)

BTW, the distinguishing characteristic here is probably that it's a Time Series or Event Log, not that it's 1:m.

le dorfier
+1  A: 

Why not have a transaction table? Which would contain the car ID, the FROM owner, the TO owner and the date the transaction occcured.

Then all you do is find the first transaction for a car before the desired date.

To find cars owned by Owner 253 on March 1st:

SELECT * FROM transactions WHERE ownerToId = 253 AND date > '2009-03-01'

Rowan Parker