views:

88

answers:

4

I have a table which holds flight schedule data. Every schedule have an effective_from & effective_to date. I load this table from flat file which don't provide me an effective_from and effective_to date. So at the time of loading I ask this information from user.

Suppose user gave from date as current date and to date as 31st March. Now on 1st March the user loads a new flight schedule and user give from date as current date and to date as 31st May.

If I query table for effective date between 1st March to 31st March the query returns me two records for each flight whereas I want only one record for each flight and this should be the latest record.

How do i do this? Should I handle this by query or while loading check and correct the data?

A: 

You need to identify the primary key for the data (which might be a 'business' key). There must be something which uniquely identifies each flight schedule (it sounds like it shouldn't include effective_from. Once that key is established, you check for it when importing and then either update the existing record or insert a new one.

Denis Hennessy
Denis I cannot update the record all the child tables are associated with the old records and to maintain that historical information we need to keep that.
Bhushan
A: 

I assume that each flight has some unique ID, otherwise how can make the difference between them. Then you can add to the schedule thable extra field "Active". When loading in new schedule - query first existing records with the same flight id and set them to Active=false. New record enter with Active=true.

Query is then simple: select * from schedule where active=1

Riho
A: 

I developed this solution but looking for even better solution if possible.

Table Schedule {
scheduleId, flightNumber, effective_from,effective_to
}

Data in Schedule table {
1, XYZ12, 01/01/2009, 31/03/2009
2, ABC12, 01/01/2009, 30/04/2009
}

Now user loads another record 3, XYZ12, 01/03/2009, 31/05/2009

select scheduleId from Schedule where flightNumber = 'XYZ12' and (effective_from < '01/03/2009' and effective_to > '01/03/2009' or effective_from < '31/05/2009' and effective_to > '31/05/2009')

If the above query returns me any result that means its an overlap and I should throw an error to user.

Bhushan
A: 

The problem description and the comment to one of the suggestions gives the business rules:

  • querying flights with an effective date should return only one record per flight
  • the record returned should be the latest record
  • previous schedules must be kept in the table

The key to the answer is how to determine which is the latest record - the simplest answer would be to add a column that records the timestamp when the row is inserted. Now when you query for a flight and a given effective date, you just get the result with the latest inserted timestamp (which can be done using ORDER BY DESC and take the first row returned).

You can do a similar query with just the effective date and return all flights - again, for each flight you just want to return the row that includes the effective date, but with the largest timestamp. There's a neat trick for finding the maximum of something on a group-by-group basis - left join the results with themselves so that left < right, then the maximum is the left value where the right is null. The author of High Performance MySQL gives a simple example of this.

It's much easier than trying to retroactively correct the older schedules - and, by the sound of things, the older schedules have to be kept intact to satisfy your business requirements. It also means you can answer historical questions - you can always find out what your schedule table used to look like on a given date - which means it's very handy when generating reports such as "This month's schedule changes" and so on.

Chris