Hey Everyone,

I have limited SQL background, basically a small amount of manipulation through HTML and mostly with pre-existing databases. What I am trying to do is set up a database that will store time information for bus routes. So basically I have different routes with stops for each route and then a list of times that the bus arrives at each stop. Here is an example of a table of times from their website: Link.

I am wondering what would be the best way to lay out my database/tables?

Also what is the purpose of the _id field in each table?

Thanks, Rob!

P.S. Sorry if my lack of knowledge on the subject has caused me to post a duplicate question.

A: 

The id field gives each row a unique identifier (its primary key) that other tables in the database can reference. A column in another table that holds such a reference is called a foreign key. This lets you keep detailed information about, say, a student (name, address, etc.) in one table and then refer to that student unambiguously from another table that lists the members of a class.

There are usually many ways to lay out a database; which is best depends on what you want to do with the data. Most of this design is not unique to SQLite, so reading up on relational databases in general might be worthwhile.
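As a rough illustration of the student/class idea, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (student, class_member, student_id) and the sample data are made up for the example; they are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: each row gets an automatically assigned integer id.
conn.executescript("""
CREATE TABLE student (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    address TEXT
);
CREATE TABLE class_member (
    id         INTEGER PRIMARY KEY,
    class_name TEXT NOT NULL,
    student_id INTEGER NOT NULL REFERENCES student(id)  -- foreign key
);
""")

# No id is supplied; SQLite picks one, and lastrowid reports it.
cur = conn.execute(
    "INSERT INTO student (name, address) VALUES (?, ?)", ("Ada", "1 Main St")
)
ada_id = cur.lastrowid

# The membership row points at the student by id instead of copying her details.
conn.execute(
    "INSERT INTO class_member (class_name, student_id) VALUES (?, ?)",
    ("Databases 101", ada_id),
)

rows = conn.execute("""
    SELECT s.name, m.class_name
    FROM class_member AS m
    JOIN student AS s ON s.id = m.student_id
""").fetchall()
print(rows)

# A reference to a student id that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO class_member (class_name, student_id) VALUES (?, ?)",
        ("Databases 101", 999),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The point of the id is that class_member never repeats the student's name or address; it only stores the number that identifies her row.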

One way to lay out your specific example:

Route table: id, name

Stop table: id, name

Time table: id, foreign key to route, foreign key to stop, arrival time
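A minimal sketch of that three-table layout, again with Python's sqlite3 module. Some details here are assumptions rather than part of the answer: the time table is called stop_time, an arrival column is included to hold the time itself as 'HH:MM' text, and the routes, stops and times are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE route (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE stop  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE stop_time (
    id       INTEGER PRIMARY KEY,
    route_id INTEGER NOT NULL REFERENCES route(id),
    stop_id  INTEGER NOT NULL REFERENCES stop(id),
    arrival  TEXT NOT NULL  -- assumed format: zero-padded 24-hour 'HH:MM'
);
""")

# Invented sample data.
conn.executemany("INSERT INTO route (id, name) VALUES (?, ?)",
                 [(1, "Red"), (2, "Blue")])
conn.executemany("INSERT INTO stop (id, name) VALUES (?, ?)",
                 [(1, "Mall"), (2, "Library")])
conn.executemany(
    "INSERT INTO stop_time (route_id, stop_id, arrival) VALUES (?, ?, ?)",
    [(1, 1, "07:00"), (1, 2, "07:12"), (1, 1, "08:00"), (2, 2, "07:30")],
)

def arrivals(route_name, stop_name):
    """Return every arrival time for one route at one stop, earliest first."""
    rows = conn.execute("""
        SELECT t.arrival
        FROM stop_time AS t
        JOIN route AS r ON r.id = t.route_id
        JOIN stop  AS s ON s.id = t.stop_id
        WHERE r.name = ? AND s.name = ?
        ORDER BY t.arrival
    """, (route_name, stop_name)).fetchall()
    return [row[0] for row in rows]

print(arrivals("Red", "Mall"))
```

Zero-padded 'HH:MM' text sorts in chronological order, which is why a plain ORDER BY works here; a different time representation would need a different comparison.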

Mayra
A: 

You asked two questions:

  1. The 'id' field is a common convention for uniquely identifying each row. It is not strictly required: data that already carries its own unique serial number for each row can use that serial number instead. Programs often need to look up an entire record by this identifier, so the field should be the primary key. In MySQL the commonly used clause is "id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(id)"; in SQLite the equivalent is "id INTEGER PRIMARY KEY". Either way, a unique id is assigned automatically whenever you add a new record. See Tizag's MySQL Tutorial for a verbose breakdown. Some people name these fields with an "_id" suffix, e.g., "route_id", "bus_id"; your style may vary.

  2. How you lay out your data depends a bit on what you want to do with it. My guess is that for each bus route, you want to find the next bus at a particular stop after a given time, and then when that bus arrives at some later stop. Each route has many stops; a stop might be shared among routes; a route varies during the day by skipping some stops; routes go in two directions; etc. I would add the abstraction of a 'run', which is one bus's trip through the stops from start to finish.

My first guess at laying it out would be:

A bus route has a bus name, number, and other info: "#1 Red", "South And West", "SF Muni", "map://..."

A stop has a unique name and other info: "North Grand Mall", "South", "Under the cherry, west of grand avenue."

A run has a route number, days of the week, a direction, maybe a sequence number of the run in that direction that day, maybe a start/stop time for the run: "63 (route_id for #1 Red)", "MTWRF" (weekdays), "reverse direction", "2" (2nd run), "7:00", "7:36".

Then you have a schedule table of "Run_id", "Stop_id", "Stop Sequence Number", "Time".

Write out your planned schema, and then start trying to write queries against it. Don't be surprised by a few multi-table joins: "SELECT * FROM Schedule AS S, Run AS R, Bus AS B WHERE B.id = R.bus_id AND S.run_id = R.id AND B.id = (requested bus) AND S.time >= now() AND ...". You denormalize for performance after it works.
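To make the run/schedule idea concrete, here is a minimal sketch in Python's sqlite3 module. Treat it as one possible reading of the layout above, not a finished design: the table is named route rather than Bus, the column names (stop_seq, arrival_time), the 'HH:MM' text format, the second stop "City Hall", and both helper functions are assumptions made for the example, and the days-of-week check is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE route (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE stop  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE run (
    id        INTEGER PRIMARY KEY,
    route_id  INTEGER NOT NULL REFERENCES route(id),
    days      TEXT NOT NULL,      -- e.g. 'MTWRF'
    direction TEXT NOT NULL
);
CREATE TABLE schedule (
    run_id       INTEGER NOT NULL REFERENCES run(id),
    stop_id      INTEGER NOT NULL REFERENCES stop(id),
    stop_seq     INTEGER NOT NULL,  -- position of the stop within the run
    arrival_time TEXT NOT NULL,     -- assumed format: zero-padded 'HH:MM'
    PRIMARY KEY (run_id, stop_seq)
);
""")

# Sample data: two weekday runs of route 63 over two stops.
conn.execute("INSERT INTO route (id, name) VALUES (63, '#1 Red')")
conn.executemany("INSERT INTO stop (id, name) VALUES (?, ?)",
                 [(1, "North Grand Mall"), (2, "City Hall")])
conn.executemany(
    "INSERT INTO run (id, route_id, days, direction) VALUES (?, ?, ?, ?)",
    [(1, 63, "MTWRF", "reverse"), (2, 63, "MTWRF", "reverse")],
)
conn.executemany(
    "INSERT INTO schedule (run_id, stop_id, stop_seq, arrival_time) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "07:00"), (1, 2, 2, "07:36"),
     (2, 1, 1, "08:00"), (2, 2, 2, "08:36")],
)

def next_bus(route_id, stop_id, after):
    """Return (run_id, arrival_time) of the first bus at or after `after`, or None."""
    return conn.execute("""
        SELECT s.run_id, s.arrival_time
        FROM schedule AS s
        JOIN run AS r ON r.id = s.run_id
        WHERE r.route_id = ? AND s.stop_id = ? AND s.arrival_time >= ?
        ORDER BY s.arrival_time
        LIMIT 1
    """, (route_id, stop_id, after)).fetchone()

def arrival_on_run(run_id, stop_id):
    """Return when the given run reaches the given stop, or None."""
    row = conn.execute(
        "SELECT arrival_time FROM schedule WHERE run_id = ? AND stop_id = ?",
        (run_id, stop_id),
    ).fetchone()
    return row[0] if row else None

bus = next_bus(63, 1, "07:15")
print(bus)
print(arrival_on_run(bus[0], 2))
```

Passing the time in as a parameter instead of calling now() keeps the query easy to try out with fixed values; a real version would also need to filter runs by the current day of the week.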

That's all my free advice. :)

Charles Merriam