views:

163

answers:

2

I am starting a new project trying to create a web application for an electronic programme guide for TV. This is not going to be a large scale project, but just something that I can use to learn web application programming. I am going to be using PHP for this.

Before beginning coding for the application, there is a crucial question that is bugging me. Since the application would involve large use of database and/or XML files for data storage, I am confused how to implement the architecture for the application. Please bear with me for this beginner question.

How should I be implementing the architecture of the application?

For example, there are going to be around 50 channels with (50 * N) number of shows. What I am thinking is:

  • Put the show description in an XML file.
  • Put the channel names in a table in the database.
  • Put the show names and IDs in another tables and fetch show description from the XML document.

What the above architecture lacks is how to actually implement the time-tracking. I mean I know when a particular show starts and ends but where do I "best" store that information? In the database or in the XML file? And how do I "best" display the information?

Do you have a better suggestion than the above architecture?

+4  A: 

From the kind of data that I assume TV programme guides store, it really looks like you can store everything inside a relational database. I see no advantage in using the filestsytem or XML files.

The queries for time-tracking should be very straightforward in SQL.

You could consider using a schema such as the following (using MySQL in this example):

CREATE TABLE shows (
   show_id int NOT NULL PRIMARY KEY,
   name varchar(100),
   description text
) ENGINE=InnoDB;

CREATE TABLE channels (
   channel_id int NOT NULL PRIMARY KEY,
   name varchar(100)
) ENGINE=InnoDB;

CREATE TABLE channel_slots (
   slot_id int NOT NULL PRIMARY KEY,
   channel_id int NOT NULL,
   day date NOT NULL,
   show_id int NOT NULL,
   start datetime,
   end datetime,
   FOREIGN KEY (channel_id) REFERENCES channels(channel_id),
   FOREIGN KEY (show_id) REFERENCES shows(show_id)
) ENGINE=InnoDB;

The shows table should define each show. show_id is a surrogate key, and you can even make it generate a unique serial number automatically. The name field is just name field, and the description field has a text data type which can store a variable amount of text.

The channels table should be quite straightforward. Again we're using a surrogate key as a channel_id. I'm not sure if channels have some unique standard code that can be used as a natural key instead, but you should be safe with a surrogate key.

Then the channel_slots table allocates show slots to each day of each channel.

I might be wrong, but I think most TV programme guides do not strictly define a day as starting and ending at midnight. Sometimes the day might at at 2:00am of the following day, and a programme that start at 1.30am and ends at 2.00am would make part of that day. If this is the case, that's the reason for using a day field in this table. In this field we can define the day that this show belongs to, in terms of "programme guide days".

The slot_id is again a surrogate key, and the channel_id and show_id fields are foreign keys to the relevant table. The start and end fields simply define the accurate start and end times of the show. If you will be inserting shows where the show times are not yet defined, you may want to insert NULL in these fields. Another option could be to use another field as a flag to mark if show times are confirmed of not.

If you are going to use MySQL as your DBMS, note that the InnoDB storage engine supports foreign key constraints, while the default MyISAM engine does not. However, only the MyISAM engine supports full text indexing. This may be useful if you intend to allow your users to search for text inside the show descriptions.

To give you an example of the above schema, let's populate some data in it:

INSERT INTO shows VALUES (1, 'Breakfast Show', 'The everyday morning show');
INSERT INTO shows VALUES (2, 'Who wants to be a Millionaire?', 'Who does not?');
INSERT INTO shows VALUES (3, 'Saturday Night Live', 'Only on Saturdays');

INSERT INTO channels VALUES (1, 'Channel 1');

INSERT INTO channel_slots VALUES(
   1, 1, '2010-07-17', 1, '2010-07-17 07:00:00', '2010-07-17 09:00:00');

INSERT INTO channel_slots VALUES(
   2, 1, '2010-07-17', 2, '2010-07-17 18:00:00', '2010-07-17 19:00:00');

INSERT INTO channel_slots VALUES(
   3, 1, '2010-07-17', 3, '2010-07-17 23:30:00', '2010-07-18 01:00:00');

This is how our tables look like now:

mysql> SELECT * FROM channels;
+------------+-----------+
| channel_id | name      |
+------------+-----------+
|          1 | Channel 1 |
+------------+-----------+
1 row in set (0.00 sec)

mysql> SELECT * FROM shows;
+---------+--------------------------------+---------------------------+
| show_id | name                           | description               |
+---------+--------------------------------+---------------------------+
|       1 | Breakfast Show                 | The everyday morning show |
|       2 | Who wants to be a Millionaire? | Who does not?             |
|       3 | Saturday Night Live            | Only on Saturdays         |
+---------+--------------------------------+---------------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM channel_slots;
+---------+------------+------------+---------+---------------------+---------------------+
| slot_id | channel_id | day        | show_id | start               | end                 |
+---------+------------+------------+---------+---------------------+---------------------+
|       1 |          1 | 2010-07-17 |       1 | 2010-07-17 07:00:00 | 2010-07-17 09:00:00 |
|       2 |          1 | 2010-07-17 |       2 | 2010-07-17 18:00:00 | 2010-07-17 19:00:00 |
|       3 |          1 | 2010-07-17 |       3 | 2010-07-17 23:30:00 | 2010-07-18 01:00:00 |
+---------+------------+------------+---------+---------------------+---------------------+
3 rows in set (0.00 sec)

Now let's say that the time now is 2010-07-17 17:45:00 and you want to display what is the next show on Channel 1:

SELECT    s.name, cs.start, cs.end 
FROM      channel_slots cs
JOIN      shows s ON (s.show_id = cs.show_id)
WHERE     cs.start > NOW()
ORDER BY  cs.start
LIMIT     1;

Result:

+--------------------------------+---------------------+---------------------+
| name                           | start               | end                 |
+--------------------------------+---------------------+---------------------+
| Who wants to be a Millionaire? | 2010-07-17 18:00:00 | 2010-07-17 19:00:00 |
+--------------------------------+---------------------+---------------------+
1 row in set (0.00 sec)

Then the following query displays the remaining schedule of the day for Channel 1:

SELECT    s.name, cs.start, cs.end 
FROM      channel_slots cs
JOIN      shows s ON (s.show_id = cs.show_id)
WHERE     cs.start > NOW() AND
          cs.day = '2010-07-17'
ORDER BY  cs.start;

Result:

+--------------------------------+---------------------+---------------------+
| name                           | start               | end                 |
+--------------------------------+---------------------+---------------------+
| Who wants to be a Millionaire? | 2010-07-17 18:00:00 | 2010-07-17 19:00:00 |
| Saturday Night Live            | 2010-07-17 23:30:00 | 2010-07-18 01:00:00 |
+--------------------------------+---------------------+---------------------+
2 rows in set (0.00 sec)

And so on. I hope this gets you going in the right direction. You should also make sure to research on database indexes, an important topic not covered in my answer.

Daniel Vassallo
Indeed, the only reason not to is when worried about stress on getting related data from the database, in which case just cache the more complex objects built from the database information in something like memcache or apc.
Wrikken
A: 

There's a couple of "streams" of thought which you need to keep in mind:

  • "Business issues" (how to make a cool TV web app - focused on business value)
  • "Architecture issues" - how to structur the app / cool technologies.
  • "Programming issues" - which code to write / cool technologies to use. (how to make a cool web app - focused on technical self learning)

First, rank these in order of priority. If this is just about learing PHP then it doesn't really matter too much (in the short term) about whether its a good "TV web app" or not.

If the business side is important (for any reason) then you'll need to give it due dilligence, in which case I'd start by thinking about how the data hangs together. Draw some simple conceptual models on paper and pinm them up on your wall where your working (or, do them on a white-board, takje a photo of them (say using your phone) and print out a copy).

You're going to have entities that are related (channels, shows, times). Once you have the high level sorted out, start working at a lower level - probably by designing database tables on paper.

This is when Architecture vs Programming comes into play; if it were me I'd stick all the data in one back-end system - say a relational database - because all the data is related it's better to keep it all in the same place:

  • You only need to develop one DAL (Data Access Layer)
  • You can ask questions that span various related entities (get me all shows between 2 - 5pm for all channels).

Alternatively, if this is an exercise is learning technical skills I can see why you'd want to have XML here and a DB here - because it gives you a greater range of things you'll learn.

Architecture can help you here (and is another great skill to learn). If it were me (and I was doing this in .Net) I'd abstract all the data access out behind an interface(s) - that way I could have as many different physcial DAL's as I wanted; have a look at the Dependency Inversion Principle

Adrian K