I'm creating a sports statistics database. With it, I'd like to catalog game/match statistics for many types of sports. For example, this database would be able to tell you how many touchdowns the Carolina Panthers scored in the 09-10 season (football), or how many free throws were made by the Miami Heat in their last game (basketball).

I'm having trouble designing one of the more fundamental tables called Matches. The Matches table has columns for:

  • ID (PK *match_id*)
  • date of play (*play_date*)
  • IDs referring to the performances of the teams (FK *team_1_performance_id* and *team_2_performance_id*) in table Performances.

The Performances table holds:

  • ID (PK *perf_id*)
  • team ID (FK *team_id*)
  • And most importantly, all the other stats, like:
      • number of strikes (*)
      • average rushing yards per play (*)
      • percent of 3-pointers made (*)

(*) The problem is, how can I make the Performances table relevant to the respective sport? For example, baseball games have strikes, but soccer and hockey do not (nor does any other sport I can think of). I don't want my Performances table to have a column for strikes when it's only going to be relevant for a portion of the records.

Or do I? Perhaps my design should be different altogether? How would you go about this?

Now, I don't know if this is possible, but one idea I had was to include some kind of performance table ID column in Matches that refers to different performance tables, so that when I query a match's performances, it will look at a specific table. This is where the title of this question comes from (Can an attribute designate one table over another?). Imagine "SELECT team_1_performance.strikes FROM Matches INNER JOIN appropriate_performance_table AS team_1_performance WHERE Matches.performance_table_id = 'Baseball'". How could I designate appropriate_performance_table, if that's even possible?

Another idea I had was to create matches tables for all the sports, like Rugby_Matches or Football_Matches, and then respective performance tables for those sports, like Rugby_Performances or Football_Performances. This just seems like a lot of tables that represent somewhat similar things.

If you can, try to keep your responses MySQL specific.

Thanks!

+1  A: 

Your idea to create sports-specific tables is generally what is done.
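
A minimal sketch of what that could look like in MySQL (InnoDB assumed). Only match_id, play_date, perf_id, team_id, and the stat names come from the question; Teams, team_name, and the exact column types are hypothetical. It also assumes each performance row points at its match, rather than Matches pointing at two performance rows, so that Matches can stay sport-agnostic.

```sql
-- Shared tables: one row per team and per match, whatever the sport.
CREATE TABLE Teams (
  team_id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  team_name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE Matches (
  match_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  play_date DATE NOT NULL
) ENGINE=InnoDB;

-- One performance table per sport, holding only that sport's stats.
CREATE TABLE Baseball_Performances (
  perf_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  match_id INT UNSIGNED NOT NULL,
  team_id  INT UNSIGNED NOT NULL,
  strikes  SMALLINT UNSIGNED NOT NULL,
  UNIQUE KEY uq_baseball_match_team (match_id, team_id),
  FOREIGN KEY (match_id) REFERENCES Matches (match_id),
  FOREIGN KEY (team_id)  REFERENCES Teams (team_id)
) ENGINE=InnoDB;

CREATE TABLE Football_Performances (
  perf_id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  match_id      INT UNSIGNED NOT NULL,
  team_id       INT UNSIGNED NOT NULL,
  touchdowns    SMALLINT UNSIGNED NOT NULL,
  interceptions SMALLINT UNSIGNED NOT NULL,
  avg_rushing_yards_per_play DECIMAL(5,2) NOT NULL,
  UNIQUE KEY uq_football_match_team (match_id, team_id),
  FOREIGN KEY (match_id) REFERENCES Matches (match_id),
  FOREIGN KEY (team_id)  REFERENCES Teams (team_id)
) ENGINE=InnoDB;

-- Strikes for each team in a given baseball match (match 1 is a placeholder).
SELECT t.team_name, p.strikes
FROM Matches AS m
JOIN Baseball_Performances AS p ON p.match_id = m.match_id
JOIN Teams AS t ON t.team_id = p.team_id
WHERE m.match_id = 1;
```

With this layout the query names the sport's table directly, so no column has to "designate" a table at run time.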

Dave Markle
yes yes, wide not long... good plan.
Stephanie Page
+2  A: 

Instead of going across, store the data going down.

So you would have

The Performances table holds:

  • ID (PK *perf_id*)
  • team ID (FK *team_id*)
  • Performance Stat Type
  • Performance Stat Value

Or something like that.

You will then also have to create a Rules table that links specific Performance Stat Types to specific Sport Types.

This will then also allow you to easily add new Performance Stat Types without majorly impacting your database schema.

You can then also implement display orders, or even display groupings if you like.
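
As a rough sketch of this layout in MySQL: the names Stat_Types, Stat_Rules, sport, stat_type_id, stat_value, and display_order are all assumptions made for illustration, and the team_id foreign key is left as a plain column so the block stands on its own.

```sql
-- Lookup of every stat that can be recorded (hypothetical table).
CREATE TABLE Stat_Types (
  stat_type_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name         VARCHAR(50) NOT NULL UNIQUE
) ENGINE=InnoDB;

-- "Rules": which stat types apply to which sport, plus a display order.
CREATE TABLE Stat_Rules (
  sport         VARCHAR(30) NOT NULL,
  stat_type_id  INT UNSIGNED NOT NULL,
  display_order SMALLINT UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY (sport, stat_type_id),
  FOREIGN KEY (stat_type_id) REFERENCES Stat_Types (stat_type_id)
) ENGINE=InnoDB;

-- One row per recorded stat value ("data going down").
CREATE TABLE Performances (
  perf_id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  team_id      INT UNSIGNED NOT NULL,   -- would reference the teams table
  stat_type_id INT UNSIGNED NOT NULL,
  stat_value   DECIMAL(10,3) NOT NULL,
  FOREIGN KEY (stat_type_id) REFERENCES Stat_Types (stat_type_id)
) ENGINE=InnoDB;

-- Adding a new stat is an INSERT, not a schema change.
INSERT INTO Stat_Types (name) VALUES ('Strikes');
INSERT INTO Stat_Rules (sport, stat_type_id, display_order)
VALUES ('Baseball', LAST_INSERT_ID(), 1);
```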

astander
Your answer is along the same lines as Charles Bretana's. Thanks for the suggestion. Especially about the data going down, not across. I wasn't thinking that way, but it makes a lot of sense in this case.
reverebeer
Please don't do this. This is called an EAV... look up, on this site and others, all of the issues with EAVs. Consider this question: how many teams last year scored more touchdowns than interceptions?
Stephanie Page
+2  A: 

Create a "Metrics" (or "Stats") table, that defines the different things you will measure.

  Table Metrics
    MetricId int,
    MetricName (Runs Batted In, Touchdowns, FreeThrows, etc.)
    MetricAbbreviation Nullable?
    Sport (That Metric belongs to )

Then your MatchStatistics table will have

  Table MatchStatistics
    MatchId   
    MetricId
    MetricValue Decimal

The PK on this table would be MatchId and MetricId. You could also have a PlayerStatistics table that would look similar, except it would have PlayerId instead of MatchId.
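
One possible MySQL rendering of these two tables, followed by a lookup. The column types and lengths are guesses, and the match id used in the query is a placeholder.

```sql
CREATE TABLE Metrics (
  MetricId           INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  MetricName         VARCHAR(50) NOT NULL,   -- e.g. 'Runs Batted In'
  MetricAbbreviation VARCHAR(10) NULL,
  Sport              VARCHAR(30) NOT NULL    -- sport the metric belongs to
) ENGINE=InnoDB;

CREATE TABLE MatchStatistics (
  MatchId     INT NOT NULL,
  MetricId    INT NOT NULL,
  MetricValue DECIMAL(10,3) NOT NULL,
  PRIMARY KEY (MatchId, MetricId),           -- composite key, as described
  FOREIGN KEY (MetricId) REFERENCES Metrics (MetricId)
) ENGINE=InnoDB;

-- All recorded metrics for one match, with readable names.
SELECT m.MetricName, s.MetricValue
FROM MatchStatistics AS s
JOIN Metrics AS m ON m.MetricId = s.MetricId
WHERE s.MatchId = 1;
```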

Charles Bretana
This sounds great! And easy to implement. Thanks!
reverebeer
A: 

If you go with astander's design, compare the queries you'd need to find all the wins for Carolina in which they scored more touchdowns than interceptions.

If you did it the right way, in columns, you'd see

SELECT * FROM Football_stats fs
WHERE fs.team_fk = (something that resolves to Carolina)
  AND fs.outcome = 'Win' AND fs.touchdowns > fs.interceptions

in the EAV world you'd get

SELECT game_id FROM football_stats fs WHERE fs.team_fk = [Carolina] AND fs.stat_type = 'Outcome' AND fs.stat_value = 'Win'
INTERSECT
SELECT tds.game_id FROM
  (SELECT game_id, stat_value FROM football_stats fs WHERE fs.team_fk = [Carolina] AND fs.stat_type = 'Touchdown') tds,
  (SELECT game_id, stat_value FROM football_stats fs WHERE fs.team_fk = [Carolina] AND fs.stat_type = 'Interceptions') ints
WHERE
  tds.game_id = ints.game_id
  AND tds.stat_value > ints.stat_value

And all that did was give you a list of game_ids that satisfy the query. If you want the rest of the values, like points for and against, that takes whole new passes through the data.
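
Since the question asks for MySQL: INTERSECT only arrived in MySQL 8.0.31, so on older servers the EAV lookup would need something like the self-join below. This is a sketch under assumptions: football_stats is taken to have the columns game_id, team_fk, stat_type, and stat_value; @carolina_id is a hypothetical stand-in for [Carolina]; and the CAST assumes stat_value is stored as text.

```sql
SET @carolina_id = 1;  -- hypothetical team id standing in for [Carolina]

-- Three passes over the same table: outcome, touchdowns, interceptions.
SELECT o.game_id
FROM football_stats AS o
JOIN football_stats AS tds
  ON  tds.game_id   = o.game_id
  AND tds.team_fk   = o.team_fk
  AND tds.stat_type = 'Touchdown'
JOIN football_stats AS ints
  ON  ints.game_id   = o.game_id
  AND ints.team_fk   = o.team_fk
  AND ints.stat_type = 'Interceptions'
WHERE o.team_fk    = @carolina_id
  AND o.stat_type  = 'Outcome'
  AND o.stat_value = 'Win'
  -- Compare as numbers, not as strings.
  AND CAST(tds.stat_value AS SIGNED) > CAST(ints.stat_value AS SIGNED);
```

It still returns only game_ids, and every additional stat in the condition or the output costs another join against the same table.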

Stephanie Page
I just got done reading http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/ after doing some research here on EAVs, like you suggested. It was frightening to realize that I was following that same dead-end road. I'm thinking you must have read it too, based on your other comments. Sorry, other answerers, but I won't be building an EAV. They sound really good on paper, but using them would be a nightmare. Read that article if you haven't, to see what I mean. You, ma'am, answered best. Thanks for your invaluable comments.
reverebeer
Thank god. I wish all of the Randys (read the link above) here on SO would just stop handing out this advice. It's dangerous and it's plain wrong.
Stephanie Page