views:

52

answers:

5

I have one table GAMES and another PLAYERS. Currently each "game" has a column for players_in_game but I have nothing reciprocating in the PLAYERS table. Since this column is an array (Comma separated list of the player's ID #s) I'm thinking that it would probably be better to have each player's record also contain a list of the games they are a member of. On the other hand, duplicating the information in two separate tables might actually require more DB calls.

For perspective, there aren't likely to be more then a dozen players in a game (generally 4-6 is the norm) but there could potentially be a large number of games.

Is there a good way to figure out which would be more efficient? Thanks.

+5  A: 

Normalization is generally a good thing. Comma delimited lists in tables is a sign that a table is in desperate need of a foreign key. If you're worried about extra queries, check out JOINING

dbo.games
+----+----------+
| id |   name   |
+----+----------+
| 1  |    war   |
| 2  | invaders |
+----+----------+

dbo.players
+----+----------+---------+
| id |   name   | game_id |
+----+----------+---------+
| 1  |   john   |    1    |
| 2  |   mike   |    1    |
+----+----------+---------+

SELECT games.name, count(players.id) as total_players FROM games INNER JOIN players ON games.id = players.game_id GROUP BY games.name;

Result:
+-----------+--------------+
| name      |total_players |
+-----------+--------------+
| war       |       2      |
| invaders  |       0      |
+-----------+--------------+

Sidenote: Go Hokies :)

Mike B
Just one game per player? I thought it'd be m:n - a GAME can have many PLAYERs, and a PLAYER can participate in many GAMEs.
duffymo
So if a player can be in multiple games (EG John is in War and Invaders) would it be better to create a new table `PlayersInGames` as suggested in some of the other answers?
aslum
@aslum absolutely. This is commonly known as a join table or many-to-many relationship. The same foreign key principles apply and an extra `JOIN` would be required. Performance penalties would be negligible unless you have millions upon millions of records. On that note, you should always index foreign keys.
Mike B
Thanks for the answers Mike!
aslum
+1  A: 

The right answer is a table called PlayersInGames that has a player id and a game id per row.

Mike Burton
+1  A: 

I would create a third table that links the players and games. Your comma-delimited list is effectively a third table, but parsing your list is almost certainly going to be less efficient than letting the database do it for you.

Gerald
A: 

Ask yourself what happens if you remove a row from the GAME table. Now you'll have to loop over all the PLAYER rows, parse the list, figure out which ones contain a reference to the removed GAME, and then update all the lists.

Bad design. Let SQL do what it was born for. The query will be fast enough if you index it properly. Micro-optimizations like this are the wrong approach.

duffymo
+2  A: 

Oh god, please don't use CSVs!! I know it's tempting when you're new to SQL, but it becomes unqueryable...

You need 3 tables: games, players, and players_in_games. games and players should each have a primary auto-incrementing key like id, and then players_in_games needs just two fields, player_id and game_id. This is called a "many to many" relationship. A player can play many games, and a game can have many players.

Mark
Never mind for the moment that the CSVs have been giving me headaches, what is so bad about them? (Should I ask that as a separate question?)
aslum
A SELECT against a comma delimited string value won't perform. And god forbid you ever have to do something like inject an id in the middle of a comma delimited using pure SQL -- you'll end up with an obscene set of substrings and other sql method calls. There are a couple of related examples in the comments here: http://thedailywtf.com/Articles/Pretty-Simple.aspx
Frank Farmer
Isn't the headaches enough of a reason? Fields should contain atomic facts about the data. How do you get a list of all the players in a game? Easy enough, you have the CSV list. Now how do you get a list of all games a player is in? Even if they can only be in a single game at a time, good luck finding it. What about moving a player from one game to another? CSV is a way of faking sets, and there is no need to do that in a set-oriented language.
Duncan