ansaurus

Question

Best Schema to represent NCAA Basketball Bracket

Answer 1

+3 A:

For a RDBMS, I think the simplest approach that's still flexible enough to accommodate the majority of situations is to do the following:

Teams has [team-id (PK)], [name], [region-id (FK to Regions)], [initial-seed]. You will have one entry for each team. (The regions table is a trivial code table with only four entries, one for each NCAA region, and is not listed here.)
Participants has [game-id (FK to Games)], [team-id (FK to Teams)], [score (nullable)], [outcome]. [score] is nullable to reflect that a team might forfeit. You will have typically have two Participants per Game.
Games has [game-id (PK)], [date], [location]. To find out which teams played in a game, look up the appropriate game-id in the Participants table. (Remember, there might be more than two teams if someone dropped out or was disqualified.)

To set up the initial bracket, match the appropriate seeds to each other. As games are played, note which team has outcome = Winner for a particular game; this team is matched up against the winner of another game. Fill in the bracket until there are no more winning teams left.

John Feminella 2009-03-16 12:40:53

I like this, but how would you know, looking at the data, how to match teams up? For instance, you need to know if they are midwest, west, east or south, and would you just rely on the programming to know that a 1 seed plays a 16 seed? how do you know though for sure who the winner plays?

Ryan Guill 2009-03-16 12:45:07

Yes, I'd say that a rule like "the 1 seed plays 16 seed" is something that should be codified into business logic, not stored in the database. I did forget about the different regions, though; I'll amend that.

John Feminella 2009-03-16 12:47:48

More specifically, the rule is that if the highest-numbered seed is N, where N is a power of 2, then the seed numbered k plays the seed numbered N - k + 1 in the initial matchup. (For example, seed k = **4** plays seed 16 - 4 + 1 = **13** in a 16-seed matchup.)

John Feminella 2009-03-16 12:52:47

Right, but that rule only lets you figure out the starting bracket. How do you say that the winner of the 16 vs 1 plays the winner of the 8 vs 9?

Ryan Guill 2009-03-16 12:53:47

That's just a re-application of the rule with N being half as large as it was before. Do you see why?

John Feminella 2009-03-16 12:56:38

If not, just assign a number i = 1 ... N/2 to each matchup. In the next round, the winner of matchup i plays the winner of i + 1 if is i is odd, or the winner of i - 1 if i is even.

John Feminella 2009-03-16 13:02:29

At some point you're going to have to include code in your solution. You seem to be asking how to put matching logic in the db. You don't. Put the logic in code, store enough data in the db to drive your code. Problem solved, where's my bounty.

Peter Seale 2009-03-23 18:09:47

@Peter: Yep; that's why I said "Yes, I'd say that a rule like "the 1 seed plays 16 seed" is something that should be codified into business logic, not stored in the database."

John Feminella 2009-03-23 21:44:38

Anyone factoring the play-in game into this? Or is that out of scope? :)

Lurker Indeed 2009-03-24 22:20:08

Answer 2

+3 A:

Since you didn't specify RDBMS, I'm gonna be a little different and go with a CouchDB approach since I was reading about that this weekend. Here's the document structure I've come up with a represent a game.

{
  "round" : 1, //The final would be round 5, and I guess Alabama St. vs. Morehead would be 0
  "location" : "Dayton, OH",
  "division": "South",
  "teams" : ["UNC", "Radford"]  //A feature of Couch is that fields like teams don't need a fixed nuber of columns.
  "winner" : "UNC"  //Showing my bias
}

A more interesting or complete application might have data for teams, rankings, and the like stored somewhere as well. John's approach covers that angle well, it seems. I welcome any comments from people who know better on my Couch skills.

Jeremy DeGroot 2009-03-16 12:59:32

Answer 3

+2 A:

The natural inclination is to look at a bracket in the order the games are played. You read the traditional diagram from the outside in. But let's think of it the other way around. Each game is played between two teams. One wins, the other loses.

Now, there's a bit more to it than just this. The winners of a particular pair of games face off against each other in another game. So there's also a relationship between the games themselves, irrespective of who's playing in those games. That is, the teams that face off in each game (except in the first round) are the winners of two earlier games.

So you might notice that each game has two "child games" that precede it and determine who faces off in that game. This sounds exactly like a binary tree: each root node has at most two child nodes. If you know who wins each game, you can easily determine the teams in the "parent" games.

So, to design a database to model this, you really only need two entities: Team and Game. Each Game has two foreign keys that relate to other Games. The names don't matter, but we would model them as separate keys to enforce the requirement that each game have no more than two preceding games. Let's call them leftGame and rightGame, to keep with the binary tree nomenclature. Similarly, we should have a key called parentGame that tracks the reverse relationship.

Also, as I noted earlier, you can easily determine the teams that face off in each game by looking at who won the two preceding games. So you really only need to track the winner of each game. So, give the Game entity a winner foreign key to the Team table.

Now, there's the small matter of seeding the bracket. That is, modeling the match-ups for the first round games. You could model this by having a Game for each team in the overall competition where that team is the winner and has no preceding games.

So, the overall schema would be:

Game:
    winner: Team
    leftGame: Game
    rightGame: Game
    parentGame: Game
    other attributes as you see fit

Team:
    name
    other attributes as you see fit

Of course, you would add all the other information you'd want to the entities: location, scores, outcome (in case the game was won by forfeit or some other out of the ordinary condition).

Alex 2009-03-24 20:11:43

Answer 4

+2 A:

I created a small system with the following tables:

Games: GameId, TournId, RoundId, Sequence, Date, VisitorId, VisitorScore, HomeId, HomeScore, WinnerId, WinnerGameId, WinnerHome (bit)

Predictions: PredId, UserId, GameId, PredVisitorId, PredHomeId, PredWinnerId

Rounds: RoundId, TournId, RoundNum, Heading1, Heading2

Teams: TeamId, TournId, TeamName, Seed, MoreInfo, Url

Tournaments: TournId, TournDesc

Users: TournId, UserName

WinnerGameId connects the winner of a game to their next game. WinnerHome tells whether the winner is the home or visitor of that next game. Other than that, I think it's pretty self explanatory.

FrobozzJ 2009-03-24 20:29:01

This is the best one yet.

Lurker Indeed 2009-03-24 22:27:48

Answer 5

+1 A:

4 tables:

Team(Team, Region, Seed)

User(UserId, Email, blablabla)

Bracket(BracketId, UserId, Points)

Pick(BracketId, GameId, Team, Points)

Each bracket a person submits will have 63 rows in the Pick table.
After each game is played you would update the pick table to score individual picks. Points field in this table will be null for game not yet played, 0 for an incorrect pick or positive number for correct pick. GameId is just a key identifying where in that users bracket this pick goes (ex: East_Round2_Game2, FinalFour_Game1).

The points column in the bracket table can be updated after each update of the pick table so it contains the sum of points for that bracket. The most looked at thing will be the standings, don't want to re-sum those every time someone wants to view the leader board.

You don't need to keep a table with all the games that actually get played or their results, just update the pick table after each game. You can even do the bracket highlighting of correct/incorrect picks by just looking at the Points column in the pick table.

Keith 2009-03-24 22:05:43

Answer 6

+1 A:

Blake Taylor 2009-03-25 07:51:37

Your name is Blake Taylor and you're answering a question about the NCAA basketball tournament? That's quite a coincidence, unless it's not. :)

Michael Myers 2009-03-25 21:22:21

Answer 7

A:

I use the same schema for all of my databases.

t
--------
1 guid PK
2 guid FK
3 bit

Then in my code:

select [2],[3] from t where [1] = @1

@1 is the id of the data I am fetching. Then if [2] is not null, I select again, setting @1 to [2].

This makes it really easy to model the situation you posted.

Shawn Simon 2009-03-26 02:31:48

ansaurus

tags:

views:

answers:

Best Schema to represent NCAA Basketball Bracket

related questions