views:

201

answers:

5

I'm new to databases, but I think I finally have a situation where flat files won't work.

I'm writing a program to analyze the outcomes of multiplayer games, where each game could have any number of players grouped into any number of teams. I want to allow players can win, tie, or leave partway through the game (and win/lose based on team performance).

I also might want to store historical player ratings (unless it's faster to just recompute that from their game history), so I don't know if that means storing each player's rating alongside each game played, or having a separate table for each player, or what.

+3  A: 

You didn't mention any need for locking mechanisms where multiple users may be competing to write the same data to the same resource (a database record or file in the case of flat files) simultaneously. What I would suggest is get a good book on database design and try to understand normalization rules in depth. Distributing data across separate tables have a performance impact, but they also have an effect on the ease-of-use of query construction. This is a very involving topic, and there's no simple answer to it. That's why companies hire database administrators to keep their data structures optimized.

You might want to look at SQLite, if you need a lightweight database engine.

stillstanding
+1 for the theory and the database of first resort
msw
I'm definitely interested in learning the theory, too. The reason I ask is, wouldn't an SQL database limit the number of players I can have in a game because it fixes the number of columns I have? And if it does, would relational database theory apply to other database types? Or would I want to learn relational theory as a prereq for other kinds?
drhorrible
relational is the dominant DBM technology and has been for the last 40+ years. The field is well trodden, both academically and practically, and I can't imagine skipping it as a prerequisite. "Structured Storage" is the new kid on the block and people are still trying to figure out how to make best use of it (cf. http://en.wikipedia.org/wiki/NoSQL) I would not suggest it as a first step for a DBM neophyte.
msw
If you learn the RDBMs concept of normalization, you will find that limited columns are not so limiting. Prior to Google and systems of that era, RDBMs were the largest data on the planet (e.g. global airline reservations, banking, etc. etc.) and are not likely to be displaced soon.
msw
@dhorrible - the number of players would not be limteid by fields as you would not give every player a field. Learn db theory and "normalization".
TomTom
OK, I'm convinced I have a lot to learn. Unless someone has major objections, I think I'll buy "Database in Depth" (http://oreilly.com/catalog/9780596100124) and see where that leads me.
drhorrible
@dr TomTom is exactly right, you should NOT put one player per column. You would always store players in rows with a column indicating PlayerID.
Emtucifor
+7  A: 

I don't see any criteria that impacts database choice, but I'll list the free ones:

I don't recommend an embedded database like SQLite, because embedded databases make trade-offs in features to accommodate space & size concerns. I don't agree with their belief that data typing should be relaxed - it's lead to numerous questions on SO about about to deal with date/time filtration, among others...

You'll want to learn about normalization, getting data to Third Normal Form (3NF) because it enforces referential integrity, which also minimizes data redundancy. For example, your player stats would not be stored in the database - they'd be calculated at the time of the request based on the data onhand.

OMG Ponies
I'm not so sure that player stats shouldn't be stored in the database. If there's some way to "close out" a group of stats so it can't be modified any more, then precalculating the results can be useful. Think of your bank account. If you've had it open for 10 years, they're hardly going to start from 0 and sum up all the transactions you've had in all that time to get your current balance. Instead, they can store your current balance as the "starting point" and go backwards if necessary. Idealistically it would be nice to never store calculations like this, but realistically it's different.
Emtucifor
MySQL is sometimes free, sometimes not : it's dual licence :(
Hugues Van Landeghem
Firebird embedded have exactly same features than Firebird Server so where are trade-offs ?
Hugues Van Landeghem
+3  A: 

Some good options were mentioned already, but I really think that on Java platform, H2 is a very good choice. It is perfect for testing (in-memory test database), but works very well also for embedded use cases and as stand-alone "real database". Plus it is easy to export as dump file, import from that, to move around. And works efficiently too. It is developed by a very good Java DB guy, and is not his first take, and you can see this from maturity of the project. On top of this it is still being actively developed as well as supported.

StaxMan
Thanks! I'll definitely look into this one, once I know how to handle databases in general (because apparently, I really don't!)
drhorrible
+1 I've had excellent experiences with H2. The author is also an SO participant.
trashgod
A: 

Try also OrientDB. It's free (Apache 2 license), run everywhere, supports SQL and it's really fast. Can insert 1,000,000 of records in 6 seconds on common hw.

Lvca
Please post full disclosure about your affiliation, as mentioned in the [FAQ](http://stackoverflow.com/faq) (mentioning it in your profile would also be a good idea).
Pascal Thivent
Done in my profile
Lvca
A: 

A word on why nobody even mentions any of the "NoSQL" databases while you have used it as a tag:

Non-SQL databases are getting a lot of attention (or even outright hype) recently, because of some high-profile usecases, because they're new (and therefore interesting), and because their promise of incredible scalability (which is "sexy" to programmers). However, only a very few very big players actually need that kind of scalability - and you certainly don't.

Another factor is that SQL databases require you to define your DB schema (the structure of tables and columns) beforehand, and changing it is somewhat problematic (especially if you already have a very large database). Non-SQL databases are more flexible in that regard, but you pay for it with more complex code (e.g. after you introduce a new field, your code needs to be able to deal with elements where it's not yet present). It doesn't sound like you need this kind of flexibility either.

Michael Borgwardt