I'm building a website where players can play a turn based game for virtual credits (like a Poker site, but different). The setup I came up with:

  • One data server which contains all player accounts with associated data (a database + service). Database and API may be split into two servers, if that helps.
  • One or more webservers which serve the website, connecting to the data server when needed.
  • One lobby server where players can find each other and set up games (multiple lobby servers are possible, but less user-friendly)
  • Multiple game servers where the game is run (all rules and such are on the server, the client is just a remote control and viewer), with one load balancer.
  • A game client

The client will be made with Flash, the webserver will use PHP. The rest is all Java.

Communications

  • Player logs in on the site. Webserver sends username/password to data server, which creates a session key (like a cookie)
  • Player starts the client. Client connects to lobby server, passing the session key. Lobby server checks this key with the data server
  • Once a lobby is created and a game must start, the lobby server fetches a game server from the load balancer and sets up a game on this game server.
  • Lobby server tells the clients to connect to the game server and the game is played.
  • When the game is finished, the game server lets the lobby server know. The lobby server will check the score and update the credits in the data server.
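The session-key handoff above could be sketched like this on the data server side. This is a minimal, in-memory sketch; the class and method names (`SessionStore`, `createSession`, `validate`) are illustrative, not from the actual design, and a real deployment would add expiry and persistence:

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the data server's session-key handling.
public class SessionStore {
    private final Map<String, String> sessions = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Called by the webserver after it has verified username/password.
    public String createSession(String username) {
        byte[] raw = new byte[24];
        random.nextBytes(raw);
        String key = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        sessions.put(key, username);
        return key;
    }

    // Called by the lobby server when a client connects with its key.
    // Returns the username, or null if the key is unknown.
    public String validate(String sessionKey) {
        return sessions.get(sessionKey);
    }
}
```

The point of the opaque random key is that the lobby server never sees the password; it only asks the data server whether the key is valid.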

Protocols:

  • Java to Java: RMI
  • PHP or Flash to Java: Custom binary protocol via socket. This protocol supports closing the socket when idle while keeping the virtual connection alive and resumable.
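The "close the socket when idle but keep the virtual connection resumable" idea implies some per-connection bookkeeping: buffer outgoing messages until the peer acknowledges them, and replay the unacknowledged tail on reconnect. A minimal sketch of that bookkeeping, with illustrative names (`VirtualConnection`, `resumeFrom`) that are assumptions rather than the actual protocol:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of resumable-connection state for one virtual
// connection: messages get sequence numbers, and anything the peer
// has not acknowledged is kept for replay after a reconnect.
public class VirtualConnection {
    private final Deque<String> unacked = new ArrayDeque<>();
    private long nextSeq = 0;          // sequence number of the next send
    private long firstUnackedSeq = 0;  // sequence number of unacked.peekFirst()

    // Buffer a message and return its sequence number.
    public long send(String message) {
        unacked.addLast(message);
        return nextSeq++;
    }

    // Peer confirms everything up to and including seq; drop it.
    public void ack(long seq) {
        while (firstUnackedSeq <= seq && !unacked.isEmpty()) {
            unacked.removeFirst();
            firstUnackedSeq++;
        }
    }

    // On reconnect the peer reports the last sequence it saw;
    // return everything after that for replay.
    public List<String> resumeFrom(long lastSeenSeq) {
        List<String> replay = new ArrayList<>();
        long seq = firstUnackedSeq;
        for (String m : unacked) {
            if (seq > lastSeenSeq) replay.add(m);
            seq++;
        }
        return replay;
    }
}
```

Both sides would keep one of these per direction; the reconnect handshake then just exchanges the two `lastSeenSeq` values.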

If the client gets his wish, the site will need to support thousands of concurrent players. With this information, can you see any bottlenecks in my setup? I'm personally a little worried about there being only one data server, but I'm not sure how to split that up. Other remarks on scalability (or anything else) are also welcome.

+3  A: 

Your architecture has a lot of single services that are crucial for ANY part of the system to work for ANY user. I consider these single points of failure (SPOFs).

  • You might want to consider sharding (or horizontal partitioning) for your data server.
  • Consider multiple lobby servers. The Flash client can still disguise them as a single lobby if you want. Personally, I don't like playing games with people I cannot talk to in a language I understand, and I don't like joining a lobby, finding n-thousand games, and not knowing anyone. Make multiple lobbies a feature (with some thought put into it, you really can); there's no real use for a lobby with 10000 people. If you still want to go through with a single lobby, you could try partitioning it, based on the assumption that a player filters for specific parameters (opponent level, game type, etc.), splitting lobbies along one or even multiple criteria.
  • The load balancer probably doesn't need enough power to justify a dedicated physical server. Why not replicate it on all lobby servers? All it has to know is the availability of each game server. Assuming you have 10000 game servers (which I think is a whole fucking lot in this case) and a refresh rate of 1 second (far more than enough here), all you sync is 10000 integers per second, assuming availability can be represented as a number (which I suppose it can). If you figure out something better than connecting every game server to every lobby server, this doesn't even require too many connections on a single machine.
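The replicated-balancer idea above amounts to each lobby server holding its own copy of a small availability table and picking a game server locally. A sketch under that assumption (class and method names are illustrative; how the table is refreshed, whether by push from game servers or a periodic poll, is left open):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: every lobby server runs one of these instead
// of calling out to a dedicated load-balancer machine.
public class LocalBalancer {
    // game server id -> number of free game slots
    private final Map<String, Integer> freeSlots = new ConcurrentHashMap<>();

    // Called whenever an availability update arrives (e.g. once per second).
    public void updateAvailability(String gameServer, int slots) {
        freeSlots.put(gameServer, slots);
    }

    // Pick the game server with the most free slots, or null if none has any.
    public String pickGameServer() {
        return freeSlots.entrySet().stream()
                .filter(e -> e.getValue() > 0)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```

Because the table is only advisory (a slightly stale count just means a slightly worse placement), lobby servers don't need to agree on it, which is what makes the replication cheap.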

In this type of application, I think horizontal partitioning is a good idea, because it can be done easily and it adds reliability to the system. Make your SPOFs partitioned rather than redundant: this is easier and possibly cheaper. If part of a partitioned SPOF goes down (say, 1 of your 20 independent and physically distributed data servers), that's bad, because 5% of your players are locked out, but it will probably come back up soon. If your SPOF is redundant instead, the chance that anything fails is lower, but when it does fail, EVERYBODY is locked out. That's a problem, because everybody will try to get back online at the same time, and once your SPOF is back up, it will be hit by a request volume orders of magnitude higher than it usually has to handle. And you can still employ horizontal partitioning and redundancy at the same time, as proposed for the balancing service.
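Partitioning the data server mostly comes down to a routing rule: given a player, which of the N data servers owns their row? A minimal sketch, assuming players are keyed by a numeric account id (the server names and the plain-modulo scheme are illustrative):

```java
// Hypothetical sketch of routing a player to one of N partitioned
// data servers by hashing the account id.
public class ShardRouter {
    private final String[] dataServers;

    public ShardRouter(String[] dataServers) {
        this.dataServers = dataServers;
    }

    // Every service (web, lobby, game) applies the same rule, so they
    // all agree on which data server owns a given account.
    public String shardFor(long accountId) {
        int idx = (int) Math.floorMod(accountId, (long) dataServers.length);
        return dataServers[idx];
    }
}
```

Note the caveat: plain modulo means adding a 21st server reshuffles almost every account. If you expect to grow the pool, consistent hashing is the usual fix, but the idea of a shared deterministic routing rule stays the same.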

back2dos
+1  A: 

Having worked on a couple of facebook games, I would say this:

Think about scalability for thousands of players, but you'll have to reach tens of thousands of players before the effort of scaling for them pays off.

That is to say, plan ahead, but worry about getting 1 player before you plan a system for thousands of concurrent players.

I suspect that the setup you describe will perform pretty well for your initial user base. While you are building, avoid things like requiring the login server to talk to the lobby server. Make each server stand on its own; the big thing that will kill you is inter-dependency between services.

But the most important thing, is to get it done in the most expedient way you can. If you get enough users to tax your system, that will be a really good thing. You can hire a DBA to help you figure out how to scale out when you have that many users.

Aaron H.
I'm certainly going the agile way here, and thousands of concurrent users are a thing of the future. Initially all these services will run on one box. Still, I'd like to know what I can expect so I don't program myself into a corner.
Bart van Heukelom