One database connection corresponds to one web request (assuming, of course, that your client reads the database on each request). With a connection pool these connections are pre-created, but they are still used one per request.

Now to some numbers: if you google for "Tomcat concurrent connections" or "Apache concurrent connections", you'll see that they support 16,000-20,000 concurrent connections without any problem.

On the other hand, the MySQL administrator best practices say that the maximum number of concurrent database connections is 4096.

On a quick search, I could not find any information about PostgreSQL.

Q1: Is there a software limit on concurrent connections in PostgreSQL, and is MySQL's limit indeed 4096?

Q2: Am I missing something, or will MySQL (or any database that imposes a maximum on concurrent connections) become the bottleneck, provided the hardware allows a large number of concurrent connections?

Update: Q3: How exactly does a higher connection count hurt performance?

+1  A: 

Q2: You can have far more users on your web site than connections to your database, because each user doesn't hold a connection open. Users only require a connection every so often, and then only for a short time. Your web app's connection pool will generally hold far fewer connections than the 4096 limit.
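As a rough back-of-the-envelope illustration of that point, Little's law says the average number of connections in use equals the request arrival rate multiplied by the time each request holds a connection. The workload figures below are made-up assumptions, not measurements from any real site:

```python
def average_connections_in_use(users, requests_per_user_per_sec, seconds_per_db_call):
    """Little's law: mean busy connections = arrival rate x time each one is held."""
    arrival_rate = users * requests_per_user_per_sec  # requests per second overall
    return arrival_rate * seconds_per_db_call


# Hypothetical workload: 16,000 users, each sending one request every 30 s,
# each request holding a database connection for 20 ms.
busy = average_connections_in_use(16_000, 1 / 30, 0.020)
print(round(busy, 1))  # about 10.7 connections busy on average
```

This is only an average; bursts need headroom above it, which is what the pool size is for.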

Think of a restaurant analogy. A restaurant may have 100 customers (users) but only 5 waiters (connections). It works because customers only require a waiter for a short time every so often.

The time when it goes wrong is when all 100 customers put their hand up and say 'check please', or when all 16,000 users hit the 'submit order' button at the same time.
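The analogy can be sketched as a toy simulation: 100 simulated users share 5 pooled "connections", and nobody ever holds more than the pool allows. The pool size, user count, and the sleep standing in for a query are all assumptions chosen for illustration; a real application would use its driver's or framework's pool instead.

```python
import queue
import threading
import time

POOL_SIZE = 5   # "waiters": hypothetical pool size
USERS = 100     # "customers": hypothetical number of concurrent users

# Integers stand in for real database connections.
pool = queue.Queue()
for conn_id in range(POOL_SIZE):
    pool.put(conn_id)

lock = threading.Lock()
in_use = 0
peak_in_use = 0
served = 0


def handle_request():
    """Borrow a connection, run a short fake query, give the connection back."""
    global in_use, peak_in_use, served
    conn = pool.get()  # blocks while all connections are busy
    try:
        with lock:
            in_use += 1
            peak_in_use = max(peak_in_use, in_use)
        time.sleep(0.001)  # pretend to run a short query
        with lock:
            in_use -= 1
            served += 1
    finally:
        pool.put(conn)  # hand the connection to the next waiting user


threads = [threading.Thread(target=handle_request) for _ in range(USERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(served, peak_in_use)  # all users served; peak never exceeds POOL_SIZE
```

When every user asks at once, the extra requests simply wait in `pool.get()`, which is the 'check please' situation: it still works, but latency grows.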

Qwerky
Which is why I said that Apache supports 20,000 _concurrent_ connections, and if this happens, MySQL becomes a bottleneck.
Bozho
A: 

From the link you provided to "MySQL administrator best practices":

"Note: connections take memory and your OS might not be able to handle a lot of connections. MySQL binaries for Linux/x86 allow you to have up to 4096 concurrent connections, but self compiled binaries often have less of a limit."

So 4096 seems like the current maximum. Bear in mind that the limit is per server and you can have multiple slave servers that can be used to serve queries.

http://dev.mysql.com/doc/refman/5.0/en/replication-solutions-scaleout.html

Jaydee
+3  A: 

Q1: you set a configuration parameter called max_connections. It can be set well above 4096, but you are definitely advised to keep it much lower than that for performance reasons.

Q2: you usually don't need that many connections, and things will be much faster if you limit the number of concurrent queries on your database. You can use something like pgbouncer in transaction mode to interleave many transactions over fewer connections.

Magnus Hagander
Are you sure of this (Q1)? The manual sounds quite clear. Also, what exactly is the reason for a performance hit if I increase max_connections?
Bozho
@Bozho: Magnus is talking about PostgreSQL. You could use more than 4096 connections with PostgreSQL, but you will see a huge performance hit. Each connection runs in its own process, so you need a lot of processor cores for that many processes. You'd be better off using a connection pool if you need so many concurrent connections.
Frank Heikens
connection pool where? On the client side, connection pooling is done by creating connections upfront.
Bozho
A: 

The Wikipedia Case Study

  • 30 000 HTTP requests/s during peak-time
  • 3 Gbit/s of data traffic
  • 3 data centers: Tampa, Amsterdam, Seoul
  • 350 servers, ranging between 1x P4 to 2x Xeon Quad-Core, 0.5 - 16 GB of memory
  • ...managed by ~ 6 people

This is a little off-topic for your questions, but I think you could find it useful. You don't always hit the DB for each request. A correct caching strategy is almost always the best performance improvement you can apply to your web app. A lot of static content can remain in the cache until it explicitly changes. This is how Wikipedia does it.
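A minimal sketch of "keep it in cache until it explicitly changes" might look like the following. `PageCache`, `articles`, and the page title are hypothetical names used only for illustration; this is not Wikipedia's actual implementation, and the dictionary merely stands in for a database.

```python
class PageCache:
    """Cache rendered pages and reload one only after it is invalidated."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # function that performs the real DB read
        self._pages = {}
        self.db_hits = 0           # counts how often the DB is actually queried

    def get(self, title):
        if title not in self._pages:
            self.db_hits += 1
            self._pages[title] = self._load(title)
        return self._pages[title]

    def invalidate(self, title):
        # Call this when the page is edited, so the next read reloads it.
        self._pages.pop(title, None)


articles = {"Main_Page": "v1"}  # stand-in for the database
cache = PageCache(lambda title: articles[title])

for _ in range(1000):           # 1000 requests, only the first reaches the "DB"
    cache.get("Main_Page")

articles["Main_Page"] = "v2"    # the page is edited...
cache.invalidate("Main_Page")   # ...so its cache entry is dropped explicitly
latest = cache.get("Main_Page")

print(cache.db_hits, latest)    # 2 v2
```

Here 1001 reads cost two database queries, so most requests never need a connection at all.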

SDReyes
Of course, I will use caching. But I was just wondering whether I'm correct in assuming that if 20,000 _new_ users (hence non-cacheable data) arrive at the same time, MySQL will be the problem.
Bozho
@Bozho: OK, I got it. That makes my answer even more off-topic :P. I hope it helps you anyway :)
SDReyes
Yes, it is indeed useful information
Bozho