(Commonly called the C10K problem)

Is there a more contemporary review of solutions to the c10k problem (Last updated: 2 Sept 2006), specifically focused on Linux (epoll, signalfd, eventfd, timerfd..) and libraries like libev or libevent?

Something that discusses all the solved and still unsolved issues on a modern Linux server?

+5  A: 

Coincidentally, just a few days ago, Programming Reddit or maybe Hacker News mentioned this piece:

Thousands of Threads and Blocking IO

In the early days of Java, my C programming friends laughed at me for doing socket IO with blocking threads; at the time, there was no alternative. These days, with plentiful memory and processors it appears to be a viable strategy.

The article is from 2008, so it moves your horizon forward by a couple of years.

Carl Smotricz
I'm sure your hardware vendor is happy.
ninjalj
I'm more concerned with making damjan.mk happy. But please don't misconstrue my comment: this approach runs fine on a run-of-the-mill store-bought PC, which is hard to find these days with less than a dual-core CPU and 2 GB of RAM.
Carl Smotricz
+1  A: 

The libev project has published some benchmarks comparing itself against libevent...

rogerdpack
+3  A: 

The C10K problem generally assumes you're trying to optimize a single server, but as your referenced article points out, "hardware is no longer the bottleneck". Therefore, the first step is to make sure it isn't easier and cheaper to just throw more hardware into the mix.

If we've got a $500 box serving X clients per second, it's a lot more efficient to just buy another $500 box to double our throughput than to let an employee gobble up who knows how many hours and dollars trying to figure out how to squeeze more out of the original box. Of course, that's assuming our app is multi-server friendly, that we know how to load balance, etc, etc...

joe snyder
What if someone wants to write a high-performance library to save your money, and possibly that of thousands of others?
jweyrich
+4  A: 

To answer the OP's question: you could say that today the equivalent document is not about optimizing a single server for load, but about optimizing your entire online service for load. From that perspective, the number of combinations is so large that what you are looking for is not a document but a live website that collects such architectures and frameworks. Such a website exists, and it's called www.highscalability.com.

Side Note 1:

I'd argue against the belief that throwing more hardware at it is a long term solution:

  • Perhaps the cost of a performance engineer looks high compared to the cost of a single server. But what happens when you scale out? Let's say you have 100 servers. A 10 percent improvement in server capacity saves you the cost of 10 servers a month. That's more than what the performance engineer costs you.

  • Even if you have just two machines, you still need to handle performance spikes. The difference between a service that degrades gracefully under load and one that breaks down is that someone spent time optimizing for the load scenario.

Side note 2:

The subject of this post is slightly misleading. The C10K document does not try to solve the problem of 10k clients per second. (The number of clients per second is irrelevant unless you also define a workload along with sustained throughput under bounded latency. I think Dan Kegel was aware of this when he wrote that doc.) Look at it instead as a compendium of approaches to building concurrent servers, and of micro-benchmarks for the same. Perhaps what has changed between then and now is that at one point in time you could assume the service was a website serving static pages. Today the service might be a NoSQL datastore, a cache, a proxy, or one of hundreds of other pieces of network infrastructure software.

tholomew
Couldn't agree more with your arguments. The title was edited according to the comments above - IMO, badly. I liked your reference, but I'm afraid I won't be able to read much of it in time. +1.
jweyrich
I agree with your points, but you need to update your break even point for a performance engineer. A reasonable server (8 core, 7 GB RAM) costs US$0.68 per hour on Amazon's EC2. Cutting 10 servers only saves you $60K per year. That won't get you much of a performance engineer.
Ken Fox
+1  A: 

I'd recommend reading Zed Shaw's poll, epoll, science, and superpoll [1]. It covers why epoll isn't always the answer, why it's sometimes even better to go with poll, and how to get the best of both worlds.

[1] http://sheddingbikes.com/posts/1280829388.html

racetrack
@shadowfax: I can't confirm his results right now, but I believe they're plausible, since each call to poll requires much more data (all the events you're interested in) to be transferred between user space and kernel space. That cost should be compensated when more of those events are active and more new events are occurring (read: higher load). I should say I don't like the "superpoll" approach, as it adds lots of unnecessary syscalls by not being a kernel implementation. Anyway, the article gave me good insights. +1.
jweyrich
@jweyrich: Did you see this? http://sheddingbikes.com/posts/1280882826.html He has provided the full C code, as well as the R environment of his tests; it might help you experiment with this on your own.
racetrack
@shadowfax: yes, thanks. I'll be testing it when I get some free time. More here: http://sheddingbikes.com/posts/1281174543.html
jweyrich
A: 

Have a look at the RamCloud project at Stanford: http://fiz.stanford.edu:8081/display/ramcloud/Home

Their goal is 1,000,000 RPC operations/sec/server. They have numerous benchmarks and commentary on the bottlenecks that are present in a system which would prevent them from reaching their throughput goals.

Noah Watkins
They should change their goal to 1. Or are there just too many people trying to access it? :-(
jweyrich
I don't follow. What are you trying to say? I'm talking about 1M serviced requests, independent of the number of clients.
Noah Watkins
@Noah: I meant the link you posted is inaccessible for me. Is it responding to your requests? I'm getting timeouts here.
jweyrich
Oh, haha... Yeah, it's summer time at Stanford. No sysadmins :P
Noah Watkins