For learning purposes I've written a simple TCP proxy in Erlang. It works, but when I use ab (Apache Bench) to fire many concurrent requests at it I see an odd performance fall-off. It's not the fall-off per se that puzzles me but its scale. The backend is nginx as a web server; my proxy sits in between ab and nginx.
This is the code of my proxy:

-module(proxy).
-export([start/3]).

%% Listen on InPort and relay every accepted connection to OutHost:OutPort.
start(InPort, OutHost, OutPort) ->
    {ok, Listen} = gen_tcp:listen(InPort, [binary, {packet, 0}, {active, once}]),
    spawn(fun() -> connect(Listen, OutHost, OutPort) end).

%% Accept one client, spawn the next acceptor, then connect to the backend.
connect(Listen, OutHost, OutPort) ->
    {ok, Client} = gen_tcp:accept(Listen),
    spawn(fun() -> connect(Listen, OutHost, OutPort) end),
    {ok, Server} = gen_tcp:connect(OutHost, OutPort, [binary, {packet, 0}, {active, once}]),
    loop(Client, Server).

%% Shuttle data in both directions until either side closes.
loop(Client, Server) ->
    receive
        {tcp, Client, Data} ->
            gen_tcp:send(Server, Data),
            inet:setopts(Client, [{active, once}]),
            loop(Client, Server);
        {tcp, Server, Data} ->
            gen_tcp:send(Client, Data),
            inet:setopts(Server, [{active, once}]),
            loop(Client, Server);
        {tcp_closed, _} ->
            ok
    end.
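For completeness, this is how the proxy can be started from the Erlang shell. The backend port 8080 is illustrative; the actual port nginx listens on isn't shown above.

```erlang
%% In the Erlang shell. The proxy listens on 80 (where ab connects);
%% "localhost"/8080 is an assumed address for the nginx backend.
c(proxy).
proxy:start(80, "localhost", 8080).
```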
Firing 64 sequential requests at my proxy, I get a very good result:
ab -n 64 127.0.0.1:80/
Concurrency Level: 1
Time taken for tests: 0.097 seconds
Complete requests: 64
Failed requests: 0
Write errors: 0
Total transferred: 23168 bytes
HTML transferred: 9664 bytes
Requests per second: 659.79 [#/sec] (mean)
Time per request: 1.516 [ms] (mean)
Time per request: 1.516 [ms] (mean, across all concurrent requests)
Transfer rate: 233.25 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       1
Processing:     1    1   0.5      1       2
Waiting:        0    1   0.4      1       2
Total:          1    1   0.5      1       2
Percentage of the requests served within a certain time (ms)
50% 1
66% 2
75% 2
80% 2
90% 2
95% 2
98% 2
99% 2
100% 2 (longest request)
It's just a little slower than using Apache Bench directly against nginx.
But firing 64 concurrent requests at the proxy, performance drops dramatically:
ab -n 64 -c 64 127.0.0.1:80/
Concurrency Level: 64
Time taken for tests: 2.011 seconds
Complete requests: 64
Failed requests: 0
Write errors: 0
Total transferred: 23168 bytes
HTML transferred: 9664 bytes
Requests per second: 31.82 [#/sec] (mean)
Time per request: 2011.000 [ms] (mean)
Time per request: 31.422 [ms] (mean, across all concurrent requests)
Transfer rate: 11.25 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   31 121.7      0     501
Processing:     3 1135 714.4   1001    2006
Waiting:        3 1134 714.3   1000    2005
Total:          3 1167 707.8   1001    2006
Percentage of the requests served within a certain time (ms)
50% 1001
66% 1502
75% 2003
80% 2004
90% 2005
95% 2005
98% 2005
99% 2006
100% 2006 (longest request)
What/where is the problem? I expected lower performance, but why this much? Look at the requests per second!
It doesn't seem to matter much whether I give erl a lot of async threads using +A. I even tried SMP, but the results are almost the same.
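To be clear, this is the kind of invocation I mean; the pool size 32 is just an example value.

```shell
# Start the VM with a larger async I/O thread pool (+A) and SMP enabled.
# 32 is an illustrative pool size, not a recommendation.
erl +A 32 -smp enable
```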
My setup: Windows 7 64-bit, Intel quad-core, 8 GB RAM. I get similar results on Ubuntu using 128 concurrent requests.
EDIT: New insight: the total number of requests doesn't matter, only the number of concurrent requests does.