views:

638

answers:

4

I want to set up a statistics-monitoring platform to watch a specific service, but I'm not quite sure how to go about it. Processing the intercepted data isn't my concern, just how to intercept it. One idea was to set up a proxy between the client application and the service so that all TCP traffic goes first to my proxy; the proxy would then delegate each intercepted message to a waiting thread/fork to pass the message on and receive the result. The other was to try to sniff the traffic between client & service.

My primary goal is to avoid any serious loss in transmission speed between client & service while still capturing 100% of the communications between them.

Environment: Ubuntu 8.04

Language: C/C++

In the background I was thinking of using a SQLite DB running completely in memory, or a 20-25MB memcached daemon slaved to my process.

Update: Specifically, I am trying to track the usage of keys for a memcached daemon, storing the number of set/get successes/failures per key. The idea is that most keys contain some sort of separating character ([`|_-#]) that creates a sort of namespace. The plan is to step in between the daemon and the client, split the keys apart on a configured separator, and record statistics on them.

+1  A: 

Exactly what are you trying to track? If you want a simple count of packets or bytes, or basic header information, then iptables will record that for you:

iptables -I INPUT -p tcp -d $HOST_IP --dport $HOST_PORT -j LOG $LOG_OPTIONS

If you need more detailed information, look into the iptables ULOG target, which sends each packet to userspace for analysis.
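For example, a ULOG rule for the default memcached port might look like this (the netlink group number is an arbitrary choice; a userspace listener such as ulogd must be subscribed to the same group):

```shell
# Copy each packet destined for the memcached port to netlink group 1,
# where a userspace daemon can pick it up for analysis.
iptables -A INPUT -p tcp --dport 11211 -j ULOG --ulog-nlgroup 1
```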

See http://www.netfilter.org for very thorough docs.

Adam Liss
A: 

You didn't mention one approach: you could modify memcached or your client to record the statistics you need. This is probably the easiest and cleanest approach.

Between the proxy and the libpcap approach, there are a couple of tradeoffs:

- If you do the packet capture approach, you have to reassemble the TCP
  streams into something usable yourself. OTOH, if your monitor program
  gets bogged down, it'll just lose some packets; it won't break the cache.
  The same goes if it crashes. You also don't have to reconfigure anything;
  packet capture is transparent.

- If you do the proxy approach, the kernel handles all the TCP work for
  you. You'll never lose requests. But if your monitor bogs down, it'll bog
  down the app. And if your monitor crashes, it'll break caching. You
  probably will have to reconfigure your app and/or memcached servers so
  that the connections go through the proxy.

In short, the proxy will probably be easier to code, but deploying it may be a royal pain, and it had better be perfect or it'll take down your caching. Changing the app or memcached seems like the sanest approach to me.

BTW: Have you looked at memcached's built-in statistics reporting? I don't think it's granular enough for what you want, but if you haven't seen it, take a look before doing real work :-D
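For reference, the built-in counters are a one-liner away (assuming memcached is listening on its default local port):

```shell
# Query memcached's built-in counters; cmd_get/cmd_set give totals,
# get_hits/get_misses split the gets. These are server-wide, not per-key.
echo stats | nc localhost 11211 | egrep 'cmd_get|cmd_set|get_hits|get_misses'
```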

derobert
The problem I'm trying to solve is knowing what memcached and the application are actually doing: are the keys expiring too soon, and/or are there more sets than gets?
David
Sets vs. gets is something the built-in statistics can answer.
derobert
A: 

iptables provides libipq, a userspace packet queuing library. From the manpage:

Netfilter provides a mechanism for passing packets out of the stack for queueing to userspace, then receiving these packets back into the kernel with a verdict specifying what to do with the packets (such as ACCEPT or DROP). These packets may also be modified in userspace prior to reinjection back into the kernel.

By setting up tailored iptables rules that forward packets to libipq, it's possible to inspect the packets for statistics analysis in addition to issuing a verdict on them.

Another viable option is to sniff packets manually by means of libpcap or a PF_PACKET socket with socket-filter support.

Nicola Bonelli
+1  A: 

If you want to go the sniffer way, it might be easier to use tcpflow instead of tcpdump or libpcap. tcpflow outputs only the TCP payload, so you don't need to reassemble the data stream yourself. If you prefer using a library instead of gluing a bunch of programs together, you might be interested in libnids.
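A typical invocation for the memcached case (interface name is illustrative; the filter expression uses ordinary BPF syntax, and capturing requires root):

```shell
# Write each direction of every TCP connection on the memcached port to
# its own file, payload only -- no headers, no reassembly work needed.
tcpflow -i eth0 port 11211
```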

libnids and tcpflow are also available on other Unix flavours and do not restrict you to Linux (unlike iptables).

http://www.circlemud.org/~jelson/software/tcpflow/ http://libnids.sourceforge.net/

Krunch