Dear all, I am trying my hand at understanding the PCAP libraries. I am able to apply a filter and get the TCP payload on port 80. But what next? How can I read the HTTP data? Suppose I want to know the value of the "User-Agent" field in the HTTP header: how should I proceed? I have searched the website (and googled a lot too), and found a related thread here: http://stackoverflow.com/questions/2073183/writing-a-http-sniffer. But it doesn't get me anywhere...

Thanks!!

+1  A: 

First, you should know that PCAP gives you individual packets and does not reconstruct the TCP stream, so you won't be able to read an HTTP exchange that spans several packets without first reassembling the data yourself.

Assuming all the data is available in one packet, try looking at my answer to a similar question. All you need to do differently is parse the HTTP header and get the user agent.
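For the single-packet case, a minimal sketch in C of pulling one header value out of the TCP payload could look like the following. The function name `http_header_value` and its parameters are hypothetical (they are not part of libpcap), and it assumes the whole header block arrived in one payload. Note that `strncasecmp` is POSIX rather than standard C.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Hypothetical helper: search an HTTP request held in ONE TCP payload for
 * the header `name` and copy its value into `out` (NUL-terminated).
 * Returns 0 on success, -1 if the header is absent or `out` is too small.
 * The payload is not NUL-terminated, so every access is bounded by `len`.
 */
int http_header_value(const unsigned char *payload, size_t len,
                      const char *name, char *out, size_t out_size)
{
    size_t name_len = strlen(name);
    size_t pos = 0;

    while (pos < len) {
        /* Find the end of the current line (CR, LF, or end of payload). */
        size_t eol = pos;
        while (eol < len && payload[eol] != '\r' && payload[eol] != '\n')
            eol++;
        size_t line_len = eol - pos;

        if (line_len == 0)
            break;  /* blank line: end of the header block */

        /* Header names are case-insensitive and followed by ':'. */
        if (line_len > name_len && payload[pos + name_len] == ':' &&
            strncasecmp((const char *)payload + pos, name, name_len) == 0) {
            size_t v = pos + name_len + 1;
            while (v < eol && (payload[v] == ' ' || payload[v] == '\t'))
                v++;  /* skip optional whitespace after the colon */
            size_t vlen = eol - v;
            if (vlen >= out_size)
                return -1;
            memcpy(out, payload + v, vlen);
            out[vlen] = '\0';
            return 0;
        }

        /* Advance past the line terminator ("\r\n" or a bare "\n"). */
        pos = eol;
        if (pos < len && payload[pos] == '\r')
            pos++;
        if (pos < len && payload[pos] == '\n')
            pos++;
    }
    return -1;
}
```

You would call it with the payload pointer and length you already computed in your packet handler, for example `http_header_value(payload, payload_len, "User-Agent", buf, sizeof buf)`. If the header is split across two packets it simply reports "not found", which is exactly the case that needs TCP reconstruction.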

If you don't limit yourself to C, and if you can use Windows, you can write a .NET application and use Pcap.Net to parse Ethernet, IPv4 and TCP perfectly.

brickner
Thank you so much, Brickner!! This is exactly what I was looking for: processing the TCP payload (but I prefer C for learning). Can you please explain this line of your code: /* start of url - skip "GET " */ url = tcpPayload + 4; Why 4? And what should I explore to find more such values (I read the RFC for HTTP, but couldn't understand how to use it)? I actually want to read all the HTTP data in that packet. Is that only possible if I reconstruct the full stream? If yes, can you give me some idea of how this reconstruction works? Thanks a lot!!
Ishi
4 is for the "GET ": 3 ASCII characters plus a space. The RFC lists all of the possible request method names (like GET, POST...). If you only want the HTTP data in the single packet, then no reconstruction is needed. If you want the entire HTTP request, you might need to reconstruct the TCP stream (if the request spans more than 1 packet). TCP reconstruction is another (and pretty complicated) issue and you should Google for it or open a different question. By the way, are you using LibPcap or the Windows wrapper WinPcap, or is your question more generic?
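Rather than hard-coding 4, one option is to look for the first space, so that the same code also works for POST (offset 5) and other methods. This is only a sketch: `http_request_target` is a hypothetical name, and it assumes the request line sits at the very start of the payload.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split the request line "METHOD SP target SP version"
 * at its first two spaces. On success, *target points into the payload
 * (no copy is made), *target_len is its length, and the return value is the
 * method length (3 for "GET", 4 for "POST"). Returns 0 if the first line of
 * the payload does not have that shape.
 */
size_t http_request_target(const unsigned char *payload, size_t len,
                           const unsigned char **target, size_t *target_len)
{
    /* Only look at the first line, so spaces in later headers are ignored. */
    const unsigned char *eol = memchr(payload, '\n', len);
    if (eol != NULL)
        len = (size_t)(eol - payload);

    /* The first space ends the method name. */
    const unsigned char *sp1 = memchr(payload, ' ', len);
    if (sp1 == NULL || sp1 == payload)
        return 0;
    size_t method_len = (size_t)(sp1 - payload);

    /* The target starts right after that space: payload + method_len + 1. */
    const unsigned char *start = sp1 + 1;
    size_t remaining = len - method_len - 1;

    /* The second space ends the target, just before "HTTP/1.1". */
    const unsigned char *sp2 = memchr(start, ' ', remaining);
    if (sp2 == NULL || sp2 == start)
        return 0;

    *target = start;
    *target_len = (size_t)(sp2 - start);
    return method_len;
}
```

For a GET request the returned method length is 3, so the target begins at `payload + 3 + 1`, which is the same `tcpPayload + 4` as in the line you asked about.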
brickner
I am using libpcap on Linux. I was aiming at capturing HTTP packets and extracting some useful information, but reconstruction is undoubtedly too difficult for now. Should I target FTP instead, which contains just a code and a message? Is that feasible? For example, I would like to extract the username and password from FTP messages.
Ishi
I don't understand how FTP is related. FTP also needs to be reconstructed in order to fully parse it.
brickner
I read that FTP sniffing is easier compared to HTTP, because every packet contains just a message and the data. I agree FTP will have a reconstruction issue too, but I think it would be easier. Is it?
Ishi
The TCP reconstruction won't be easier. The FTP protocol itself is simple, but I'm not sure what you want to extract from it. It has different parameters and values than HTTP so you won't be able to get the same fields.
brickner
+1  A: 

Why don't you use a Wireshark Dissector?

Jay