So, for a CS project I'm supposed to sniff a network stream and build files from that stream. For example, if the program is pointed to ~/dumps/tmp/ then the directory structure would be this:
~/dumps/tmp
    /192.168.0.1/
        page1.html
        page2.html
        [various resources for pages 1 & 2]
        downloaded file1
    /192.168.0.2/
        so on and so forth.
I'm doing this in C with pcap on Linux (since I already know C++, and figure the learning experience would be good).
Thus far, I've been looking at the various header formats for TCP/IP.
As I figure it, I can group the packets by their src/dst addresses and then put them in the right order using the TCP sequence and acknowledgement numbers.
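To make that idea concrete, here is a minimal sketch of the grouping and ordering step. Everything in it is an assumption for illustration: the names `flow_key`, `segment`, `seq_before`, and `segment_cmp` are hypothetical, the key includes the TCP ports as well as the addresses (two downloads between the same pair of hosts would otherwise blur together), and sequence numbers are stored relative to the first one seen so that 32-bit wraparound doesn't break the sort.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key for one direction of one TCP connection. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Two packets belong to the same stream only if all four fields match. */
static int flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Nonzero if sequence number a comes before b, allowing for the
 * 32-bit counter wrapping around (modular comparison). */
static int seq_before(uint32_t a, uint32_t b)
{
    return ((uint32_t)(a - b) & 0x80000000u) != 0;
}

/* Offset of a segment from the initial sequence number of its flow. */
static uint32_t rel_seq(uint32_t seq, uint32_t isn)
{
    return (uint32_t)(seq - isn);
}

/* Hypothetical record for one captured TCP payload. */
struct segment {
    uint32_t rel_seq;     /* byte offset within the stream */
    const uint8_t *data;  /* payload bytes */
    size_t len;           /* payload length */
};

/* qsort comparator: order segments by their offset in the stream. */
static int segment_cmp(const void *pa, const void *pb)
{
    const struct segment *a = pa;
    const struct segment *b = pb;
    if (a->rel_seq < b->rel_seq) return -1;
    if (a->rel_seq > b->rel_seq) return 1;
    return 0;
}
```

Sorting a flow's segments with `qsort(segs, n, sizeof segs[0], segment_cmp)` and then concatenating the payloads would give the byte stream. This sketch does not deal with retransmissions or overlapping segments, which a real capture will contain.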
But that leaves me with a big question mark: how do I figure out that packets a-z are part of an HTML file while packets A-Z are part of some random file being downloaded, and so on?
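One possible angle, sketched below, assumes the stream has already been reassembled and happens to be a plain HTTP response: the headers end at the first blank line (`\r\n\r\n`), and a `Content-Type` header says what kind of file the body is. The function names `http_body_offset` and `http_header_value` are hypothetical, and the sketch ignores chunked transfer encoding and compression.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first body byte (just past "\r\n\r\n"), or -1 if the
 * end of the headers has not been seen yet. */
static long http_body_offset(const char *buf, size_t len)
{
    for (size_t i = 0; i + 3 < len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}

/* Case-insensitive comparison of the first n bytes of a and b. */
static int same_nocase(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Copy the value of header `name` (e.g. "Content-Type") into out.
 * Only the first hdr_len bytes of buf are searched.
 * Returns 1 if the header was found, 0 otherwise. */
static int http_header_value(const char *buf, size_t hdr_len,
                             const char *name, char *out, size_t out_size)
{
    size_t name_len = strlen(name);
    size_t pos = 0;

    if (out_size == 0)
        return 0;

    while (pos < hdr_len) {
        /* Find the end of the current line. */
        size_t end = pos;
        while (end < hdr_len && buf[end] != '\r' && buf[end] != '\n')
            end++;

        if (end - pos > name_len && buf[pos + name_len] == ':' &&
            same_nocase(buf + pos, name, name_len)) {
            /* Skip the colon and any leading whitespace. */
            size_t v = pos + name_len + 1;
            while (v < end && (buf[v] == ' ' || buf[v] == '\t'))
                v++;
            size_t n = end - v;
            if (n >= out_size)
                n = out_size - 1;   /* truncate to fit */
            memcpy(out, buf + v, n);
            out[n] = '\0';
            return 1;
        }

        /* Move past the line terminator to the next line. */
        pos = end;
        while (pos < hdr_len && (buf[pos] == '\r' || buf[pos] == '\n'))
            pos++;
    }
    return 0;
}
```

With something like this, a body whose `Content-Type` is `text/html` would be saved as a page, and anything else as a downloaded file. The file name itself would have to come from the matching request, which this sketch does not look at.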
Also, what other header formats should I be looking up? Currently, I have TCP, Ethernet, and UDP, and I'll get around to things like FTP (I'm pretty sure FTP is built on top of TCP, as is HTTP).
I'd post more hyperlinks, but I apparently need reputation to do that, sorry.
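To show how those headers stack, here is a rough sketch that walks a raw frame from Ethernet to IPv4 to TCP and returns where the payload (HTTP, FTP, etc.) starts. The assumptions are mine: plain Ethernet II framing with no VLAN tag, IPv4 only, no IP fragmentation, and the function name `tcp_payload_offset` is hypothetical. It reads the bytes by hand so it doesn't depend on any system header structs.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SK_ETH_HDR_LEN    = 14,      /* dst MAC + src MAC + EtherType */
    SK_ETHERTYPE_IPV4 = 0x0800,
    SK_IP_PROTO_TCP   = 6,
    SK_MIN_IP_HDR     = 20,
    SK_MIN_TCP_HDR    = 20
};

/* Offset of the TCP payload within frame, or -1 if the frame is not
 * IPv4/TCP or is too short to hold the headers it claims to have. */
static long tcp_payload_offset(const uint8_t *frame, size_t len)
{
    if (len < (size_t)(SK_ETH_HDR_LEN + SK_MIN_IP_HDR))
        return -1;

    /* EtherType is the last two bytes of the Ethernet header. */
    unsigned ethertype = ((unsigned)frame[12] << 8) | frame[13];
    if (ethertype != SK_ETHERTYPE_IPV4)
        return -1;

    const uint8_t *ip = frame + SK_ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)
        return -1;                              /* not IP version 4 */

    /* Low nibble of byte 0 is the header length in 32-bit words. */
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;
    if (ihl < SK_MIN_IP_HDR || len < SK_ETH_HDR_LEN + ihl + SK_MIN_TCP_HDR)
        return -1;

    if (ip[9] != SK_IP_PROTO_TCP)
        return -1;                              /* e.g. UDP is 17 */

    /* High nibble of TCP byte 12 is the data offset in 32-bit words. */
    const uint8_t *tcp = ip + ihl;
    size_t doff = (size_t)(tcp[12] >> 4) * 4;
    if (doff < SK_MIN_TCP_HDR || len < SK_ETH_HDR_LEN + ihl + doff)
        return -1;

    return (long)(SK_ETH_HDR_LEN + ihl + doff);
}
```

In a pcap callback this would be called with the packet bytes and the captured length, assuming the capture's link type is Ethernet.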
So, in short, how do I find files in a network stream, and am I missing any major protocols that I'll need to be able to read?
REPLY: I can't figure out how to reply, so this will have to do.
I have used pcap on several occasions, and will do so again for this project, but I won't use any of Wireshark's stuff (although it is a great program) because I want to no-kidding learn this kind of stuff.
Yeah, I'll look into the OSI model. Any suggestions for a good site that covers common protocols?
And I guess I should stop, before this 'question' becomes a discussion.