The goal is to mine packet headers for URLs visited using tcpdump.

So far, I can save a packet header to a file using:

tcpdump "dst port 80 and tcp[13] & 0x08 = 8" -A -s 300 | tee -a ./Desktop/packets.txt

And I've written a program to parse through the header and extract the URL when given the following command:

cat ~/Desktop/packets.txt | ./packet-parser.exe

But what I want to be able to do is pipe tcpdump directly into my program, which will then log the data:

tcpdump "dst port 80 and tcp[13] & 0x08 = 8" -A -s 300 | ./packet-parser.exe

Here is the program as it stands. The question is: how do I need to change it to support continuous input from tcpdump?

#include <boost/regex.hpp>
#include <fstream>
#include <cstdio> // Needed for std::perror
#include <string>
#include <iostream>

int main()
{
    // Make sure to open the file in append mode
    std::ofstream file_out("/var/local/GreeenLogger/url.log", std::ios::app);
    if (not file_out)
        std::perror("/var/local/GreeenLogger/url.log");
    else
    {
        std::string text;
        // Get multiple lines of input -- raw
        std::getline(std::cin, text, '\0');
        const boost::regex pattern("GET (\\S+) HTTP.*?[\\r\\n]+Host: (\\S+)");
        boost::smatch match_object;
        bool match = boost::regex_search(text, match_object, pattern);
        if(match)
        {
            std::string output;
            output = match_object[2] + match_object[1];
            file_out << output << '\n';
            std::cout << output << std::endl;
        }
        file_out.close();
    } 
}

Thank you ahead of time for the help!

A: 

Edit: Sorry, this is completely wrong. See new answer.

I haven't tried it, but this will probably work:

while(std::getline(std::cin, text, '\0')) {
  // ...
}
file_out.close();
dave
A: 

You're going to have to make some bigger changes to get it to work. Your getline call, delimited by '\0', won't return until it either sees a '\0' (which isn't going to happen) or the input reaches EOF. That's why it works when you read from a file (which eventually reaches EOF) but not when streaming straight from tcpdump, whose output never ends while the capture is running.

I'm having trouble compiling with Boost, so I'm still not totally sure this will work. But you should get the idea. [Edit: It works :-)]

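std::string text, line1, line2;

// Prime the two-line window with the first line
std::getline(std::cin, line1);
while(std::getline(std::cin, line2)) {
  // Join each pair of adjacent lines so the GET line and the Host line after it are searched together
  text = line1 + '\n' + line2;
  // ... (run regex_search on text and log any match, as in your original code)
  line1 = line2;
}
file_out.close();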
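
For reference, the loop above could be fleshed out roughly like this. Treat it as a sketch under a few assumptions rather than the exact code that was tested: it uses std::regex in place of boost::regex so it builds without Boost, log_urls is a hypothetical helper name, and it assumes the Host header is on the line directly after the GET line.

```cpp
#include <istream>
#include <ostream>
#include <regex>
#include <string>

// Hypothetical helper: scans `in` line by line and writes host + path for
// every GET request whose Host header is on the very next line.
// std::regex stands in for boost::regex here (an assumption, see above).
void log_urls(std::istream& in, std::ostream& out)
{
    static const std::regex pattern("GET (\\S+) HTTP.*?[\\r\\n]+Host: (\\S+)");
    std::string line1, line2;

    // Prime the two-line window; give up if there is no input at all.
    if (!std::getline(in, line1))
        return;

    while (std::getline(in, line2))
    {
        // Search the current pair of adjacent lines.
        const std::string text = line1 + '\n' + line2;
        std::smatch match;
        if (std::regex_search(text, match, pattern))
        {
            out << match[2].str() << match[1].str() << '\n';
            out.flush(); // write each hit out immediately
        }
        line1 = line2; // slide the window forward by one line
    }
}
```

Calling log_urls(std::cin, file_out) from main would connect it to the tcpdump pipe from the question. Taking the streams as parameters also makes it easy to try out on a string stream before pointing it at live traffic.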
dave
dave, thanks a lot. This worked perfectly. I also had to add file_out.flush(); after writing to the output file, because the file never actually gets closed if I kill the process from the terminal (and so the buffered output never reaches the log). I'm new to C++ and I have no formal programming education, so I doubt I'd have ever figured that out. Thanks a lot!
GreeenGuru
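
For anyone who hits the same flush problem, here is a minimal sketch of what the flush buys you. The names append_and_flush and read_all are made up for illustration, and the log path is whatever the caller opens.

```cpp
#include <fstream>
#include <string>

// Appends one entry to an already-open log stream and flushes, leaving the
// stream open the way a long-running logger would.
void append_and_flush(std::ofstream& log, const std::string& entry)
{
    log << entry << '\n';
    log.flush(); // hand the buffered bytes to the OS now, not at close()
}

// Reads the whole file back through a separate stream, standing in for
// another process looking at the log while the logger is still running.
std::string read_all(const std::string& path)
{
    std::ifstream in(path);
    std::string contents, line;
    while (std::getline(in, line))
        contents += line + '\n';
    return contents;
}
```

Without the flush, a short entry can sit in the ofstream's buffer until the stream is closed or the buffer fills up. If the process is killed first, that entry is lost.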