I have a system with a download.php page. The page takes an id, loads a file based on the matching DB record, and then serves it up. I've noticed a couple of instances where a file is requested multiple times within a very short time span (20ms), too quick to be human input. There are plenty of instances where the downloader functions fine. However, in taking a closer look at the downloader’s usage, I did see some interesting behavior.

For instance, the IP address xxx.xxx.xxx.xxx (which is one in a range owned by xxxxxx.de in Germany) came to the site through Google. They browsed around and then came to the page http://site.com/xxxx/press+125.php. There they issued a request for /download.php?id=/ZZ/n+aH55Y= (a PDF) at 9:04:23AM. That alone is not a big deal. What is interesting is that the server seems to have been quite preoccupied with serving that request. In the logs, the request first completes between 9:09:48 and 9:10:00. It looks like the user got tired of waiting during that time and requested the document two more times. Between 09:14:47 and 09:15:00 the same request appears again, except that it started at 9:04:43AM, 20 seconds later than the first request. Then it pops up a third time, with a request that started at 09:05:06 completing between 09:19:55 and 09:19:58!

I’m suspicious of that document. Looking through the logs, I see other instances where the server takes a while to handle that specific file. Check out this list of requests from zzz.zzz.zzz.zzz (a different address than the one above) for the file /download.php?id=/ZZ/n+aH55Y= (the same document as before):

Request time    Complete time
04:32:43        04:33:36
04:32:50        04:33:36
04:32:51        04:33:38
04:33:05        04:33:38
04:33:34        04:33:42
04:33:05        04:33:42
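To make the delays in that list easier to compare, here is a rough sketch that works out how long each request took. The helper name elapsed_seconds() is made up for illustration, and it assumes every timestamp falls on the same day.

```php
<?php
// Request/complete pairs copied from the log excerpt above.
$pairs = [
    ['04:32:43', '04:33:36'],
    ['04:32:50', '04:33:36'],
    ['04:32:51', '04:33:38'],
    ['04:33:05', '04:33:38'],
    ['04:33:34', '04:33:42'],
    ['04:33:05', '04:33:42'],
];

/**
 * Seconds between two HH:MM:SS timestamps on the same day.
 * (Hypothetical helper; a fixed date and UTC keep DST out of the picture.)
 */
function elapsed_seconds(string $start, string $end): int
{
    return strtotime("1970-01-01 $end UTC") - strtotime("1970-01-01 $start UTC");
}

$durations = array_map(
    fn(array $pair): int => elapsed_seconds($pair[0], $pair[1]),
    $pairs
);

echo implode(', ', $durations), "\n"; // prints "53, 46, 47, 33, 8, 37"
```

So individual requests for this one PDF took anywhere from 8 to 53 seconds, and several of them finished within the same few seconds of each other.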

So something is definitely going on. Whether it has to do with this specific document tripping up the server, the download.php page’s code, or if we’re just seeing the evidence of some server level overload as it plays out in real time I’m not yet sure.

In fairness, there are other instances of people downloading /download.php?id=/ZZ/n+aH55Y= (the same PDF) without error. However, it is interesting that the multiple requests only seem to happen with this one file, and then only when it is accessed through the page http://site.com/press+125.php. It bears further investigation whether something is amiss inside the code that causes the system to fire off multiple download requests that occupy the server.

I don't know if this press+125.php page is a rabbit hole, but it is a weird coincidence.

Any ideas? I'm completely out of them. Could Apache be maxed out, or something along those lines?

// download.php
$file = new files();
$file->comparison_filter("id", "=", $id); // SQL filter used to load the record
if ($file->load()) {
    $file->serve();
}

// files class
function serve() {
    if ($this->is_loaded) {
        if (file_exists($this->get_value("filename"))) {
            if ($this->get_value("content_type") != "") {
                header("Content-Type: " . $this->get_value("content_type"));
            }
            header("Content-Length: " . filesize($this->get_value("filename")));
            // Non-image files are sent as attachments
            if ($this->get_value("flag_image") == 0 || $this->get_value("flag_image") == false) {
                header("Cache-Control: private");
                header("Content-Disposition: attachment; filename=" . urlencode($this->get_value("original_filename")));
            }

            set_time_limit(0);
            @readfile($this->get_value("filename"));

            exit;
        }
    }
}
A: 

Use a CDN for file downloads. It will handle this for you and also provide bandwidth and scalability, so there are no more lock-ups on your server. http://www.reelseo.com/free-cdn-velocix/

Pentium10
CDN isn't an option. Some of the items are proprietary and confidential. I stripped out some code that detects authentication. Plus, there's an admin in place to upload the files.
easement
A: 
  1. Have you analyzed the User-Agent and Referer headers of the HTTP requests?
  2. Why not serve all static files from Apache or whatever you have? If you want to track download stats, you can redirect from your script to the static file.
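
A minimal sketch of the redirect idea in point 2, assuming the files are also reachable under a directory that Apache serves directly. STATIC_BASE_URL and build_static_url() are hypothetical names that do not exist in the original download.php, and the base URL is a placeholder.

```php
<?php
// Placeholder: wherever Apache exposes the stored files (an assumption).
const STATIC_BASE_URL = 'http://site.com/static-files';

/**
 * Map a stored file path to the URL Apache would serve directly.
 * basename() drops the directory part; rawurlencode() escapes spaces
 * and other characters that are unsafe in a URL path segment.
 */
function build_static_url(string $baseUrl, string $storedPath): string
{
    return rtrim($baseUrl, '/') . '/' . rawurlencode(basename($storedPath));
}

// In download.php, once the stats have been recorded, the script would
// hand the transfer over to Apache instead of calling readfile():
//
//   header('Location: ' . build_static_url(STATIC_BASE_URL, $file->get_value('filename')), true, 302);
//   exit;
```

With this approach the PHP process finishes as soon as the redirect is sent, so a slow client ties up only the static file handler. Note that anyone who learns the static URL can skip the script, so this only fits files that do not need access control.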
Yaroslav
There's some authentication tracking. There's a hit to the DB that looks up country and region and then writes it to a log file. We need to capture who is downloading the files as well.
easement
A: 

Add '%D %X' to your logging config - I expect that will answer many of your questions.
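
For reference, a sketch of what that could look like in the Apache configuration, using the standard mod_log_config format strings. The format nickname "timing" and the log path are placeholders. %D records the time taken to serve the request in microseconds, and %X records the connection status when the response finished (X means the connection was aborted before the response completed, + means it may be kept alive, - means it will be closed).

```apache
# Common log format plus request duration (%D) and final connection status (%X)
LogFormat "%h %l %u %t \"%r\" %>s %b %D %X" timing
CustomLog logs/access_timing.log timing
```

Long %D values paired with an X status would suggest clients giving up on slow downloads and retrying.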

C.

symcbean