views:

108

answers:

2

Hello,

I have a file hosting site where users earn a reward for each download of their files. Is there a way to track whether a visitor downloaded the whole file, so that fake partial downloads made just for the reward are not counted?

Thank You.

+1  A: 

Hi,

I implemented a similar solution on a file hosting website.

What you want to use is register_shutdown_function, which registers a callback that runs when a PHP script finishes executing, regardless of the outcome.

Then keep your files in a location on your server(s) that is not web accessible, and have your downloads go through PHP. The idea is that you want to be able to track how many bytes have been passed to the client.

Here's a basic example of how to implement it:

<?php
// Register the callback first so it runs however the script ends.
register_shutdown_function('shutdown');

$file_name = 'file.ext';
$path_to_file = '/path/to/file';
$stat = @stat($path_to_file);

// Set headers
header('Content-Type: application/octet-stream');
header('Content-Length: '.$stat['size']);
header('Connection: close');
header('Content-Disposition: attachment; filename="'.$file_name.'"');

// Open the file and stream it to the client
$fp = fopen($path_to_file, 'rb');
fpassthru($fp);
fclose($fp);

function shutdown() {
  $status = connection_status();

  // A connection status of 0 (CONNECTION_NORMAL) means the client did not disconnect early
  if ($status == 0) {
    // Do whatever to increment the counter for the file.
  }
}
?>

There are obviously ways to improve this, so if you need more details or different behaviour, please let me know!
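One possible improvement, sketched here under some assumptions, is to send the file in chunks and count the bytes handed to the output, so the shutdown callback can compare that count with the file size. The helper names `send_file_counted` and `is_complete_download` are made up for this illustration, and the 8192-byte chunk size is arbitrary.

```php
<?php
// Hypothetical helper: copy $path to the stream $out in chunks and
// return the number of bytes handed to the output.
function send_file_counted($path, $out, $chunk_size = 8192) {
  $fp = fopen($path, 'rb');
  if ($fp === false) {
    return 0;
  }
  $sent = 0;
  // Stop early if the client has disconnected or timed out.
  while (!feof($fp) && connection_status() == CONNECTION_NORMAL) {
    $buffer = fread($fp, $chunk_size);
    if ($buffer === false || $buffer === '') {
      break;
    }
    $written = fwrite($out, $buffer);
    if ($written === false) {
      break;
    }
    $sent += $written;
    if ($written < strlen($buffer)) {
      break; // short write: treat the transfer as interrupted
    }
    fflush($out);
  }
  fclose($fp);
  return $sent;
}

// Hypothetical helper: count a download only if every byte went out
// and the connection was still normal at the end.
function is_complete_download($sent, $size, $status) {
  return $sent === $size && $status === CONNECTION_NORMAL;
}
```

In the download script, `$out` would be something like `fopen('php://output', 'wb')`, and the shutdown callback would call `is_complete_download($sent, filesize($path_to_file), connection_status())` before incrementing the counter. Keep in mind that bytes written by PHP can still be sitting in server or network buffers, so this is an approximation of what the client actually received.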

Maurice Kherlakian
This will kill Apache with big files...
DaNieL
Yes, you're right, it does depend greatly on the site's usage. In our application we piped all file downloads through PHP because we also authenticated the user and throttled the bandwidth. The point is that we had dedicated servers for serving those files, feeding off backend storage servers running lighttpd. We had reduced Apache's footprint by custom compiling it and disabling all but the essential modules. We were able to handle about 130 simultaneous downloads per server over 35 front ends, for a total throughput of 2.3 Gbps...
Maurice Kherlakian
+1  A: 

If you could monitor the HTTP response codes returned by your web server and tie them back to the sessions that generated them, you would be in business.

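A response code of 206 (Partial Content) shows that the response carried only part of the file, not all of it. A complete single-request download is logged as 200, but a ranged request for the final chunk of a file is also answered with 206, so compare the bytes sent against the file size rather than relying on the status code alone.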

If you can tie this to user sessions by putting the session code inside the URL, then you could give points based on a simple log aggregation.
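A rough sketch of such a log aggregation follows. It assumes access log lines in Common or Combined Log Format and a session token passed in a query parameter named `sid`; both the parameter name and the function name `bytes_per_session` are made up for illustration.

```php
<?php
// Hypothetical helper: sum the logged response bytes per "session path" key.
// Assumes Common/Combined Log Format and a made-up "sid" query parameter.
function bytes_per_session(array $log_lines) {
  // Captures: request target, status code, response size ("-" means none).
  $pattern = '/"GET (\S+) HTTP\/[\d.]+" (\d{3}) (\d+|-)/';
  $totals = array();
  foreach ($log_lines as $line) {
    if (!preg_match($pattern, $line, $m)) {
      continue;
    }
    $status = (int) $m[2];
    if ($status !== 200 && $status !== 206) {
      continue; // only full and partial content responses carry file data
    }
    $path = parse_url($m[1], PHP_URL_PATH);
    parse_str((string) parse_url($m[1], PHP_URL_QUERY), $params);
    if (!is_string($path) || !isset($params['sid']) || !is_string($params['sid']) || $params['sid'] === '') {
      continue; // no session token, so nobody to credit
    }
    $key = $params['sid'] . ' ' . $path;
    $bytes = $m[3] === '-' ? 0 : (int) $m[3];
    $totals[$key] = (isset($totals[$key]) ? $totals[$key] : 0) + $bytes;
  }
  return $totals;
}
```

A session would then earn points for a file only when its total reaches the file's size. Summing is only a heuristic: a client that requests the same byte range repeatedly would inflate the total, so a stricter version would track which ranges were actually covered.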

TheJacobTaylor
Can you please show me an example of its usage?
Shishant
I don't have any Apache logs handy. Do you happen to have a snippet of a file transfer from your logs that you can post? Filter it by one file and one IP address to get just the relevant entries. Depending on your log format, it should be relatively easy to find the entry for the last chunk of the file.
TheJacobTaylor