Is this possible to do? I'm currently coding in PHP using the cURL library, but this applies to HTTP as a whole.

The most obvious approach seemed to be sending a HEAD request to the data URL and reading its Content-Length header. The problem is that some servers, including Apache 2.0, do not send Content-Length in response to HEAD requests, and since the header is not mandatory, there is no guarantee that every server will supply it even on a GET request.

I'm having the server download web pages specified by user input and store them, but I don't want it to download an entire response only to find that the file is too large and has to be discarded, since malicious requests could choke the bandwidth that way. So I want to know the size of the content reliably, before the data is actually transferred.

Malicious web servers that send a wrong Content-Length, and other rare oddities, do not concern me, as long as the approach works for the general case.

The worst idea I have so far is to just download the content with a GET request and drop the connection if it exceeds the specified size limit during the transfer, but this seems like a very ugly solution for such a general protocol as HTTP.

Does anyone have any better ideas?

+3  A: 

No, servers don't have to tell you the size of the resource they're about to serve, because they may not know it themselves. So there's no universal way, but you can check the Content-Length header whenever it is provided.
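As a rough illustration of checking Content-Length whenever it is provided, here is a minimal sketch in Python's standard library (chosen only because the question is about HTTP in general; the function names `parse_content_length` and `advertised_size` are made up for this example). It sends a HEAD request and returns `None` when the header is missing or malformed, so the caller has to treat "unknown" as a normal outcome.

```python
import urllib.request


def parse_content_length(value):
    """Return the header value as a non-negative int, or None if absent or malformed."""
    if value is None:
        return None
    value = value.strip()
    # Accept plain ASCII digits only; anything else counts as "unknown".
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


def advertised_size(url, timeout=10):
    """Send a HEAD request and return the advertised size in bytes, or None.

    Hypothetical helper: the result is only what the server claims, and
    None means the server did not say (which the protocol allows).
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers.get("Content-Length"))
```

A caller might reject the URL up front when `advertised_size(url)` is a number above the limit, and fall back to some other safeguard when it is `None`.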

Michael Krelin - hacker
That is not a reliable way. It seems that cutting the connection once the transfer exceeds the given size is the only way.
Yes, it's not; that's why I said *whenever it is provided*. Actually, even when it is provided it may only help with estimation, as nothing prevents the server from advertising 10 bytes of content and feeding the client the entire output of `/dev/urandom`.
Michael Krelin - hacker
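
The fallback this thread converges on, dropping the connection once the body exceeds a limit, could look roughly like the sketch below. It is written with Python's standard library purely for illustration; `TooLarge`, `read_capped`, and `fetch_capped` are hypothetical names, and the 8192-byte chunk size is an arbitrary assumption. The advertised Content-Length is used only as an early rejection hint, while the byte counter is what actually enforces the limit.

```python
import io
import urllib.request


class TooLarge(Exception):
    """Raised when a body exceeds the allowed size (hypothetical name)."""


def read_capped(stream, limit, chunk_size=8192):
    """Read a file-like object in chunks, giving up once more than `limit` bytes arrive.

    At most limit + chunk_size bytes are ever read before the exception is raised.
    """
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of body
        total += len(chunk)
        if total > limit:
            raise TooLarge(f"body exceeds {limit} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


def fetch_capped(url, limit, timeout=10):
    """GET `url`, rejecting early on a large advertised size and capping the actual read."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        declared = response.headers.get("Content-Length")
        # The header is only a hint: it may be absent or simply wrong.
        if declared is not None and declared.strip().isdigit():
            if int(declared) > limit:
                raise TooLarge(f"server advertises {declared} bytes")
        # Leaving the with-block on TooLarge closes the connection.
        return read_capped(response, limit)


if __name__ == "__main__":
    # Offline demonstration with an in-memory stream instead of a real server.
    body = read_capped(io.BytesIO(b"a" * 50), limit=100)
    print(len(body))
```

Because the cap is applied to bytes actually received, a server that advertises 10 bytes and then streams indefinitely is cut off shortly after the limit is crossed.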