I'm writing a script that uploads a file to a CGI script that expects a multipart request, such as one submitted by a form on an HTML page. The boundary is a unique token that delimits the file contents in the request body. Here's an example body:

--BOUNDARY
Content-Disposition: form-data; name="paramname"; filename="foo.txt"
Content-Type: text/plain

... file contents here ...
--BOUNDARY--

The boundary cannot be present in the file contents, for obvious reasons.

What should I do to create a unique boundary? Should I generate a random string, check whether it appears in the file contents, and if it does, generate a new one, rinse and repeat, until I have a unique string? Or would a "pretty random token" (say, a combination of timestamp, process ID, etc.) be enough?

A: 

If you are feeling paranoid, you can generate a random boundary and search for it in the data to be sent; if it is found, append a random character and check again. But in my experience, an arbitrary non-dictionary string of 10 or so characters will practically never occur in the data, so picking something like ---BOUNDARY---BOUNDARY---BOUNDARY--- is perfectly sufficient.
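The "search and append on a hit" approach from the first sentence could be sketched roughly as follows in Python. The function name `make_boundary`, the alphabet, and the default length of 16 are illustrative assumptions, not something from the question. Note that RFC 2046 caps a boundary at 70 characters, which this sketch does not enforce.

```python
import secrets
import string

# Assumed alphabet: letters and digits are all legal boundary characters.
ALPHABET = string.ascii_letters + string.digits


def make_boundary(payload: bytes, length: int = 16) -> str:
    """Return a random token that does not occur anywhere in payload."""
    boundary = "".join(secrets.choice(ALPHABET) for _ in range(length))
    # On a collision, append one more random character and check again.
    # This terminates: a token longer than the payload cannot occur in it.
    while boundary.encode("ascii") in payload:
        boundary += secrets.choice(ALPHABET)
    return boundary
```

Checking for the bare token anywhere in the payload is stricter than necessary, since only a line starting with `--` plus the token acts as a delimiter, but it keeps the check simple.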

SF.
No, it is not sufficient, because you won't be able to send your program's source code (or this comment) using your program.
stepancheg
@stepancheg: It seems you are feeling paranoid; in that case, use the solution from the first sentence of my answer. If you are mentally healthy, though, use `Content-Encoding: gzip` and stop worrying about users out there trying to get you.
SF.
+2  A: 

If you use something random enough, like a GUID, there shouldn't be any need to hunt through the payload to check for an occurrence of the boundary. Something like:

----=NextPart_3676416B-9AD6-440C-B3C8-FC66DDC7DB45
Header:....

Payload
----=NextPart_3676416B-9AD6-440C-B3C8-FC66DDC7DB45--
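As a minimal sketch of this idea, assuming Python and its standard `uuid` module: the helper names `guid_boundary` and `build_body` are hypothetical, and the `----=NextPart_` prefix simply mirrors the example above. The field name, filename, and content type are taken from the sample body in the question.

```python
import uuid


def guid_boundary() -> str:
    """Return a boundary built from a random (version 4) UUID."""
    return "----=NextPart_" + str(uuid.uuid4()).upper()


def build_body(field: str, filename: str, content: bytes,
               boundary: str, content_type: str = "text/plain") -> bytes:
    """Assemble a single-file multipart/form-data body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing delimiter is the boundary followed by two extra dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + content + tail


boundary = guid_boundary()
body = build_body("paramname", "foo.txt", b"... file contents here ...", boundary)
```

The same boundary string would also need to go into the request header, for example `Content-Type: multipart/form-data; boundary=` followed by the value returned by `guid_boundary()`.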

AnthonyWJones
Thanks! Your answer is just as good as the tagged answer, but he needed the rep more than you did ;)
August Lilleaas