What is the best character to use to delimit user input?

For example, if a user has an arbitrary number of textboxes to type things into, but each textbox's value will be concatenated into a single database field, what is the safest character to delimit each input?

I think it should be a character not on your typical keyboard. Is there a character out there just for this?

+1  A: 

Normally commas or semicolons are used for splitting data. What about |, which the average user rarely types?

James Brooks
But it is possible for all of those characters to be typed by the user. I have thought about a pipe or two pipes, but I would like to find something that couldn't be typed.
Ronnie Overby
Well, anything can be typed. You could use some jQuery or JavaScript to block that character from being entered.
James Brooks
True. But I can still outrun the average goat.
Ronnie Overby
Which is exactly what the | (pipe) will do.
James Brooks
It all becomes so clear.
Ronnie Overby
+8  A: 

You could use one of the ASCII control characters. There's one called "Record Separator" which has a hex value of 0x1E that might fit your needs.

Edit: Incidentally, if you want to do a proper job, you should probably ensure that \x1E is escaped in user input. One way to do this would be to use another ASCII control character: \x1B which is the "escape" control code. Thus, "\x1E" in input becomes "\x1B\x1E" and "\x1B" becomes "\x1B\x1B".

Keep in mind, of course, that because these are non-printing control codes, they can't be displayed. If you want a printable representation, you might want to go with a normal character like the comma and just escape it from input.
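
A minimal Python sketch of the escaping scheme described above, assuming the values are plain strings. The helper names `escape`, `join_fields`, and `split_fields` are made up for illustration and are not from any library.

```python
RS = "\x1e"   # ASCII Record Separator, used as the delimiter
ESC = "\x1b"  # ASCII Escape, used as the escape character


def escape(value: str) -> str:
    """Prefix every ESC or RS in the value with ESC."""
    # Escape ESC first so the ESC added in front of RS is not doubled again.
    return value.replace(ESC, ESC + ESC).replace(RS, ESC + RS)


def join_fields(values: list[str]) -> str:
    """Concatenate escaped values into one string for a single DB field."""
    return RS.join(escape(v) for v in values)


def split_fields(stored: str) -> list[str]:
    """Reverse join_fields: split on unescaped RS and drop the escapes."""
    fields, current = [], []
    chars = iter(stored)
    for ch in chars:
        if ch == ESC:
            # Whatever follows ESC is literal data, even if it is RS or ESC.
            current.append(next(chars, ""))
        elif ch == RS:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields
```

One thing this sketch does not distinguish is an empty list from a list holding a single empty string: both are stored as an empty string, so that case would need its own convention.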

DK
What about for Unicode?
Ronnie Overby
What about it? ASCII occupies the first 128 code points of Unicode, meaning that any ASCII character is the same in Unicode.
DK
Oh yeah. Duh.
Ronnie Overby
+1 for advocating escaping the sequence. ANYTHING can be typed if you're not validating your input to remove it. Using an escape sequence is the only safe way. But that said, the right way to do this is to not try to store multiple values in one DB column... why don't you create a table for them and make them one row per value?
rmeador
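
As a rough illustration of the one-row-per-value idea from the comment above, here is a sketch using Python's built-in `sqlite3` module. The table name `form_value`, its columns, and the helpers `save_values` and `load_values` are all hypothetical; the asker's real schema is not shown in this thread.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE form_value (
        form_id  INTEGER NOT NULL,  -- which form the value belongs to
        position INTEGER NOT NULL,  -- order of the textbox on the form
        value    TEXT    NOT NULL,  -- raw user input, no delimiter needed
        PRIMARY KEY (form_id, position)
    )
    """
)


def save_values(conn, form_id, values):
    """Replace all stored values for a form, one row per textbox."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM form_value WHERE form_id = ?", (form_id,))
        conn.executemany(
            "INSERT INTO form_value (form_id, position, value) VALUES (?, ?, ?)",
            [(form_id, i, v) for i, v in enumerate(values)],
        )


def load_values(conn, form_id):
    """Return the values for a form in their original order."""
    rows = conn.execute(
        "SELECT value FROM form_value WHERE form_id = ? ORDER BY position",
        (form_id,),
    )
    return [row[0] for row in rows]
```

Because each value lives in its own row and is passed as a bound parameter, no character needs to be reserved or escaped.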
A: 

How about a combination of keys? e.g.

|::|

so

this|::|and|::|that. Plus Those:Here and there.|::|Even this|that works
scunliffe
It can be typed.
Ronnie Overby
Oh for sure... but the likelihood is much lower than with a single character, e.g. a pipe. The "Record Separator" is likely the way to go.
scunliffe
+2  A: 

I guess one approach is to use a comma, and then to escape commas within the user input. It's probably not safe to assume any character (or even a sequence of characters) can't appear in user input -- if you can enter it in your code, then there's a way the user can enter it into a text box!

Ben
A: 

Any markup language will do for this. They're a little verbose, but at least they'll future-proof your field.
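
For instance, here is a small sketch that picks XML as the markup language (an assumption, since the answer does not name one) and uses Python's standard `xml.etree.ElementTree`. The element names `values` and `value` and the helpers `to_xml` and `from_xml` are invented for the example.

```python
import xml.etree.ElementTree as ET


def to_xml(values):
    """Serialize a list of strings as one XML string for a single field."""
    root = ET.Element("values")
    for v in values:
        # ElementTree escapes &, < and > in text when serializing.
        ET.SubElement(root, "value").text = v
    return ET.tostring(root, encoding="unicode")


def from_xml(stored):
    """Parse the stored XML back into a list of strings."""
    root = ET.fromstring(stored)
    # An empty element has text None, so map that back to "".
    return [el.text or "" for el in root.findall("value")]
```

Note that XML 1.0 does not allow most ASCII control characters, so input containing them would still need to be filtered or encoded before being stored this way.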

rein
A: 
Dmitri Farkov
ftw ?
Ronnie Overby
ftw stands for 'For the win' :)
Dmitri Farkov