If you are exposing randomly generated strings, or strings with data encoded in them (product keys), is it irrational to sanitize them for curse words, to avoid offending a client in the rare case that an offensive word is generated?

Anybody ever have a customer get offended by a randomly generated curse word? Anybody out there ever code logic to filter them out?

Thanks

Edit

Once, after developing a product key generation system that had customer data encoded into it, we wrote a program as a joke to see what customer input would generate funny words.

+20  A: 

Yes, on the grounds that anyone who would be offended by something they saw in a randomly generated string can think of more things they find offensive than you can sanitise.

Don't optimise for the insane.

annakata
+1 for the last sentence! love it.
Brian Postow
+10, if I could.
Tomalak
Don't optimise for the insane ... unless that's your primary market group. I've always found the sanity of the general public to be questionable.
The Digital Gabeg
They're not insane. They just have what I like to think of as clbuttical consbreastution.
David Berger
Managed to catch a random downvote here, so it's not universal, but seriously you may as well take offence at cloud formations or burn patterns on toast. Humans, eh? Bring on the Vogons I say.
annakata
@David: LMFAO
annakata
You guys are turning SO into Reddit.
Steve Kuo
Thanks for registering! Your activation code is THEHOLOCAUSTNEVERHAPPENED and your password is N*GGER666. We look forward to serving you!
Barry Fandango
I voted down because perception is reality. Customers may be insane, but you want to keep them as your customers.
Jeffrey Hines
@Jeffrey: thanks for explaining. I take your point, but what I'm saying is that the capacity for offence is greater than the practicable ability to defend against it. As S.Lott said, where do you draw a line?
annakata
@annakata Obviously you have to consider the effort involved to protect against this issue and weigh it against the other tasks that you can perform instead. But my solution stated below should be quick to implement.
Jeffrey Hines
-1 it's not irrational. All it takes is for one code to be generated like @Barry's example and then one person to post a picture online and you're sol.
dotjoe
@Jeffrey - yeah, it's a good enough idea at its core, but notice how it's immediately insufficient w.r.t. 0, 1, V, 666, etc. It doesn't address the potential for offence, and I just can't see this as a reasonable use of time.
annakata
Profanities have come up in randomly generated strings before, and it's resulted in bad PR: http://www.clickondetroit.com/news/4050844/detail.html
Frank Farmer
I also -1ed for the reason given by Jeffrey Hines. I *deplore* the fact that some people look for any opportunity to get offended, but it's a practical reality of doing business. We have to operate with the world as it is, not how we would like it to be. :(
j_random_hacker
A: 

It's certainly conceivable, but I wouldn't devote much time to it, especially if you've got letters and numbers.

CookieOfFortune
+1  A: 

Limit your randomly generated "words" to hex characters and I don't believe you'll have any English-language curses. This also pushes you down a path of not spending too much time on your random word generator.

Of course, there may be some language where you can curse with hex digits, but then you're not likely to know/filter those curses anyway.

kdgregory
Well, fecc e00 2.
chaos
0xdeadbeef, 0xaffe (german for "monkey").
David Schmitt
dead beef?!?! As a vegetarian I'm HIGHLY offended.
Aardvark
+24  A: 

Don't generate random strings with vowels and then you don't have to worry about curse words.

Jeffrey Hines
Great idea. Should work in most languages as well.
Laserallan
While I generally agree, you might still end up with strings like "fck" or "fvck". This probably still falls under "don't optimize for the insane."
luke
@luke: Yeah, I think that this method will get rid of the "real" curse words. Imagined or "kinda looks like a curse word if you squint really hard" words are still the user's problem :)
rmz
Using guids likely will prevent any of these issues.
Nate Bross
@luke Good point, but people who want to see things to be offended by will see something no matter what you do.
Jeffrey Hines
Should Y be considered a vowel?
Ryu
Oh, and for the love of Mike, please don't put in characters that render similarly (e.g., O and 0, I and 1, p and q, d and b, l and 1, l and I). Use only one from each pair.
plinth
@plinth Actually create an array of acceptable characters and exclude vowels and characters that can be misinterpreted, then randomly pick from that array.
Jeffrey Hines
@plinth Also a lot depends on the font you print or display these codes in. Using a computer type font will clearly indicate the character usually.
Jeffrey Hines
Yeah, fonts meant for use in consoles are typically very carefully designed so that '1', 'I', 'l', and '|' are easily distinguishable; these are the same fonts that put a slash through the '0'.
Adam Jaskiewicz
This is much better than having ONLY vowels, because you get so many Hawaiian curse words with all vowels.
Nosredna
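A minimal Python sketch of the idea from the comments above: build an array of acceptable characters (no vowels, no look-alikes), then pick randomly from it. The alphabet, the name `generate_key`, and the five-groups-of-five layout are assumptions for illustration, not anything specified in this thread.

```python
import secrets

# Hypothetical alphabet: uppercase consonants and digits only.
# Dropped: vowels and Y, plus look-alikes (0/O, 1/I/L, 2/Z, 5/S, 8/B).
ALPHABET = "CDFGHJKMNPQRTVWX34679"


def generate_key(groups: int = 5, group_len: int = 5) -> str:
    """Return a random key such as 'XXXXX-XXXXX-...' drawn from ALPHABET."""
    return "-".join(
        "".join(secrets.choice(ALPHABET) for _ in range(group_len))
        for _ in range(groups)
    )


if __name__ == "__main__":
    print(generate_key())
```

As luke points out, vowel-free strings can still contain sequences like "FCK", so this reduces the risk rather than eliminating it.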
+4  A: 

That makes sense to me. I mean, it would be a pretty bad PR disaster if someone posts a picture of your product, with this stamped on the back of the CD case:

12345-67890-F**KU-ABCDE-FGHIJ

It sounds funny but you never know what kind of sense of humor the person will have who happens to pick up that package.

The Digital Gabeg
This isn't just a hypothetical, either. It really happened, with a cabbage patch doll: http://www.clickondetroit.com/news/4050844/detail.html
Frank Farmer
+1  A: 

If you are just worried about product keys, I would stick to hexadecimal digits; maybe even a GUID would work for you. There is probably little chance of a "naughty" word being generated under these constraints. You could also just stick to numbers. If you must have random strings with all letters of the alphabet, it is probably better to be safe than sorry, so I would do the filtering.

John JJ Curtis
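For illustration, here is a short Python sketch of the hex and GUID options using only the standard library. The function names `hex_key` and `guid_key` and the default length are assumptions.

```python
import secrets
import uuid


def hex_key(n_bytes: int = 10) -> str:
    """Return 2 * n_bytes random hexadecimal digits, uppercased."""
    return secrets.token_hex(n_bytes).upper()


def guid_key() -> str:
    """Return a random (version 4) GUID in its usual 36-character form."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(hex_key())
    print(guid_key())
```

Hex digits still include A through F, so strings like DEADBEEF remain possible, as other comments here note.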
+1  A: 

See those items tagged with clbuttic

Cade Roux
A: 

I'm using randomly generated, phonetic-sounding passwords for one webapp I wrote. I did end up hard-coding a list of "dirty" words that aren't acceptable, but the list that matched my pattern ended up being pretty short.

Marc Charbonneau
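A rough Python sketch of the hard-coded list approach: generate a candidate, reject it if it contains a listed word, and try again. The blocklist entries are mild placeholders, and the names `contains_blocked` and `generate_clean` are hypothetical. It uses plain random letters rather than the phonetic generator described above.

```python
import secrets
import string

# Placeholder entries only; a real list would be tuned to the generator's output.
BLOCKLIST = {"DARN", "HECK"}


def contains_blocked(candidate: str, blocklist=BLOCKLIST) -> bool:
    """Case-insensitive substring check against the blocklist."""
    upper = candidate.upper()
    return any(word in upper for word in blocklist)


def generate_clean(length: int = 12,
                   alphabet: str = string.ascii_uppercase,
                   max_tries: int = 1000) -> str:
    """Generate random strings until one passes the blocklist check."""
    for _ in range(max_tries):
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if not contains_blocked(candidate):
            return candidate
    raise RuntimeError("no acceptable string found within max_tries")


if __name__ == "__main__":
    print(generate_clean())
```

Rejection and retry keeps the generator itself simple. If the string encodes customer data rather than being purely random, regenerating would require changing some input, such as a salt, which this sketch does not cover.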
A: 

No. You have no chance to collect all curse words in all world languages. Those words usually don't appear in dictionaries.

Dev er dev
A GOOD dictionary will include colloquialisms and slang. The OED certainly includes swears in it, and all other sorts of filth my ancestors would be ashamed to admit they did on a weekly basis.
Chris Kaminski
+4  A: 

Simplest solution is to generate from a 'sanitized' alphabet; use a set of characters that cannot possibly form words. One suggestion in one of the answers is hexadecimal which is an excellent choice, or otherwise drop some critical letters from the alphabet.

Note that just dropping vowels is not going to do the job... it is all too easy to infer them from the remaining consonants.

jerryjvl
+2  A: 

I think it's better to plainly avoid vowels. A product key like JKL-YOUAREMYFRIEND-0001-KK may not be offensive but it doesn't sound like serious business either.

tekBlues