I have a web app that sends messages to an Amazon SQS queue. The Amazon SQS library throws an 'AmazonSQSException' because the message contains an invalid binary character. The message is the referrer obtained from an incoming HTTP request. This is what it looks like:

http://ads.vrx.adbrite.com/adserver/display_iab_ads.php?sid=1220459&title_color=0000FF&text_color=000000&background_color=FFFFFF&border_color=CCCCCC&url_color=008000&newwin=0&zs=3330305f323530&width=300&height=250&url=http%3A%2F%2Funblockorkutproxy.com%2Fsearch.php%2FOi8vZG93%2FbmxvYWRz%2FLnppZGR1%2FLmNvbS9k%2Fb3dubG9h%2FZGZpbGUv%2FNTY5MTQ3%2FNi9NeUN1%2FdGVHaXJs%2FZnJpZW5k%2FWmFoaXJh%2FLndtdi5o%2FdG1s%2Fb0%2F>^Fô}úÃ<99>ë)j

It looks like the characters in bold are the invalid ones. Is there an easy way to filter out characters that are not accepted by Amazon?

Here are the characters allowed by Amazon in a message body. I am not sure what regex I should use to replace the invalid characters with ''.

A:

It depends on what programming language you're using. Several languages let you translate the Amazon specification you linked to directly into a regular expression meaning "one or more characters not in the permitted ranges".

For example, Perl:

$referer =~ s/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]+//g;
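The same negated character class carries over to other languages. As a rough sketch in Python (an assumption, since the question doesn't say which language the app is written in; the function name `strip_invalid_sqs_chars` is made up for illustration), it might look like this:

```python
import re

# Negated character class: matches runs of characters that fall OUTSIDE
# the ranges used in the Perl example above
# (tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF).
_INVALID_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]+"
)


def strip_invalid_sqs_chars(text: str) -> str:
    """Return text with every character outside the permitted ranges removed.

    Hypothetical helper name; expects an already-decoded str, not raw bytes.
    """
    return _INVALID_CHARS.sub("", text)


if __name__ == "__main__":
    referer = "http://example.com/page\x06\x00?x=1"
    print(strip_invalid_sqs_chars(referer))  # http://example.com/page?x=1
```

This only works on text that has already been decoded into a string. If the referrer arrives as raw bytes that are not valid in the expected encoding, the decoding step needs its own handling before the filter is applied.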

Jonathan Feinberg