I have a Groovy/Grails application that needs to serve images.

It works fine on my dev box; the image is returned properly. Here's the start of the returned JPEG, as seen by od -cx:

0000000  377 330 377 340  \0 020   J   F   I   F  \0 001 001 001 001   ,
             d8ff    e0ff    1000    464a    4649    0100    0101    2c01

But on the production box there's some garbage in front, and the d8ff e0ff before the 1000 is missing:

0000000    �  **  **   �  **  **   �  **  **   �  **  **  \0 020   J   F
             bfef    efbd    bdbf    bfef    efbd    bdbf    1000    464a
0000020    I   F  \0 001 001 001  \0   H  \0   H  \0  \0   �  **  **   �
             4649    0100    0101    4800    4800    0000    bfef    efbd

It's the exact same code. I just moved the .war over and ran it on a different machine. (Isn't Java supposed to be write once, run everywhere?)

Any ideas? An "encoding" problem?

The image is written to the response like this:

   response.contentType = "image/jpeg"; response.outputStream << out;

Here's the code that locates the image on an internal application server and re-serves the image. I've pared down the code a bit to remove the error handling, etc, to make it easier to read.

def show = {
    def address = "http://internal.application.server:9899/img?photoid=${params.id}"

    def out = new ByteArrayOutputStream()
    out << new URL(address).openStream()    

    response.contentLength = out.size();

    // XXX If you don't do this hack, "head" requests won't work!
    if (request.method == 'HEAD')
    { render( text : "", contentType : "image/jpeg" ); }
    else {
        response.contentType = "image/jpeg"; response.outputStream << out;
    }
}

Update: I tried setting the character encoding:

response.setCharacterEncoding("ISO-8859-1");

if (request.method == 'HEAD')
{ render( text : "", contentType : "image/jpeg" ); }
else {
    response.contentType = "image/jpeg;charset=ISO-8859-1"; response.outputStream << out;
}

but it made no difference in the output. On my production machine, the binary bytes in the image are re-encoded/escaped as if they were UTF-8 (see Michael's explanation below). It works fine on my development machine.

+3  A: 

An "encoding" problem?

Absolutely. The sequence "bfef efbd bdbf bfef efbd bdbf" is 4 repetitions of the bytes EF BF BD (od -x prints them as byte-swapped, little-endian 16-bit words), which is the UTF-8 encoding of the U+FFFD REPLACEMENT CHARACTER code point. So at some point, your binary data is being interpreted as UTF-8 character data, and of course it's not valid UTF-8.

Almost certainly your production box uses UTF-8 as platform default encoding while the dev box uses a bijective ISO-8859 encoding.

But the problem here is not the use of the platform default encoding. The problem is that your binary data is converted to character data and back. And that's almost certainly the fault of your code. How do you read the images / create and fill the out variable?
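To make the "converted to character data and back" point concrete, here is a minimal sketch in plain Java (so the charset can be named explicitly instead of relying on the platform default). The class name RoundTripDemo and its helper methods are hypothetical, invented for illustration; the eight header bytes are taken from the dev-box dump in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // First eight bytes of the JPEG, as shown in the question's dev-box dump.
    static final byte[] JPEG_HEADER = {
        (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10, 0x4A, 0x46
    };

    // Decode the bytes to a String and encode them again, which is what an
    // accidental bytes -> text -> bytes conversion amounts to.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Render bytes as space-separated lowercase hex pairs.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps every byte to exactly one character: data survives.
        System.out.println(hex(roundTrip(JPEG_HEADER, StandardCharsets.ISO_8859_1)));
        // UTF-8 replaces each invalid byte with U+FFFD, encoded as ef bf bd.
        System.out.println(hex(roundTrip(JPEG_HEADER, StandardCharsets.UTF_8)));
    }
}
```

With UTF-8, each of the four leading bytes (ff, d8, ff, e0) is invalid on its own and becomes ef bf bd, while 00 10 4a 46 pass through unchanged. That matches the production dump, which would explain why the same .war behaves differently on machines with different default encodings.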

EDIT: Looking at the code, there doesn't seem to be anything obviously wrong. But I'm a bit suspicious of those shift operators and Groovy's type handling and implicit conversions in regard to the overloaded leftShift() method of OutputStream. To pinpoint the problem, try looking at the contents of the ByteArrayOutputStream as well as reading the first bytes directly from the app server, to see where exactly things go wrong.

Or maybe the problem is further down the line. IIRC, Grails uses Sitemesh to provide modular layouts. Perhaps that's the culprit, trying to parse the controller's output as HTML. Not sure how to switch it off, though.

Michael Borgwardt
Thanks, Michael! I'll update the question to give more of a code snippet so you can see what I'm doing.
ראובן
Sitemesh only interferes if the contentType is text/html. See web-app/WEB-INF/sitemesh.xml to configure which contentType Sitemesh kicks in on.
Colin Harrington
I wrote a little "dump" action to print out the first 8 bytes of the stream: def b = out.toByteArray(); render sprintf("%02x, %02x, %02x, %02x, %02x, %02x, %02x, %02x\n", b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7]). They were as expected, the first 8 bytes of a JPEG header. So the data going in via the << operator is OK....
ראובן
Colin: I don't think it's Sitemesh. It looks to be configured to only parse pages with content-type "text/html". And I know the content type I set for my image is processed, because Firebug shows the content-type as image/jpeg.
ראובן
A: 

I fixed it!

Many thanks to Michael Borgwardt who got me pointed in the right direction.

I changed this:

if (request.method == 'HEAD')
{ render( text : "", contentType : "image/jpeg" ); }
else {
    response.contentType = "image/jpeg"; response.outputStream << out;
}

to this:

if (request.method == 'HEAD')
{ render( text : "", contentType : "image/jpeg" ); }
else {
    response.contentType = "image/jpeg"; response.outputStream << out.toByteArray()
}

(Note the "toByteArray()".) That prevented groovy/grails/java/spring/hibernate/tomcat or whatever gets in the way from deciding to re-encode my binary data.
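The general principle behind this fix is to keep the data as bytes the whole way, never passing it through a String, Reader, or Writer. A minimal sketch of a byte-only copy, again in plain Java: the class BinaryCopy and its copy method are hypothetical names for illustration, and the ByteArrayOutputStream in main merely stands in for response.outputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class BinaryCopy {
    // Copy every byte from in to out through a byte[] buffer. No String,
    // Reader, or Writer is involved, so no charset can alter the data.
    static long copy(InputStream in, OutputStream out) {
        try {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            out.flush();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample data: the start of a JPEG header, including bytes that are
        // not valid UTF-8.
        byte[] image = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10};
        // Stand-in for the servlet response's output stream (an assumption).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(image), sink);
        System.out.println(copied + " bytes copied, unchanged: "
                + Arrays.equals(image, sink.toByteArray()));
    }
}
```

Because only byte arrays are passed around, the result should not depend on the platform default encoding of the machine running it.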

ראובן
Ah, so it looks as though the overloaded operator was indeed the culprit. Apparently, because your out variable does not have a type declared, the compiler decided to use the leftShift() version that takes an Object parameter and calls toString() on it, which for ByteArrayOutputStream uses the platform default encoding. So an alternative fix would be to declare the type of the variable.
Michael Borgwardt
Thanks again, Michael! Now it all makes sense. I think I'll be un-Groovy and declare types for things that I know the type of. It'll save me from problems.
ראובן