I have a very strange problem:

I use xsl to show an html picture where the source is defined in the xml file like this:

 <pic src="..\_images\gallery\smallPictures\2009-03-11 אפריקה ושחור לבן\020.jpg" width="150" height="120" /> 

[the funny chars are Hebrew- ;) ]

Now comes the strange part:

  1. When testing the file locally, it works in Firefox and Safari but NOT in IE and Opera. (file://c:/file.xml)

  2. Next I send the file to the host through FTP (nothing more).

  3. Then it suddenly works in all browsers when calling the page from the host: (http://www.host/file.xml)

The question is how can the server send the xml file to my browser in a way that my browser can read, while the same browser cannot read the same file stored locally ?!

I always thought that both the HTML (XML) and the pictures are sent to the client, which is responsible for loading the page. So how come the same files work for my webhost provider and not for me?

And what makes it totally strange is that IE is not alone: Opera joins it in this strange behavior.

Any ideas?

Thanks a lot, Asaf

+2  A: 

When you open the file locally, there is no server to serve up HTTP headers. That's a big difference at least. Try examining the encoding the browser thinks the page is in, both when it's opened manually from disk and when it's served over HTTP.

If headers are set correctly by either your script, or the server, then that is likely why.

Svend
If the path and file name are both a-z chars, there is no problem. The Hebrew chars are, without any doubt, the problem. Yet I cannot understand why the browsers sometimes overcome this problem and sometimes don't...
Asaf
It has to do with the encoding the browser comes up with when it has to guess (which is the case when there's no server to tell it the encoding). Some will come up with ASCII, some UTF-8, and some will try some random 8-bit code page. The encoding will determine the meanings of the non-ASCII characters -- and if the browser picks the wrong encoding, the URLs will be gibberish.
cHao
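To make the guessing problem concrete, here is a minimal Python sketch (not from the thread). It assumes the file on disk is saved as UTF-8 and uses a shortened, hypothetical path; `decode_with_guess` is an illustrative helper name. The same bytes are decoded under three encodings a browser might plausibly guess:

```python
# Sketch only: assumes the XML file is saved as UTF-8.
# The path is a shortened, hypothetical stand-in for the one in the question.
path = "smallPictures/אפריקה/020.jpg"
raw = path.encode("utf-8")  # the bytes as they would sit on disk


def decode_with_guess(data, encoding):
    """Decode bytes the way a browser would after guessing `encoding`."""
    # errors="replace" substitutes U+FFFD for bytes the encoding cannot map.
    return data.decode(encoding, errors="replace")


for guess in ("utf-8", "cp1255", "latin-1"):
    print(guess, "->", decode_with_guess(raw, guess))
```

Only the UTF-8 guess reproduces the original path. The ASCII parts survive every guess, which fits the observation that plain a-z paths never break.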
@Svend: Or, if the server just *happens* to default to the same encoding the XML file uses. Could just be a happy accident that the server's right.
cHao
@cHao: Well yes, whether by intent or by accident, the HTTP server is definitely something that differs between the two situations. But really, without seeing more code, such as the XSLT stylesheet, it's hard to say much. It'll mostly be qualified guesswork.
Svend
+1  A: 

This is all to do with your directory layout. You seem to be describing this layout at home:

c:/
c:/ file.xml
c:/ _images /
c:/ _images / ... .jpg

When opening this file in the browser, it's incorrectly trying to go below the root of the drive (.. on c: is still c:). IE and Opera are detecting this and not displaying the picture. Firefox and Safari are ignoring it and simply looking in the _images folder. On the server the layout is presumably different, which is why the pictures display fine.

Try the structure:

c:/ testfiles /
c:/ testfiles / file.xml
c:/ testfiles / _images /
c:/ testfiles / _images / ... .jpg

And the xml fragment:

<pic src="_images/ ... .jpg
Rudu
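As a rough illustration of the ".. on c: is still c:" point, here is a small Python sketch using the standard `ntpath` module, which applies Windows path rules on any platform. It shows filesystem-style normalization only, not what any particular browser does; the file name `020.jpg` and the helper `resolve_windows_relative` are illustrative assumptions.

```python
import ntpath  # Windows path rules, usable on any OS


def resolve_windows_relative(base_dir, relative):
    """Join a relative path onto a base directory and normalize '..' parts."""
    return ntpath.normpath(ntpath.join(base_dir, relative))


# '..' at the drive root has nowhere to go, so it is simply dropped.
print(resolve_windows_relative("c:\\", "..\\_images\\020.jpg"))

# With the suggested layout, no '..' is needed at all.
print(resolve_windows_relative("c:\\testfiles", "_images\\020.jpg"))
```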
:) .... I made the same file structure on server and local machine, I thought that the full path is not relevant ... Sorry. but that part is OK. I appreciate your help and sorry for the inaccurate example
Asaf
Ah well, in that case look at the answer by @Svend. Locally you're opening the file directly, so the browser does all the interpretation. On the server, a web-server component is likely transforming your XML with XSL and rendering it to you as HTML.
Rudu
+1  A: 

This is most likely an encoding problem. Try to specify the encoding explicitly in the generated HTML page by including the following META element in the head of the page (assuming that your XSLT is set to generate UTF-8):

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    ...
    </head>
...

This tells the browser to use UTF-8 encoding when rendering the page (You can actually see the encoding used in Internet Explorer's Page -> Encoding menu).

The reason why this works when the page is served by your web server is that the web server already tells the browser, in one of the HTTP headers, what encoding the response has.

To get a basic understanding of what encoding means, I recommend reading the following article:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets

0xA3
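For reference, the header in question is `Content-Type`, and its `charset` parameter is what a file opened from disk never gets. A minimal Python sketch of reading that parameter with the standard `email.message` module follows; the header values are examples, not taken from the asker's server, and `charset_from_content_type` is an illustrative helper name.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


# A server that declares the encoding: the browser does not have to guess.
print(charset_from_content_type("text/html; charset=UTF-8"))

# No charset parameter (or no header at all, as with file://): the browser guesses.
print(charset_from_content_type("text/xml"))
```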
After loading the XML file, the encoding options are disabled in IE. Adding your line to the XSL didn't make a difference.
Asaf
+1  A: 
..\_images\gallery\smallPictures\2009-03-11 אפריקה ושחור לבן\020.jpg

That's a Windows filepath and nothing like a valid URI. You need to:

  • replace the \ backslashes with /;
  • presumably, remove the .., if you're expecting the file to be in the root directory;
  • replace the spaces (and any other URL-unfriendly punctuation) with URL-encoded versions;
  • for compatibility with browsers that don't properly support IRI (and to avoid page encoding problems) non-ASCII characters like the Hebrew have to be UTF-8-and-URL-encoded.

You should end up with:

<img src="_images/gallery/smallPictures/2009-03-11%20%D7%90%D7%A4%D7%A8%D7%99%D7%A7%D7%94%20%D7%95%D7%A9%D7%97%D7%95%D7%A8%20%D7%9C%D7%91%D7%9F/020.jpg"/>

There's no practical way you can convert filepath to URI in XSLT alone. You will need some scripting language on the server, for example in Python you'd use nturl2path.pathname2url().

It's generally better to keep the file reference in URL form in the XML source.

bobince
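The conversion steps listed in this answer can be sketched in a few lines of Python. This is an illustration, not the answer's own code: it uses `urllib.parse.quote` rather than the `nturl2path.pathname2url()` call named above, because `quote` behaves the same on every platform. The helper name `windows_path_to_url_path` and its `drop_leading_parent` flag are hypothetical.

```python
from urllib.parse import quote


def windows_path_to_url_path(path, drop_leading_parent=False):
    """Turn a relative Windows file path into a percent-encoded URL path."""
    # Step 1: backslashes become forward slashes.
    url_path = path.replace("\\", "/")
    # Step 2 (optional): drop a leading '../' if the file sits at the root.
    if drop_leading_parent and url_path.startswith("../"):
        url_path = url_path[len("../"):]
    # Steps 3 and 4: quote() percent-encodes spaces and encodes non-ASCII
    # characters as UTF-8 first; '/' is kept as the separator.
    return quote(url_path, safe="/")


src = "..\\_images\\gallery\\smallPictures\\2009-03-11 אפריקה ושחור לבן\\020.jpg"
print(windows_path_to_url_path(src, drop_leading_parent=True))
```

Running this kind of conversion once, when the XML is generated, fits the advice to keep the reference in URL form in the XML source.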
Valid points, but hardly the cause of the problem, because a Windows file path should work locally on a Windows box.
0xA3
@bobince, browsers on Windows do fine with a Windows-style file path. But you have a good point about URL-encoding. @Asaf there are XSLT templates and functions around that will URL-encode your URL for you, so you can stay within XSLT.
LarsH
A: 

@Asaf, I believe @Svend is right. HTTP headers will specify content type, content encoding, and other things. Encoding is likely the reason for the weird behavior. In the absence of header information specifying encoding, different browsers will guess the encoding using different methods.

Try right-clicking on the page in the browser and "Show page info". Content encoding should be different when you serve it from a server, than when it's coming straight from your hard drive, depending on your browser.

LarsH
There is some progress in understanding why it's not working in IE, but it's really curious why it does not work in Opera: file:///D:/smallPictures/2009-08-25%20%D7%92%D7%9C%D7%A8%D7%99%D7%94%20%D7%90%D7%95%D7%92%D7%95%D7%A1%D7%98%202009/%D7%92%D7%9C%D7%A8%D7%99%D7%94%20%D7%90%D7%95%D7%92%D7%95%D7%A1%D7%98%202009%20008.jpg ... The Hebrew chars are replaced with their UTF-8 codes. IE cannot read this path, Opera can.
Asaf
@Asaf: I wonder if the new IE 9 beta can read it. But in any case I don't have the knowledge or inclination to dig much deeper. Mazel tov!
LarsH
IE is not for me... talking about beta is blasphemy ;) and... Toda.
Asaf