Hello Everybody,

I've run into an unbelievably strange problem:

The libxml parser I'm using distinguishes between " and ".

Is there a difference? The following attribute causes the error:

name="New Headway_the third edition"

If I replace the " character with the " I type on my keyboard, everything works fine... I have already checked that it is not just two ' characters next to each other... The parser returns the following error:

:72: parser error : invalid character in attribute value
<TopCont id="1197" name="New Headway_the thir...
                         ^

The really strange thing is that the attribute comes from a web service which works well except for this TopCont... The characters look completely normal!

Thanks for your Help, Markus

+2  A: 

The first one is a " - ascii code 34 - this is the valid double quote to use in an XML file.

The other one is some sort of fancy opening double quote (or closing double quote, I can't quite tell). The fact that it looks very much like the character with code 34 is irrelevant from the XML parser's point of view.

The parser will only accept " (34), or the apostrophe ' (39), as an attribute delimiter. You can't use any other character and expect it to work - it's like picking any other character at random and expecting it to make sense:

<TopCont id="1197" name=¢New Headway_the thir...

The only reason you're confused is because " and " look the same to a human; the parser only cares about its character code :)
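
To see the difference the parser sees, you can print the code point of each character. This is only a small Ruby sketch; it assumes the fancy quote is U+201C (left double quotation mark), which the post does not confirm, and `codepoint_label` is a made-up helper name.

```ruby
# Hypothetical helper: returns the Unicode code point of a single
# character in the usual U+XXXX notation.
def codepoint_label(ch)
  format("U+%04X", ch.ord)
end

straight = "\""      # ASCII double quote, code 34
curly    = "\u201C"  # ASSUMPTION: the fancy quote is U+201C

puts codepoint_label(straight)  # prints U+0022 (decimal 34)
puts codepoint_label(curly)     # prints U+201C (decimal 8220)
```

Two characters that look alike on screen turn out to have unrelated codes, which is all the parser looks at.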


NB It's very odd that a web service would return both types of quote - this tells me that someone might have cut and pasted the text from somewhere else, e.g. Word? What is the web service?

deanWombourne
Hi deanWombourne, great response! Thanks a lot! Is there a way to remove such unknown characters? I'm looking for a method either in Rails or in ActionScript (Flex). Any ideas?
Markus
My first suggestion would be to email the provider of the web service and tell them that they're doing it wrong! Failing that, you could write an error handler that, if the XML parse fails, looks at the character that caused the failure and replaces it with the correct one (using a lookup table, e.g. replace an opening double quote with the ASCII double quote, etc.).
deanWombourne
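
The lookup-table idea from the last comment could look roughly like this in Ruby (since Rails was mentioned). It is only a sketch under a few assumptions: the response arrives as a valid UTF-8 string, the offending characters are among the four typographic quotes listed, and the names `QUOTE_MAP`, `QUOTE_PATTERN` and `normalize_quotes` are invented for illustration. It cleans the whole string up front rather than reacting to a parse error, which is simpler but not quite what the comment describes.

```ruby
# Hypothetical lookup table: typographic quotes mapped to their
# ASCII equivalents. Extend it if the service sends other characters.
QUOTE_MAP = {
  "\u201C" => '"',  # left double quotation mark
  "\u201D" => '"',  # right double quotation mark
  "\u2018" => "'",  # left single quotation mark
  "\u2019" => "'"   # right single quotation mark
}.freeze

# One regular expression that matches any key of the table.
QUOTE_PATTERN = Regexp.union(QUOTE_MAP.keys)

# Returns a copy of the XML text with every typographic quote replaced.
# ASSUMPTION: the input is a valid UTF-8 string.
def normalize_quotes(xml)
  xml.gsub(QUOTE_PATTERN, QUOTE_MAP)
end

raw = "<TopCont id=\"1197\" name=\u201CNew Headway_the third edition\"/>"
puts normalize_quotes(raw)
```

Run the cleaned string through the parser instead of the raw response. Note that this also rewrites typographic quotes inside ordinary text content, where they are perfectly legal, so it is a blunt workaround; getting the web service fixed is still the better option.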