ansaurus

Question

Answer 1

+1 A:

I would suggest re-writing it in that case. If it is not easy to understand, then it is more difficult to test.

Paul Lydon 2009-07-31 06:57:26

+1 Agreed. Any single line code that takes minutes to understand is a problem for maintenance and should therefore be rewritten. (And in this case, the code clearly IS "broke"!)

Stephen C 2009-07-31 07:33:53

How should I rewrite it? I was thinking about using a proper XML parser instead of this hack job, but, as mentioned, my colleague (who has more experience and seniority) told me not to fix it until proven broken (i.e., a proper XML parser is needed).

gineer 2009-07-31 08:20:04

@gineer: That's a hard question. A proper XML parser is likely to give you a more robust solution, but if you are GUARANTEED that the XML will be consistently formatted, well-formed, and conforming to the schema, ad-hoc parsing might be good enough. But even an extra space in an unexpected place can be enough throw off ad-hoc parsing, and there are theoretical variations that you probably wouldn't even consider.

Stephen C 2009-07-31 09:19:48

@gineer: After further thought, I'd say that you should use a proper XML parser UNLESS you can guarantee that you will never get malformed or differently formatted XML. If this is a web service, don't forget that someone might send you bad / weird XML just to see if they can "break" your service.

Stephen C 2009-08-02 06:49:08

Answer 2

+6 A:

What happens if you receive <fieldname/> ?

Igor Brejc 2009-07-31 06:57:47

This is a valid test case. I'm going to discuss this with my colleague. However, from your experience, is it likely for web service to generate such tags? I've used only one XML library so far, and it always generates <fieldname></fieldname> for empty tags.

gineer 2009-07-31 08:32:43

Since I don't know what kind of web service it is, I cannot tell you anything other than "it is likely, since it's valid XML" :). But in general, it is wise to code defensively: since the web service is an outside entity for you, you never know what kind of stuff it could return. Fixing a few lines of code now is much less expensive than later fixing possible bugs in the production. So I would suggest writing few unit tests around this code before you change it, and then refactor the code to be less error-prone. AND keep those unit tests running for the future.

Igor Brejc 2009-07-31 08:50:50

@gineer: "However, from your experience, is it likely for web service to generate such tags?". Try fiddling with the config options and you can probably get it to generate different-looking XML. Besides, how do you know that the XML was generated by an XML library? If your code is parsing XML by hand, the other end could be generating it by hand!

Stephen C 2009-07-31 09:25:54

Answer 3

+7 A:

Yes. It will throw an Exception if "<fieldname>" is not present in the response. Specifically it will try to call webResponse.substring(fieldName.length() + 1, -1);

"<fieldname/>" will cause similar problems, as will any attributes on the element.

And if you get "<fieldname> ... <fieldname> ... </fieldName> ... </fieldName>", you'll get the wrong answer.

EDIT: in the light of followup discussions, I'd say that this code should be rewritten to use a proper XML parser ... unless you / your team can guarantee that the code will never have to deal with problematic XML. XML simply allows too many valid (and invalid) variations to deal with by ad-hoc string manipulation.

Stephen C 2009-07-31 07:00:53

Of course, you might argue that these inputs are "impossible" ... and that's probably what the contractor thought. But "impossible" things may start happening when you change the system.

Stephen C 2009-07-31 07:47:00

Answer 4

+1 A:

Nested tags. You'll start at the first open tag, skip any others that exist and stop on the first close tag instead of the matching close tag.
There is a close tag somewhere before the open tag (you're searching for the close tag from the beginning of the string, not the end of the open tag)

Spyder 2009-07-31 07:02:59

+1 for the out-of order example -- that's nasty. :)

David Moles 2009-07-31 14:06:08

Answer 5

+2 A:

Instead of manually parsing XML it's better to use a real XML parser. There are all kinds of corner cases that are hard to cover with plain string manipulation. It will be more readable with a real parser also. It's best to consider XML data to be binary data, especially when taking all the possible character encodings into consideration.

Esko Luontola 2009-07-31 08:33:37

Answer 6

+2 A:

In addition to Igor Brejc and Stephen C's responses above, there's CDATA:

<fieldname><![CDATA[ I am not really </fieldname> ]]></fieldname>

or even

<othertag>
  <![CDATA[ I am not really <fieldname> and there is no closing tag ]]>
</othertag>

David Moles 2009-07-31 14:05:31

See my comment "... and there are theoretical variations that you probably wouldn't even consider". It applies to me too :-)

Stephen C 2009-08-02 06:38:49

ansaurus

tags:

views:

answers:

Test case to fail this code?

related questions