Hi everyone. I have an XML file with the structure shown below:

<x>
   <y/>
   <y/>
   .
   .
</x>

The number of <y> tags is arbitrary.

I want to get the text of the <y> tags, and I decided to use XPath for this. I have figured out the syntax, say for the first y (assume root is x):

textFirst = root.xpath('y[1]/text()')

This works as expected.

However, my problem is that I won't know the number of <y> tags beforehand, so to work around that, I did this:

>>> count = 0
>>> for number in root.getiterator('y'):
...     count += 1

So now I know how many y elements there are in x (the value of count). (Is there a better way to get the number of tags? If yes, please suggest it.)

However, if I do this:

>>> def try_it(x):
...     return root.xpath('y[x]/text()')
... 
>>> try_it(1)
[]

It returns an empty list.

So my question is: without knowing the number of tags in advance, how do I build an XPath expression for this using lxml?

Sorry if something is not clear, I tried my best to explain the problem.

+1  A: 

What about 'y[%i]/text()' % x?

Now do you see where you made the mistake? :)

(Note that you can capture all y elements together with the XPath 'y' or '//y'.)
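
A minimal sketch of both ideas, assuming lxml is installed. The sample document and the helper name text_of_nth_y are made up for illustration; they are not from the question.

```python
from lxml import etree

# Hypothetical sample document; the element text is invented for illustration.
root = etree.fromstring("<x><y>one</y><y>two</y><y>three</y></x>")

def text_of_nth_y(n):
    # Interpolate the integer into the expression (XPath positions start at 1).
    return root.xpath('y[%d]/text()' % n)

# Text of the first y only.
first = text_of_nth_y(1)

# Text of every y child in a single query, no counting needed.
all_texts = root.xpath('y/text()')
```

If all you need is the text of every y, the second query avoids counting and looping altogether.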

mykhal
Ohhhhhhh! It works! Thank you so much. Very stupid of me. Is my method of getting the number of `y` tags OK or is there a shorter version?
sukhbir
PulpFiction: it happens :) I have updated the answer with a hint on a simpler way to do it.
mykhal
mykhal: Thanks for the help, you saved me lot of toil. Have a great day! :)
sukhbir
+1  A: 

To count the number of y nodes, you can use the XPath expression 'count(/x/y)'.
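
As a rough sketch (assuming lxml and an invented sample document), the count can be taken like this. Note that the result comes back as a float, so it is converted with int() here. Taking len() of the selected node list is shown as an alternative.

```python
from lxml import etree

# Made-up sample document for illustration.
root = etree.fromstring("<x><y>a</y><y>b</y><y>c</y></x>")

# count() yields an XPath number, which lxml hands back as a Python float.
raw_count = root.xpath('count(/x/y)')

# Convert it before using it as a Python integer.
y_count = int(raw_count)

# Alternative without count(): select the nodes and take the list length.
y_count_via_len = len(root.xpath('y'))
```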

Also, I think the problem with your expression in the try_it function is that you appear to be using the literal value x instead of concatenating the input parameter into the XPath expression.

Maybe something like this would work:

>>> def try_it(x):
...     return root.xpath('y[' + str(x) + ']/text()')
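
Another option that avoids string building is to pass the index as an XPath variable, which lxml accepts as a keyword argument to xpath(). This is only a sketch: the sample document and the name try_it_with_variable are hypothetical.

```python
from lxml import etree

# Made-up sample document for illustration.
root = etree.fromstring("<x><y>a</y><y>b</y></x>")

def try_it_with_variable(n):
    # $n is bound from the keyword argument, so the expression stays constant.
    return root.xpath('y[position() = $n]/text()', n=n)

second = try_it_with_variable(2)
```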

Hope this helps!

mlschechter
count() was just what I needed. Thank you for the reply.
sukhbir
Why does count() return float?
sukhbir
@PulpFiction - `lxml` returns a float for any XPath expression that returns a numeric result (in Java, the corresponding result is a `Double`). You should be able to convert it with `int()`.
mlschechter
Thanks, I can do that; I was just curious as to why it returns a float.
sukhbir