ansaurus

Question

Is there a way to specify a fixed (or variable) number of elements for lxml in Python?

Answer 1

A:

Use something like simplehtmldom, and then provide an index?

Amber 2010-03-02 21:43:47

Answer 2

+1 A:

Does this do the trick?

from itertools import islice
ancestor = islice(theitem.iterancestors(), 4) # To get the fourth ancestor

EDIT: I'm an idiot, that doesn't do the trick: it gives you an iterator over the first four ancestors, not the fourth one. You'll need to wrap it up in a helper function like so:

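def nthparent(element, n):
    # iterancestors() yields the parent first, so the nth ancestor is at index n - 1.
    parent = islice(element.iterancestors(), n - 1, n)
    # islice() returns an iterator, so take its single item with next(), not parent[0].
    return next(parent, None)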

ancestor = nthparent(theitem, 4) # to get the 4th parent

Will McCutchen 2010-03-02 21:48:58

I am playing with ancestor right now, trying to figure out how to manipulate the objects in it. I see that I get four ancestors. Thanks.

PyNEwbie 2010-03-02 22:36:34

@PyNEwbie: see my edited answer. The code I gave you initially didn't do what you needed it to do.

Will McCutchen 2010-03-02 22:46:43

Thanks, I understand more now, and this is helpful.

PyNEwbie 2010-03-06 19:30:59

`islice` returns an iterator, so you should write `next(islice(..), None)` instead of `parent[0] ..`

J.F. Sebastian 2010-03-06 20:13:39
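
To illustrate the comment above, here is a minimal sketch that needs only the standard library. The `fake_ancestors` function is a hypothetical stand-in for `element.iterancestors()` (nearest ancestor first), and `nth` is an illustrative helper name, not part of lxml or itertools.

```python
from itertools import islice

def fake_ancestors():
    # Hypothetical stand-in for element.iterancestors(): nearest ancestor first.
    return iter(["parent", "grandparent", "great-grandparent", "root"])

def nth(iterator, n, default=None):
    """Return the nth item (1-based) of an iterator, or default if it is too short."""
    return next(islice(iterator, n - 1, n), default)

# An islice object is an iterator, so indexing it raises TypeError.
try:
    islice(fake_ancestors(), 1, 2)[0]
except TypeError:
    print("islice objects cannot be indexed")

print(nth(fake_ancestors(), 2))  # grandparent
print(nth(fake_ancestors(), 9))  # None, because there are only four items
```

Passing a default to `next()` avoids a `StopIteration` when the element has fewer than n ancestors.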

Answer 3

+3 A:

lxml supports XPath:

from lxml import etree
root = etree.fromstring("...your xml...")

el, = root.xpath("//div[text() = 'the string']/preceding-sibling::*[9]")

J.F. Sebastian 2010-03-02 21:53:23

But I am a beginner; how does this help me? I am also using HTML. I started with mytree = fromstring(thedocument) and then list_of_elements = mytree.cssselect('div')

PyNEwbie 2010-03-02 22:34:57

@PyNEwbie: The above XPath expression is just an example; in your case it should be something like `elements[-1].xpath("preceding-sibling::div[9]")`.

J.F. Sebastian 2010-03-02 22:59:55
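
A rough sketch of how that suggestion might look on an HTML fragment. The markup, the `item` class, and the variable names are made up for illustration; only `lxml.html.fromstring` and `xpath()` come from lxml itself.

```python
from lxml import html

# Hypothetical document: twelve sibling <div> elements inside a wrapper.
items = "".join('<div class="item">item %d</div>' % i for i in range(1, 13))
root = html.fromstring('<div id="wrap">%s</div>' % items)

elements = root.xpath("//div[@class='item']")
last = elements[-1]  # the div containing "item 12"

# preceding-sibling is a reverse axis: [1] is the nearest sibling before `last`,
# so [9] is the ninth sibling counting backwards from it.
matches = last.xpath("preceding-sibling::div[9]")

# xpath() returns a list, which is empty when there is no such sibling.
ninth_before = matches[0] if matches else None

print(ninth_before.text)  # item 3
```

Checking for an empty list covers the case where fewer than nine siblings precede the starting element.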

I've added a combined XPath expression.

J.F. Sebastian 2010-03-02 23:23:34
