ansaurus

Question

Grabbing text between two elements in Nokogiri?

Answer 1

A:

I don't have nokogiri, but here's an alternative using just basic string manipulation.

html=<<EOF
<body>
 <div>some text</div>
 I NEED THIS TEXT ONLY
 <div>some text</div>
 more text here
 <div>some text</div>
 one more text here
 <div>some text</div>
</body>
EOF
p html.split(/<\/*body>/)[1].split(/<\/div>/)[1].split(/<div>/)[0]
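The chained split above returns the text with its surrounding newline and spaces still attached. As a minimal sketch of the same string-based idea, assuming the markup is as regular as in the sample, a single non-greedy regex plus `strip` gives the bare text. The helper name `text_between_first_divs` is made up for illustration, and this remains string matching rather than parsing:

```ruby
# Hypothetical helper: returns the text between the first </div> and the
# next <div>, or nil when no such pair exists. This is plain string
# matching, so it only suits markup as regular as the sample below.
def text_between_first_divs(html)
  match = html.match(%r{</div>(.*?)<div>}m) # /m lets . match newlines
  match && match[1].strip
end

html = <<~EOF
  <body>
   <div>some text</div>
   I NEED THIS TEXT ONLY
   <div>some text</div>
   more text here
   <div>some text</div>
  </body>
EOF

puts text_between_first_divs(html)
```

Nested divs, attributes on the tags, or comments containing `<div>` would all defeat this pattern, which is the concern raised in the comments below.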

ghostdog74 2010-10-12 04:33:32

**Really?** String manipulation instead of parsing?

Alejandro 2010-10-12 14:39:46

O.M.G., if this were a Perl question, lightning would have struck. For anything but the most trivial task, string manipulation and/or its cousin, regex, will fail badly. For fun, search for "perl, regex and parsing html".

Greg 2010-10-12 22:23:32

OP's requirement is trivial. For me, there's no reason to use nokogiri or some other parsing tool.

ghostdog74 2010-10-12 23:28:34

Answer 2

A:

This returns the first text node within body that sits between two div elements:

/body/text()[
     ./preceding::element()[1][local-name()="div"] and 
     ./following::element()[1][local-name()="div"]
][1]

It should return:

I NEED THIS TEXT ONLY

Dennis Knochenwefel 2010-10-12 07:37:16

No, I need to get exactly the text between two divs.

amirka 2010-10-12 08:29:32

I corrected the post accordingly. Does it work now?

Dennis Knochenwefel 2010-10-12 13:33:38

I'll check it on the weekend, thanks for the example.

amirka 2010-10-14 22:45:05

Answer 3

A:

This XPath 1.0:

/body/text()[preceding-sibling::*[1][self::div]]
            [following-sibling::*[1][self::div]][1]

Also:

/body/text()[normalize-space()][1]

Alejandro 2010-10-12 14:34:40

Answer 4

A:

Use:

/*/div[1]/following-sibling::text()[1]

This selects the first text-node sibling of the first div child of the top element of the document.

Dimitre Novatchev 2010-10-12 14:36:30
