<body>
 <div>some text</div>
 I NEED THIS TEXT ONLY
 <div>some text</div>
 more text here
 <div>some text</div>
 one more text here
 <div>some text</div>
</body>

How do I get just the text between the first two div elements?

A: 

I don't have Nokogiri, but here's an alternative using just basic string manipulation.

html=<<EOF
<body>
 <div>some text</div>
 I NEED THIS TEXT ONLY
 <div>some text</div>
 more text here
 <div>some text</div>
 one more text here
 <div>some text</div>
</body>
EOF
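# Take the body contents, then whatever lies after the first </div> and before the next <div>.
p html.split(/<\/?body>/)[1].split(/<\/div>/)[1].split(/<div>/)[0].strip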
ghostdog74
**Really?** String manipulation instead of parsing?
Alejandro
O.M.G., if this were a Perl question, lightning would have struck. For anything but the most trivial task, string manipulation and/or its cousin, regex, will fail badly. For fun, search for Perl, regex, and parsing HTML.
Greg
OP's requirement is trivial. For me, there's no reason to use nokogiri or some other parsing tool.
ghostdog74
A: 

This XPath 2.0 expression (the element() test is not available in XPath 1.0) returns the first text node within body that sits between two div elements:

/body/text()[
     ./preceding::element()[1][local-name()="div"] and 
     ./following::element()[1][local-name()="div"]
][1]

It should return:

I NEED THIS TEXT ONLY
Dennis Knochenwefel
No, I need to get exactly the text between two divs.
amirka
I corrected the post accordingly. Does it work now?
Dennis Knochenwefel
I'll check it over the weekend, thanks for the example.
amirka
A: 

This XPath 1.0 expression:

/body/text()[preceding-sibling::*[1][self::div]]
            [following-sibling::*[1][self::div]][1]

Also, to take the first text-node child of body that is not whitespace-only:

/body/text()[normalize-space()][1]
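
For anyone without an XPath engine at hand, the first expression can be approximated by walking the DOM. This is a minimal sketch, assuming Ruby's REXML library in place of Nokogiri; `div?` and `texts_between_divs` are hypothetical helper names, not part of either library. It only looks at the immediately adjacent sibling nodes, so unlike the XPath it will not look past a comment sitting between the text and a div.

```ruby
require 'rexml/document'

# Sample markup from the question.
html = <<EOF
<body>
 <div>some text</div>
 I NEED THIS TEXT ONLY
 <div>some text</div>
 more text here
 <div>some text</div>
 one more text here
 <div>some text</div>
</body>
EOF

# Hypothetical helper: true when the node is an element named "div".
def div?(node)
  node.is_a?(REXML::Element) && node.name == 'div'
end

# Hypothetical helper: the text children of the root element whose
# previous and next sibling nodes are both divs, stripped of whitespace.
def texts_between_divs(xml)
  root = REXML::Document.new(xml).root
  root.children
      .select { |n| n.is_a?(REXML::Text) }
      .select { |t| div?(t.previous_sibling_node) && div?(t.next_sibling_node) }
      .map { |t| t.value.strip }
end

puts texts_between_divs(html).first
```

For the question's markup this should print the wanted line; the trailing `[1]` in the XPath corresponds to `.first` here.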
Alejandro
A: 

Use:

/*/div[1]/following-sibling::text()[1]

This selects the first text-node sibling of the first div child of the top element of the document.
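
The same selection can be sketched as a plain DOM walk. This assumes Ruby's REXML library rather than Nokogiri, and `text_after_first_div` is a hypothetical helper name, so treat it as an illustration of what the expression selects rather than a replacement for it.

```ruby
require 'rexml/document'

# Hypothetical helper: return the stripped text of the first text node that
# follows the first <div> child of the top element, or nil if there is none.
def text_after_first_div(xml)
  root = REXML::Document.new(xml).root
  first_div = root.elements['div'] # first child element named div
  return nil unless first_div

  node = first_div.next_sibling_node
  # Skip non-text siblings (elements, comments) until a text node turns up.
  node = node.next_sibling_node until node.nil? || node.is_a?(REXML::Text)
  node && node.value.strip
end

# Compact stand-in for the question's markup.
sample = "<body><div>some text</div> I NEED THIS TEXT ONLY <div>some text</div> more text here </body>"
puts text_after_first_div(sample)
```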

Dimitre Novatchev