Hi,

For a Jericho Element, I am trying to find out how to loop over all of its direct child nodes, whether they are elements or plain text.

There is Element.getNodeIterator(), but it iterates over ALL descendants within the Element, not just the direct children.

What I need is the equivalent of an Element.getChildSegments() method. Any ideas?

Thanks

A: 
Gunslinger47
btw, you should "accept" an answer to one of your questions if it helped you. Currently, you're showing a 0% accept rate.
Gunslinger47
Wow, this looks great, thanks for the help. I'm on my way out now, so I'll try it in the morning. Also, thanks for the tip on accepting answers; I'll sort that out too. Cheers, R.
Richard
I've amended your suggested solution to include first-level non-text nodes as well. Thanks for the help.
Richard
Thanks for accepting your answers. :)
Gunslinger47
A: 

Using Gunslinger47's approach from the answer above, the following returns the immediate child segments of the Element elem, that is, the start tags of its direct child elements plus any non-whitespace text not nested inside a child:

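import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

import net.htmlparser.jericho.CharacterReference;
import net.htmlparser.jericho.Element;
import net.htmlparser.jericho.Segment;
import net.htmlparser.jericho.Tag;

public static List<Segment> getChildSegments(Element elem) {
    // Document-order iterator over every node (tags, text, character references) in the content.
    final Iterator<Segment> it = elem.getContent().getNodeIterator();
    final List<Segment> results = new LinkedList<Segment>();
    final List<Element> children = elem.getChildElements();

    while (it.hasNext()) {
        Segment cur = it.next();
        if (!(cur instanceof Tag) && !(cur instanceof CharacterReference) && !cur.isWhiteSpace()) {
            // Non-whitespace plain text: keep it only if no child element encloses it.
            boolean enclosed = false;
            for (Element child : children) {
                if (child.encloses(cur)) {
                    enclosed = true;
                    break; // one enclosing child is enough
                }
            }
            if (!enclosed) results.add(cur);
        } else {
            // Tags: keep only the start tags of direct child elements.
            // Character references and whitespace-only text match no start tag, so they are skipped.
            for (Element child : children) {
                if (child.getStartTag().equals(cur)) {
                    results.add(cur);
                    break;
                }
            }
        }
    }
    return results;
}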
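
To see the enclosure test on its own, here is a minimal, library-free sketch of the same idea. It does not use Jericho at all: the Span class, the directNodes method, and the character offsets are hypothetical stand-ins I made up for illustration, with Span playing the role of a Segment. It only covers the plain-text branch of the method above.

```java
import java.util.ArrayList;
import java.util.List;

/** Library-free sketch; Span is a hypothetical stand-in for a Jericho Segment. */
final class ChildSpanSketch {

    /** A labelled [begin, end) character range. */
    static final class Span {
        final String label;
        final int begin;
        final int end;

        Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }

        /** True if other lies entirely within this span. */
        boolean encloses(Span other) {
            return begin <= other.begin && other.end <= end;
        }
    }

    /** Keeps only the nodes that no child span encloses. */
    static List<Span> directNodes(List<Span> nodes, List<Span> children) {
        List<Span> result = new ArrayList<>();
        for (Span node : nodes) {
            boolean enclosed = false;
            for (Span child : children) {
                if (child.encloses(node)) {
                    enclosed = true;
                    break; // one enclosing child is enough
                }
            }
            if (!enclosed) {
                result.add(node);
            }
        }
        return result;
    }

    /** Joins the labels of the given spans with commas. */
    static String labels(List<Span> spans) {
        List<String> names = new ArrayList<>();
        for (Span s : spans) {
            names.add(s.label);
        }
        return String.join(",", names);
    }

    /** Models the text nodes of the parent content "a<b>c<i>d</i></b>e". */
    static String demo() {
        List<Span> textNodes = new ArrayList<>();
        textNodes.add(new Span("a", 0, 1));
        textNodes.add(new Span("c", 4, 5));   // inside <b>
        textNodes.add(new Span("d", 8, 9));   // inside <b><i>
        textNodes.add(new Span("e", 17, 18));

        List<Span> children = new ArrayList<>();
        children.add(new Span("b", 1, 17));   // the only direct child element

        return labels(directNodes(textNodes, children));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints a,e
    }
}
```

Only "a" and "e" survive, because "c" and "d" both fall inside the range of the child element b. Note that nothing nested deeper (like the i element) needs to be checked separately: anything inside a grandchild is necessarily inside a direct child too.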
Richard
I updated my original answer to show how I'd incorporate the above suggestions.
Gunslinger47