Can somebody please show me a simple example of parsing some HTML using libxml?

#import <libxml2/libxml/HTMLparser.h>

NSString *html = @"<ul><li><input type=\"image\" name=\"input1\" value=\"string1value\" /></li><li><input type=\"image\" name=\"input2\" value=\"string2value\" /></li></ul><span class=\"spantext\"><b>Hello World 1</b></span><span class=\"spantext\"><b>Hello World 2</b></span>";

1) Say I want to parse the value of the input whose name = input2.

Should output "string2value".

2) Say I want to parse the inner contents of each span tag whose class = spantext.

Should output: "Hello World 1" and "Hello World 2".

A: 

http://github.com/zootreeves/Objective-C-HMTL-Parser

I used the above HTML Parser to achieve what I wanted:

   NSError *error = nil;
   // Parse the HTML string; any failure is reported through `error`
   HTMLParser *parser = [[HTMLParser alloc] initWithString:@"<ul><li><input type=\"image\" name=\"input1\" value=\"string1value\" /></li><li><input type=\"image\" name=\"input2\" value=\"string2value\" /></li></ul><span class=\"spantext\"><b>Hello World 1</b></span><span class=\"spantext\"><b>Hello World 2</b></span>" error:&error];

   if (error) {
       NSLog(@"Error: %@", error);
       [parser release];
       return;
   }

   HTMLNode *bodyNode = [parser body];

   // 1) Find the input whose name is "input2" and print its value
   NSArray *inputNodes = [bodyNode findChildTags:@"input"];

   for (HTMLNode *inputNode in inputNodes) {
       if ([[inputNode getAttributeNamed:@"name"] isEqualToString:@"input2"]) {
           NSLog(@"%@", [inputNode getAttributeNamed:@"value"]); // Answer to first question
       }
   }

   // 2) Find every span whose class is "spantext" and print its contents
   NSArray *spanNodes = [bodyNode findChildTags:@"span"];

   for (HTMLNode *spanNode in spanNodes) {
       if ([[spanNode getAttributeNamed:@"class"] isEqualToString:@"spantext"]) {
           // rawContents gives the node's HTML; if your version of the library
           // has allContents, that should give just the text instead
           NSLog(@"%@", [spanNode rawContents]); // Answer to second question
       }
   }

   [parser release];
StuR
A: 

I faced the same task, and when I tried the solution from StuR, the line NSLog(@"%@", rawContentsOfNode(xmlNode * spanNode, htmlDocPtr doc)); //Answer to second question gave me the compiler error "Expected expression before 'xmlNode'". What could be causing it?

Artem Svystun