Assumption 1. You are only interested in the data in the p (paragraph) elements, and you are using NSXMLParser.
Assumption 2. You want to keep any element inside of p intact.
The strategy that you want to use is to create a state machine for your parser so that it knows when it needs to save data and when to ignore data as it is received.
Set up your NSXMLParser delegate using the sample code from Apple.
Your delegate will need a BOOL ivar, inParagraph, for tracking whether received data is retained or discarded. The initial value of inParagraph is NO.
When your delegate receives the parser:didStartElement:namespaceURI:qualifiedName:attributes: message, if ([elementName isEqual:@"p"]), clear your receivedData variable and set inParagraph = YES.
EDIT: receivedData is an NSMutableString. Fixed the code examples
At this point your parser delegate wants to save the data it receives.
When the parser delegate receives the parser:foundCharacters: message, append the string to receivedData as in the sample code, but only while inParagraph is YES.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // Keep text only while inside a <p> element; ignore everything else.
    if (inParagraph) [receivedData appendString:string];
}
When the parser encounters an inline element inside the paragraph, the delegate will receive the parser:didStartElement:namespaceURI:qualifiedName:attributes: message again. This is when the inParagraph state variable is important. The delegate does not receive the enclosing '<' and '>' characters of an element, so you will have to wrap the elementName in '<' and '>' and append it to receivedData. Something like
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    if (inParagraph)
    {
        // Rebuild the opening tag of the inline element (attributes are not restored here).
        NSString *inlineElementName = [NSString stringWithFormat:@"<%@>", elementName];
        [receivedData appendString:inlineElementName];
    }
    ....
}
When the parser delegate receives the parser:didEndElement:namespaceURI:qualifiedName: message, it checks whether it is in the "p" element. if (inParagraph && ![elementName isEqual:@"p"]), close the inline element by appending its closing tag to receivedData. if ([elementName isEqual:@"p"]), add the contents of receivedData to the NSMutableArray holding your paragraphs and set inParagraph back to NO.
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if (inParagraph)
    {
        if (![elementName isEqual:@"p"])
        {
            // Closing tag of an inline element inside the paragraph.
            NSString *inlineElementName = [NSString stringWithFormat:@"</%@>", elementName];
            [receivedData appendString:inlineElementName];
        } else { // received closing </p> tag, add receivedData to the paragraph array
            // copy takes an immutable snapshot, so clearing receivedData later does not alter it
            [paragraphsArray addObject:[receivedData copy]];
            inParagraph = NO;
        }
    }
}
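Putting the pieces together, here is a minimal sketch of the whole delegate, assuming ARC and a modern Foundation SDK. The class name ParagraphCollector and the helper ParagraphsFromXML are hypothetical names made up for this illustration, and the state is held in properties rather than bare ivars. The start-of-paragraph branch is my reading of the steps described above, not code from Apple's sample.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate class (name is an assumption, not from Apple's sample).
@interface ParagraphCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, assign) BOOL inParagraph;                 // state: inside <p>?
@property (nonatomic, strong) NSMutableString *receivedData;    // text of the current <p>
@property (nonatomic, strong) NSMutableArray *paragraphsArray;  // finished paragraphs
@end

@implementation ParagraphCollector

- (instancetype)init
{
    self = [super init];
    if (self) {
        _inParagraph = NO;
        _receivedData = [NSMutableString string];
        _paragraphsArray = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict
{
    if (self.inParagraph) {
        // Inline element inside <p>: rebuild its opening tag (attributes are dropped).
        [self.receivedData appendFormat:@"<%@>", elementName];
    } else if ([elementName isEqualToString:@"p"]) {
        // Entering a paragraph: clear the buffer and start saving.
        [self.receivedData setString:@""];
        self.inParagraph = YES;
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times per text node, so always append.
    if (self.inParagraph) [self.receivedData appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if (!self.inParagraph) return;
    if ([elementName isEqualToString:@"p"]) {
        // Leaving the paragraph: store an immutable snapshot and stop saving.
        [self.paragraphsArray addObject:[self.receivedData copy]];
        self.inParagraph = NO;
    } else {
        // Inline element inside <p>: rebuild its closing tag.
        [self.receivedData appendFormat:@"</%@>", elementName];
    }
}

@end

// Hypothetical convenience function: parse an XML string and return its paragraphs.
static NSArray *ParagraphsFromXML(NSString *xml)
{
    ParagraphCollector *collector = [[ParagraphCollector alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;  // the parser does not retain its delegate
    [parser parse];
    return [collector.paragraphsArray copy];
}
```

With this sketch, calling ParagraphsFromXML(@"<doc><h1>Title</h1><p>Hello <b>bold</b> world</p></doc>") should return an array holding the single string "Hello <b>bold</b> world": the h1 text is ignored and the inline b element is kept intact.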