I am trying to parse a string and get another string in the middle.

For example:

Hello world this is a string

I need to find the string between "world" and "is" (here, "this"). I have looked around but haven't been able to figure it out yet, mainly because I am new to Objective-C. Does anyone have an idea of how to do this, with or without regular expressions?

+1  A: 

See the ICU user guide on regular expressions.

If you know there'll just be one result:

// Backslashes are doubled because this is an Objective-C string literal
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];

NSTextCheckingResult *result = [regex firstMatchInString:string
    options:0 range:NSMakeRange(0, [string length])];

// Gets the string inside the first set of parentheses in the regex
NSString *inside = [string substringWithRange:[result rangeAtIndex:1]];

The \b makes sure there's a word boundary before world and after is (so "hello world this isn't a string" wouldn't match). The \s+ gobbles up any whitespace after world and before is. The .+? finds what you're looking for, with the ? making it non-greedy so that "hello world this is a string hello world this is a string" gives you "this" rather than "this is a string hello world this".

I'll leave it up to you to figure out how to handle multiple matches. The NSRegularExpression documentation should help you out.
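A minimal sketch of the multiple-match case, assuming the same pattern as above and using matchesInString:options:range: to collect every match at once. The helper name StringsBetweenWorldAndIs is hypothetical and only there for illustration:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text captured by the first set of
// parentheses for every match of the pattern, in order of appearance.
static NSArray *StringsBetweenWorldAndIs(NSString *string) {
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b"
                             options:0
                               error:NULL];

    // One NSTextCheckingResult per non-overlapping match
    NSArray *matches = [regex matchesInString:string
                                      options:0
                                        range:NSMakeRange(0, [string length])];

    NSMutableArray *found = [NSMutableArray array];
    for (NSTextCheckingResult *match in matches) {
        [found addObject:[string substringWithRange:[match rangeAtIndex:1]]];
    }
    return found;
}
```

If the input can be large, enumerateMatchesInString:options:range:usingBlock: visits the matches one at a time instead of building the whole array first.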

If you want to make sure the match doesn't cross sentence boundaries, you could do ([^.]+?) instead of (.+?), or you could use enumerateSubstringsInRange:options:usingBlock: on your string and pass NSStringEnumerationBySentences in the options.

All of this requires iOS 4.0 or later. If you want to support iPhone OS 3.0 and later, look into RegexKitLite.

Jacques
Thanks for the help! I will need to support OS3/iPad, so that isn't currently an option. Once OS 4 comes out for iPad I could switch over to this implementation using Regex.
Andrew M
+1  A: 
Devara Gudda
+1  A: 

The regular expression solution that Jacques gives works, and the caveat about requiring iOS 4.0 or later is accurate, which means it does not work on current iPads. Regular expressions are also comparatively slow, and overkill when the search expressions are known string constants.

You can solve the problem using methods on NSString, or with a class named NSScanner. Both have been available since iPhone OS 2.0, and long before that: since before Mac OS X 10.0, actually :).

So what you want is a new method on NSString like this?

@interface NSString (CWAddition)
-(NSString*)stringBetweenString:(NSString*)start andString:(NSString*)end;
@end

No problem. We assume it should return nil if no such strings could be found.

The implementation using only NSString is quite straightforward:

@implementation NSString (CWAddition)
-(NSString*)stringBetweenString:(NSString*)start andString:(NSString*)end {
    // Find the first occurrence of the start delimiter
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location != NSNotFound) {
        // Search for the end delimiter only in the text after the start delimiter
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location != NSNotFound) {
            // Shrink the range so it covers only the text between the delimiters
            targetRange.length = endRange.location - targetRange.location;
            return [self substringWithRange:targetRange];
        }
    }
    return nil;
}
@end

Or you could do the implementation using the NSScanner class:

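@implementation NSString (CWAddition)
-(NSString*)stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSScanner* scanner = [NSScanner scannerWithString:self];
    // Do not skip whitespace, and compare case-sensitively like rangeOfString: does
    [scanner setCharactersToBeSkipped:nil];
    [scanner setCaseSensitive:YES];
    // Move up to, and then past, the start delimiter
    [scanner scanUpToString:start intoString:NULL];
    if ([scanner scanString:start intoString:NULL]) {
        NSString* result = nil;
        // scanUpToString: also succeeds when it runs to the end of the string,
        // so make sure the scanner actually stopped at the end delimiter
        if ([scanner scanUpToString:end intoString:&result] && ![scanner isAtEnd]) {
            return result;
        }
    }
    return nil;
}
@end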
PeyloW
Note that this solution doesn't make sure that world and is are on word boundaries. You could add more code to deal with that, but it can be tricky so I would suggest just using regular expressions (either NSRegularExpression or RegexKitLite for iOS 3) to make it easier to deal with all of the corner cases. Do some profiling to see if you need to hand-code a solution.
Jacques
Thanks! I am also going to have to find strings that occur multiple times, e.g. <tag>Hello</tag><tag>Bye</tag>. I would assume that I could just add another parameter (occurrence) and then loop through the starting string X times, right? I should be able to figure it out, especially since you gave me the starting point of NSRange, etc. Thanks again!
Andrew M
Yes, should be no problem. Just use the ranges from the previous search to limit the search range of the next search.
PeyloW
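
The range-limiting approach from the last comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name AllStringsBetween is hypothetical, it collects every occurrence rather than taking an occurrence index, and it stops at the first start delimiter that has no matching end delimiter.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: collects every substring found between `start` and
// `end`, restarting each search just past the previous end delimiter.
static NSArray *AllStringsBetween(NSString *source, NSString *start, NSString *end) {
    NSMutableArray *results = [NSMutableArray array];
    NSRange searchRange = NSMakeRange(0, [source length]);

    while (searchRange.length > 0) {
        NSRange startRange = [source rangeOfString:start options:0 range:searchRange];
        if (startRange.location == NSNotFound) break;

        // Look for the end delimiter only after the start delimiter
        NSUInteger contentStart = NSMaxRange(startRange);
        NSRange rest = NSMakeRange(contentStart, [source length] - contentStart);
        NSRange endRange = [source rangeOfString:end options:0 range:rest];
        if (endRange.location == NSNotFound) break;

        [results addObject:[source substringWithRange:
            NSMakeRange(contentStart, endRange.location - contentStart)]];

        // The next search begins right after this end delimiter
        NSUInteger next = NSMaxRange(endRange);
        searchRange = NSMakeRange(next, [source length] - next);
    }
    return results;
}
```

Called with the string <tag>Hello</tag><tag>Bye</tag> and the delimiters <tag> and </tag>, this yields the two strings Hello and Bye. It relies only on rangeOfString:options:range:, so it does not need NSRegularExpression.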