Sorry, this is probably really easy. But if you have a delimiter character on each line and you want to find all of the text before the delimiter on each line, what regular expression would do that? I don't know if the delimiter matters but the delimiter I have is the % character.
Your text will be in group 1.
/^(.*?)%/
Note: This will capture everything up to the first percent sign on the line. If you want to limit what you capture, replace the . with a narrower character class or escape sequence of your choice (for example \w).
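As a rough illustration of the pattern above, here is a minimal Python sketch. The sample text and variable names are made up for the example, and it assumes you want one result per line, which is why the multiline flag is set.

```python
import re

# Hypothetical sample input: one '%' delimiter per line.
text = "alpha%one\nbeta%two\ngamma%three"

# re.MULTILINE makes ^ match at the start of every line,
# not only at the start of the whole string.
pattern = re.compile(r"^(.*?)%", re.MULTILINE)

# Group 1 holds the text before the first '%' on each line.
before = [m.group(1) for m in pattern.finditer(text)]
print(before)
```

Lines that contain no % produce no match at all with this pattern, because . does not cross a newline by default.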
You don't have to use a regex if you don't want to. Depending on the language you are using, there will be some sort of string function such as split().
$str = "sometext%some_other_text";
$s = explode("%",$str,2);
print $s[0];
This is PHP: it splits on % and then takes the first element of the returned array. Other languages have similar splitting methods.
In Python, you can use:
def GetStuffBeforeDelimiter(text, delim):
    # find() returns -1 when delim is missing, so that case needs its own check
    return text[:text.find(delim)]
In Java:
public String getStuffBeforeDelimiter(String str, String delim) {
    // indexOf() returns -1 when delim is missing, which makes substring() throw
    return str.substring(0, str.indexOf(delim));
}
In C++ (untested):
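#include <string>
using namespace std;
string GetStuffBeforeDelimiter(const string& str, const string& delim) {
    // find() returns string::npos when delim is missing, so substr() returns the whole string
    return str.substr(0, str.find(delim));
}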
In all the above examples you will want to handle corner cases, such as your string not containing the delimiter.
Basically, I would use a plain substring operation for something this simple, because the search stops at the first delimiter and you avoid scanning the entire string. Regex is overkill, and "exploding" or splitting on the delimiter is also unnecessary because it looks at the whole string.
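To make the missing-delimiter corner case concrete, here is one possible Python sketch. The function name before_delimiter and the choice to return the whole string when no delimiter is found are assumptions for illustration; you may prefer to raise an error or return an empty string instead.

```python
def before_delimiter(text, delim="%"):
    """Return the part of text before the first delim."""
    index = text.find(delim)  # -1 when delim does not occur
    if index == -1:
        # Assumption: a line without the delimiter is returned unchanged.
        return text
    return text[:index]


print(before_delimiter("sometext%some_other_text"))
print(before_delimiter("no delimiter here"))
```

Without the explicit check, the slice text[:-1] would silently drop the last character of a line that has no delimiter.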
You don't say what flavor of regex, so I'll use Perl notation.
/^[^%]*/m
The first ^ is a start anchor: normally it matches only the beginning of the whole string, but this regex is in multiline mode thanks to the 'm' modifier at the end, so it also matches at the beginning of each line. [^%] is an inverted character class: it matches any one character except a '%'. The * is a quantifier that means to match the previous thing ([^%] in this case) zero or more times.
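Here is a small Python sketch of the same pattern, with made-up sample text. One thing to watch for: [^%] also matches a newline, so a line with no % lets the match run on into the following line. The second pattern, which also excludes \n, is a suggested variant and not part of the answer above.

```python
import re

# Hypothetical sample input; the last line has no delimiter.
text = "alpha%one\nbeta%two\nno delimiter here"

# Same pattern as above; re.MULTILINE plays the role of Perl's /m.
matches = re.findall(r"^[^%]*", text, re.MULTILINE)
print(matches)

# A line without '%' in the middle of the input shows the caveat.
tricky = "no delimiter\nbeta%two"
spills_over = re.findall(r"^[^%]*", tricky, re.MULTILINE)
# Variant (an assumption, not from the answer): exclude newlines too.
per_line = re.findall(r"^[^%\n]*", tricky, re.MULTILINE)
print(spills_over)
print(per_line)
```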