questions about lookahead | ansaurus

lookahead

RegEx matching with no single letter delimiter

Medicare Eligibility EDI Example Responses is what I'm trying to match. I have a string that looks like this: LN:SMITHbbbbbbbbFN:SAMANTHAbbBD:19400515PD:1BN:123456PN:9876543210GP:ABCDEFGHIJKLMNOID:123456789012345bbbbbPC:123PH:8005551212CD:123456PB:123ED:20060101TD:2070101LC:NFI:12345678FE:20070101FT:20080101 I need a set of matches th...

negative-lookahead

Regular Expressions: Is there an AND operator?

Obviously, you can use | (pipe?), to represent OR, but can you match 'and' as well? Specifically, I'm wanting to match paragraphs of text that contain ALL of a certain phrase, but in no particular order. ...

pattern-matching
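
One common way to get an "AND" is to chain one positive lookahead per phrase at the same starting position. The following is a minimal Python sketch, not taken from the question; the helper name `contains_all` and the sample phrases are assumptions made for illustration.

```python
import re

def contains_all(text, phrases):
    """Return True if every phrase occurs somewhere in text, in any order."""
    # One lookahead per phrase; each one scans forward from the start of the text.
    pattern = "".join(f"(?=.*{re.escape(p)})" for p in phrases)
    # DOTALL lets '.' cross newlines so multi-line paragraphs are covered.
    return re.match(pattern, text, re.DOTALL) is not None

print(contains_all("the quick brown fox", ["fox", "quick"]))
print(contains_all("the quick brown dog", ["fox", "quick"]))
```

Because lookaheads consume no characters, each condition is checked independently from the same position, which is why the order of the phrases in the text does not matter.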

Need a regex to match a variable length string of numbers that can't be all zeros

I need to validate an input on a form. I'm expecting the input to be a number of 1 to 19 digits. The input can also start with zeros. However, I want to validate that they are not all zeros. I've got a regex that will ensure that the input is numeric and between 1 and 19 digits long: ^\d{1,19}$ But I can't figure out how to include a...

input-validation

negative-lookahead
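
A negative lookahead at the start of the pattern is one way to add the "not all zeros" rule to a 1 to 19 digit check. This is a hedged Python sketch rather than the accepted answer; the function name `is_valid_number` is an assumption, and `[0-9]` is used instead of `\d` so that only ASCII digits are accepted.

```python
import re

# (?!0+$) rejects inputs made only of zeros; [0-9]{1,19} enforces 1 to 19 digits.
NOT_ALL_ZEROS = re.compile(r"(?!0+$)[0-9]{1,19}")

def is_valid_number(value):
    """Return True for 1 to 19 ASCII digits that are not all zeros."""
    return NOT_ALL_ZEROS.fullmatch(value) is not None

for sample in ["0001", "0000", "1234567890123456789", "12345678901234567890"]:
    print(sample, is_valid_number(sample))
```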

Isn't it possible to use 'Repeats' in the lookaheads for boost::regex?

I'm trying to extract some variables in my C++ code nested in blocks. For example, if I have DEL_TYPE_NONE, DEL_TYPE_DONE, DEL_TYPE_WAIT, I'd like to match "DEL_TYPE_NONE" "DEL_TYPE_DONE" "DEL_TYPE_WAIT". I made my pattern like this: std::string pat("(?<=^[ \\t]?)[A-Z0-9_]+(?=,$)"); but I keep getting an error message when compil...

JavaScript won't split using regex

Since I started writing this question, I think I figured out the answers to every question I had, but I thought I'd post anyway, as it might be useful to others and more clarification might be helpful. I was trying to use a regular expression with lookahead with the JavaScript function split. For some reason it was not splitting the st...

Optional characters in a regex

The task is pretty simple, but I've not been able to come up with a good solution yet: a string can contain numbers, dashes and pluses, or only numbers. ^[0-9+-]+$ does most of what I need, except when a user enters garbage like "+-+--+" I've not had luck with regular lookahead, since the dashes and pluses could potentially be anywhe...

Regex that matches anything before a certain character?

I have to parse a bunch of stats from text, and they all are formatted as numbers. For example, this paragraph: A total of 81.8 percent of New York City students in grades 3 to 8 are meeting or exceeding grade-level math standards, compared to 88.9 percent of students in the rest of the State. I want to match just the 81 a...

javacc parseException... lookahead problem?

I'm writing a parser for a very simple grammar in javacc. It's beginning to come together but at the moment I'm completely stuck on this error: ParseException: Encountered "" at line 4, column 15. Was expecting one of: The line of input in question is z = y + z + 5 and the production that is giving me problems is my expression w...

String negation using regular expressions

Is it possible to do string negation in regular expressions? I need to match all strings that do not contain the string "..". I know you can use ^[^\.]*$ to match all strings that do not contain "." but I need to match more than one character. I know I could simply match a string containing ".." and then negate the return value of the...

negative-lookahead
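
Negating a multi-character string is usually done with a negative lookahead rather than a negated character class. The sketch below is in Python and is only an illustration; the helper name `has_no_double_dot` is an assumption.

```python
import re

# (?!.*\.\.) fails as soon as ".." appears anywhere; the trailing .* accepts the rest.
NO_DOUBLE_DOT = re.compile(r"(?!.*\.\.).*", re.DOTALL)

def has_no_double_dot(s):
    """Return True if s does not contain the substring '..'."""
    return NO_DOUBLE_DOT.fullmatch(s) is not None

for sample in ["a.b.c", "a..b", ""]:
    print(repr(sample), has_no_double_dot(sample))
```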

How can I combine a positive and negative condition in a regex?

I am fairly new to regular expressions and need some help. I need to filter some lines using regex in Perl. I am going to pass the regex to another function so it needs to be done in a single line. I want to select only lines that contain "too long" and that don't begin with "SKIPPING". Here are my test strings: SKIPPING this bond sinc...

negative-lookahead
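
Both conditions fit in one pattern: a negative lookahead anchored at the start of the line, followed by an ordinary match for the required text. The sketch is written in Python for consistency with the other examples on this page, although the same pattern syntax is valid in Perl. The sample lines are hypothetical, since the asker's own test strings are truncated in the excerpt.

```python
import re

# The line must not start with "SKIPPING", and "too long" must appear somewhere in it.
PATTERN = re.compile(r"^(?!SKIPPING).*too long")

# Hypothetical sample lines, not the asker's data.
lines = [
    "SKIPPING this record because the name is too long",
    "this record has a name that is too long",
    "this record is fine",
]

selected = [line for line in lines if PATTERN.search(line)]
print(selected)
```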

Using lookahead with generators

Hi all, I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value): for token in scan("a(b)"): print token would print ("literal", "a") ("l_paren", "(") ... The next task implies parsing the token stream and for that, I need to be able to look one item ahead fr...
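
One way to look a single item ahead is to wrap the generator in a small iterator that buffers the next value. This is a sketch in Python 3 under stated assumptions: the class name `Lookahead` and its `peek` method are hypothetical, and a plain list of tuples stands in for the asker's `scan` generator.

```python
class Lookahead:
    """Iterator wrapper that buffers one item so it can be inspected early."""

    _END = object()  # sentinel marking an exhausted source

    def __init__(self, iterable):
        self._it = iter(iterable)
        # Eagerly fetch the first item; this consumes one value from the source.
        self._next = next(self._it, self._END)

    def __iter__(self):
        return self

    def __next__(self):
        if self._next is self._END:
            raise StopIteration
        value = self._next
        self._next = next(self._it, self._END)
        return value

    def peek(self, default=None):
        """Return the upcoming item without consuming it, or default at the end."""
        return default if self._next is self._END else self._next


# Stand-in for scan("a(b)"); only the first two tokens are given in the excerpt.
tokens = Lookahead([("literal", "a"), ("l_paren", "(")])
print(tokens.peek())
print(next(tokens))
print(tokens.peek())
```

The trade-off of this design is that the wrapper always reads one item ahead of the consumer, which matters if producing a token has side effects.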

InputStreamReader.markSupported is false

Hello, I need to “un-read” characters from an InputStreamReader. For that purpose I wanted to use mark and reset but markSupported returns false for the InputStreamReader class, since it doesn’t maintain an internal buffer and/or queue of characters. I know about BufferedInputStream and PushbackInputStream but neither is appropriate he...

Regular expression negative lookahead

In my home directory I have a folder drupal-6.14 that contains the Drupal platform. From this directory I use the following command: find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz What this command does is gzips the folder drupal-6.14, excluding all subfold...

negative-lookahead

Replace all "\" characters which are not inside "<code>" tags

First things first: Neither this, this, this nor this answered my question. So I'll open a new one. Please read Okay okay. I know that regexes are not the way to parse general HTML. Please take note that the created documents are written using a limited, controlled HTML subset. And people writing the docs know what they're doing. They ...

Problem with look behind assertion and optional substring

I'm trying to write some regex that will parse information from alerts generated by Hyperic HQ. The alerts come in as emails with a subject line like: "[HQ] !!! - Alert: My Demo Website Alert Resource: demo.myserver.net Apache Web Server State: fixed" To cut a very long story short, I need to be able to consistently grab the "Apache W...

LookAhead Regex in .Net - unexpected result

Hello, I am a bit puzzled with my Regex results (and still trying to get my head around the syntax). I have been using http://regexpal.com/ to test out my expression, and it works as intended there; however, in C# it's not as expected. Here is a test - an expression of the following: (?=<open>).*?(?=</open>) on an input string of: <op...

Naming convention of regex lookahead and lookbehind

Why is it counterintuitive? In /(?<!\d)\d{8}(?!\d)/, (?<!\d) comes first but is called lookbehind, and (?!\d) comes next but is called lookahead. Both are counterintuitive. What's the reason to name it this way? ...
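
The names describe the direction the assertion inspects relative to the current position in the subject string, not where the assertion sits in the pattern. A short Python sketch of the pattern from the question follows; the sample text is an assumption made for illustration.

```python
import re

# (?<!\d) looks behind (to the left of) the position where the 8 digits start;
# (?!\d) looks ahead (to the right of) the position where they end.
EIGHT_DIGITS = re.compile(r"(?<!\d)\d{8}(?!\d)")

# Hypothetical sample text: one 8-digit run, one 9-digit run, one 8-digit run.
text = "id 12345678, not 123456789, date 20070101"
found = EIGHT_DIGITS.findall(text)
print(found)
```

The 9-digit run is skipped because every 8-digit window inside it has a digit either just behind it or just ahead of it.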

Change Password Control RegEx validating oddly in IE 7 only

I'm using the ASP.NET change password control in my application and all seems to be fine and dandy until a user tells me she has a problem meeting the strength requirements when changing her password. Looking into this, she is using IE 7 and no matter what she puts in, the validation fails (and ONLY in IE 7. Firefox, IE 8, Chrome etc. al...

internet-explorer-7

Why does a positive lookahead lead to captures in my Perl regex?

Hi everyone, I can't get why this code works: $seq = 'GAGAGAGA'; my $regexp = '(?=((G[UCGA][GA]A)|(U[GA]CG)|(CUUG)))'; # zero width match while ($seq =~ /$regexp/g){ # globally my $pos = pos($seq) + 1; # position of a zero width matching print "$1 position $pos\n"; } I know this is a zero width match and it doesn't put the ma...
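
A capturing group placed inside a lookahead still records the text it matched, even though the overall match is empty. The behaviour can be reproduced with the same pattern in Python; this is an illustrative port, not the asker's Perl code, and the 1-based position mirrors the `pos($seq) + 1` in the question.

```python
import re

seq = "GAGAGAGA"
# The lookahead itself consumes nothing, but group 1 inside it still captures.
regexp = re.compile(r"(?=((G[UCGA][GA]A)|(U[GA]CG)|(CUUG)))")

hits = [(m.group(1), m.start() + 1) for m in regexp.finditer(seq)]
for motif, pos in hits:
    print(f"{motif} position {pos}")
```

Because each match has zero width, the search resumes one character later every time, so overlapping occurrences are all reported.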

Regex Replace Between " Encoding

I want to be able to replace style="STUFF" I keep thinking that this is the correct REGEX: style=(")(?!")*(") But for some reason that won't match. Any ideas? ...
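
A lookahead consumes no characters, so it cannot stand in for "any character except a quote". A negated character class is the usual tool for that. The Python sketch below shows the idea; the regex flavour the asker uses is not stated in the excerpt, and the sample markup and the empty replacement value are assumptions.

```python
import re

# [^"]* consumes everything up to, but not including, the closing quote.
STYLE_ATTR = re.compile(r'style="[^"]*"')

# Hypothetical input markup.
html = '<p style="color:red" class="x">text</p>'
cleaned = STYLE_ATTR.sub('style=""', html)
print(cleaned)
```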
