regex

evaluating a user-defined regex from STDIN in Perl

I'm trying to make an on-the-fly pattern tester in Perl. Basically it asks you to enter the pattern, and then gives you a >>>> prompt where you enter possible matches. If it matches it says "%%%% before matched part after match" and if not it says "%%%! string that didn't match". It's trivial to do like this: while(<>){ chomp; ...

How can I check if a regex pattern is valid in Perl?

Firstly, I was wondering if there is some kind of built-in function that checks whether a regex pattern is valid. I don't want to check whether the expression works; I simply want to make sure that the syntax of the pattern is valid, if that's possible. If there is no built-in function to do so, how do I...

In Perl, how can I read parts of lines that match a criterion?

Sample Data: 603 Some garbage data not related to me, 55, 113 -> 1-ENST0000 This is sample data blh blah blah blahhhh 2-ENSBTAP0 This is also some other sample data 21-ENADT)$ DO NOT WANT TO READ THIS LINE. 3-ENSGALP0 This is third sample data node #4 This is 4th sample data node #5 ...

How can I remove an html element and its contents using RegEx

I have a div I'd like to remove from an output which looks like <div id="ithis" class="cthis">Content here which includes other elements etc..) </div> How can I remove this div and everything within it using PHP and regex? Thank you. ...

Boost::regex issue, Matching an HTML span element

Dears, I don't get it. I created this regular expression: <span class="copy[Green|Red].*>[\s]*(.*)[\s]*<\/span> to match certain parts of HTML code (a part between spans). For instance the following: <span class="copyGreen">0.12</span> <span class="copyRed"> 0.12 </span> Now, this works beautifully with RegexBuddy and others, ...

Capturing “xxxxxxxxxx”

This is a pretty simple question but I'm somewhat stumped. I am capturing sections of text that match "xxxxxxxxxx". It works fine. string pattern = "(?<quotePair>\"[^/\"]*\")"; Now I want to make a new pattern to capture “xxxxxxxxxx”... I used: string pattern2 = "(?<lrquotePair>“[^/\"“]*”)"; For some reason the second pattern w...

PHP: Regular Expression to delete text from a string if condition is true

Hi, I have a variable containing a string. The string can come in several ways: // some examples of possible string $var = 'Tom Greenleaf (as John Dunn Hill)' // or $var = '(screenplay) (as The Wibberleys) &' // or $var = '(novella "Four Past Midnight: Secret Window, Secret Garden")' I need to clean up the string to get only the first...

awk extract multiple groups from each line

How do I perform an action on all matching groups when the pattern matches multiple times in a line? To illustrate, I want to search for /Hello! (\d+)/ and use the numbers, for example, print them out or sum them, so for input abcHello! 200 300 Hello! Hello! 400z3 ads Hello! 0 If I decided to print them out, I'd expect the output of 20...

Regex blows up in an editor extension

I am building a Visual Studio editor extension for my Django rendering engine. I just started it, so it is still really simple, and so far it does what I expect it to do: highlighting and such. Or it did until I started to add parsing logic. Part of the parsing relies on regular expressions. And here is my problem: No matter how I t...

How do I strip data from HTML tags

Say I have data like this: <option value="abc" >Test - 123</option> <option value="def" >Test - 456</option> <option value="ghi" >Test - 789</option> Using PHP, how would I sort through the HTML tags, returning all text from within the option values. For instance, given the code above, I'd like to return 'Test - 123', 'Test - 456', '...

Why does it seem like the * in Perl regex isn't being greedy?

I expected this to print "[b]" but it prints "[]": $x = "abc"; $x =~ /(b*)/; print "[$1]"; If the star is replaced with a plus, it acts as I expect. Aren't both plus and star supposed to be greedy? ADDED: Thanks everyone for pointing out (within seconds, it seemed!) that "b*" matches the empty string, the first occurrence of which i...
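The behaviour described in the excerpt above can be reproduced outside Perl. The following is a minimal sketch in Python (not the asker's language), assuming only the standard `re` module, which also returns the leftmost match: `b*` is allowed to match zero characters, so it succeeds with an empty match at index 0 before the engine ever reaches the "b".

```python
import re

x = "abc"

# b* may match zero characters, so the leftmost match is the
# empty string at index 0, in front of the "a".
star = re.search(r"(b*)", x)
print(f"[{star.group(1)}] at index {star.start()}")  # [] at index 0

# b+ needs at least one "b", so the match attempt at index 0 fails
# and the engine moves on to index 1.
plus = re.search(r"(b+)", x)
print(f"[{plus.group(1)}] at index {plus.start()}")  # [b] at index 1
```

Both quantifiers are greedy; the difference is only that the star can succeed without consuming anything.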

Can someone explain Possessive Quantifiers to me? (Regular Expressions)

I am reading the PCRE doc, and it refers to possessive quantifiers, but does not explicitly or specifically define them. I know what a greedy quantifier is, and I know what a lazy quantifier is. But possessive? The PCRE man page seems to be cheating when it uses the term without defining it. The man page specifically states that the ...

Regular Expression to match every new line character (\n) inside a <content> tag

Hi, I'm looking for a regular expression to match every new line character (\n) inside an XML tag named < content >, or inside any tag which is inside that < content > tag, for example : <blog> <text> (Do NOT match new lines here) </text> <content> (DO match new lines here) <p> (Do match new lines here) </p> </content> (Do NOT match ...

I need a regex that validates for minimum 7 digits in the given string.

Hi, I want to validate a phone number. My condition is that the given string contains a minimum of 7 digits, ignoring separators, X, and parentheses. Actually, I want to achieve this function in regex: Func<string, bool> Validate = s => s.ToCharArray().Where(char.IsDigit).Count() >= 7; Func<string, bool> RegexValidate = s => System.Text.RegularE...
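One possible way to express the "at least 7 digits, ignoring everything else" condition as a single pattern is sketched below in Python rather than the asker's C#. The pattern and the helper name `has_seven_digits` are illustrative assumptions, not taken from the question or its answers.

```python
import re

# Seven repetitions of "any run of non-digits, then one digit".
# Separators, letters and parentheses are absorbed by \D*.
SEVEN_DIGITS = re.compile(r"(?:\D*\d){7}")


def has_seven_digits(s: str) -> bool:
    """Return True if s contains at least seven digits anywhere."""
    return SEVEN_DIGITS.search(s) is not None


for candidate in ["(555) 123-4567", "555-1234", "555-123", "x12 34 56 7"]:
    print(candidate, has_seven_digits(candidate))
```

Because `\D*` and `\d` cannot match the same character, the pattern does not backtrack badly on long inputs.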

Interpret this particular REGEX

I wrote a regex pattern some time ago and I don't remember its meaning. For me this is a write-only language :) Here is the regex: "(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{8,10})$" I need to know, in plain English, what it means. ...
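As a rough reading aid, the pattern quoted above can be probed with a few inputs. This sketch uses Python's `re` module (an assumption; the question does not say which engine the pattern was written for) and hypothetical sample strings. The two negative lookaheads reject strings that are all digits or all letters, and the final group requires 8 to 10 alphanumeric characters.

```python
import re

PATTERN = re.compile(r"(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{8,10})$")

samples = {
    "abc12345": "letters and digits, 8 characters",
    "12345678": "digits only",
    "abcdefgh": "letters only",
    "abc123": "too short",
    "abc12345678": "too long",
}

# Map each sample to whether the pattern accepts it.
results = {s: PATTERN.search(s) is not None for s in samples}

for s, why in samples.items():
    print(f"{s:12} {why:35} {results[s]}")
```

Only the first sample is accepted, which suggests the pattern was meant as a rule of the form "8 to 10 characters, mixing letters and digits".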

Convert Pascal-cased setter to an underscore-separated variable name

This is not as simple as it seems. Most of you are likely thinking of the regex /([A-Z])/_$1/ like I have found all over the Internet, but my particular situation is slightly more complicated. My source string contains more content that I don't want to have converted before a portion that I do. Consider a regular setter: public funct...

IIS 7.0 URL Rewrite Module - Root URL doesn't display

Hi, I'm trying to set up a PHP website in IIS 7.0 with URL rewriting enabled using this module (http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1691). I've got the whole thing running fine for inner pages of the site, but my root URL "/" doesn't work any more. I want the structure of my URLs to be www.test.com/test-page...

.Net Regex Match Problem

Hello there, I am somewhat confused right now by an obviously pretty simple regex, but it must be the lack of caffeine or the weather today. Basically what I have is a string that can be something like 'sw' or 'ee' or 'n.a.'. Now what I want & need is a regex.match that gives me back '' in case the provided string is 'n.a.', in all othe...

Regex/Javascript Help - Search URL term Parsing

I am building a 'keyword' highlighting script and I need to write a regular expression to parse the following url, to find the searched for keywords to highlight. I am REALLY bad with regex, so I was hoping someone who rocks it, could help. Our search string uses "skw" as the parameter and "%2c" (comma) to separate terms, with "+" for...

Improving my regular expression skills

I've been wanting to improve my regex skills for quite some time now and "Mastering Regular Expressions" was recommended quite a few times so I bought it and have been reading it over the past day or so. I have created the following regular expression: ^(?:<b>)?(?:^<i>)?<a href="/site\.php\?id=([0-9]*)">(.*?) \(([ a-z0-9]{2,10})\)</a>(...