regex

Regular expressions in J2ME

If I wanted to implement a regex engine in JavaME (which lacks the regex libraries), where would be the best place to start? I imagine there is existing regex code out there that could be used as a starting point for porting. Failing that, a good guide on how to compile and execute a regular expression would do. ...

is_numeric, intval, ctype_digit.. can you rely on them?

is_numeric, intval, ctype_digit.. can you rely on them? Or do I have to use regex? function isNum($str) { return (preg_match("/^[0-9]+$/", $str)); } What do you guys think? Am I stupid? ...

Regular Expression to escape HTML ampersands while respecting CDATA

I've written a content management system that uses a server-side regular expression to escape ampersands in the page response just prior to it being sent to the client's browser. The regular expression is mindful of ampersands that have already been escaped or are part of an HTML entity. For example, the following: a amp; d, © 20...

Improving/Fixing a Regex for C style block comments

I'm writing (in C#) a simple parser to process a scripting language that looks a lot like classic C. On one script file I have, the regular expression that I'm using to recognize /* block comments */ is going into some kind of infinite loop, taking 100% CPU for ages. The Regex I'm using is this: /\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*...

Take multiple matches with regex separated by defined marks

Hello. I have a text and I need to extract the content that follows a defined pattern: the content between MARK1 and MARK2, and the content after MARK2. However, those marks can repeat, and I need to capture all of their occurrences. In the example below: text: "textA textB _MARK1_ textC _MARK2_ textD _MARK1_ textE textF _MARK2_ textG textH textI" array(0): _MARK...

BNF to Regex

Is there any way to convert the following BNF into a .Net regex? (I'm not stuck on the BNF, but I thought it might be the best way to explain what I was trying to do) <field> ::= "<<" <fieldname> <options> ">>" <options> ::= "" | "(" <option> ")" <option> ::= "" | <option> <non-paren> | <option> <escaped-c...

Is there a good browser based sandbox to practice regex?

I am looking for recommendations for a browser based regex sandbox to practice some proof of concept expressions. ...

Python regular expressions - how to capture multiple groups from a wildcard expression?

I have a Python regular expression that contains a group which can occur zero or many times - but when I retrieve the list of groups afterwards, only the last one is present. Example: re.search("(\w)*", "abcdefg").groups() this returns the list ('g',) I need it to return ('a','b','c','d','e','f','g',) Is that possible? How can I do i...
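
In Python's `re` module a repeated capturing group keeps only its last iteration, which is why `groups()` returns `('g',)` in the excerpt above. A minimal sketch of one possible workaround, assuming it is acceptable to match each character on its own with `re.findall` rather than repeating the group (the variable names are illustrative only):

```python
import re

text = "abcdefg"

# A repeated capturing group only remembers its final iteration.
last_only = re.search(r"(\w)*", text).groups()

# Matching one word character at a time collects every occurrence.
all_chars = tuple(re.findall(r"\w", text))

print(last_only)
print(all_chars)
```

Whether this fits depends on the real pattern; when the repeated group sits inside a larger expression, a two-step approach (match the whole run first, then split it) is a common alternative.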

Regex to match attributes in HTML?

Hi, I have a txt file which is actually the HTML source of some webpage. Inside that txt file there are various strings preceded by a "title=" tag. e.g. <div id='UWTDivDomains_5_6_2_2' title='Connectivity Framework'> I am interested in getting the text Connectivity Framework extracted and written to a separate file. Like this...

How do I learn Regular Expressions?

I have been coding in PHP, Perl, and Ruby for around 2 years, and I am still having problems with regex. Do you have any tips for learning regex, or a great guide? Duplicate of http://stackoverflow.com/questions/4736/learning-regular-expressions ...

Why doesn't my email regex for PHP work?

I have the same expression in Javascript but it won't work in PHP for server side validation. Here's the code if (ereg('/^([a-zA-Z0-9_.-])+@([a-zA-Z0-9_.-])+\\.([a-zA-Z])+([a-zA-Z])+/',$_POST['email-address'])) echo "valid email"; else echo "invalid email"; ...

Regex in java question, multiple matches

I am trying to match multiple CSS style code blocks in an HTML document. This code will match the first one but won't match the second. What code would I need to match the second? Can I just get a list of the groups that are inside of my 'style' brackets? Should I call the 'find' method to get the next match? Here is my regex pattern...

Regex: Matching by exclusion, without look-ahead - is it possible?

In some regex flavors, [negative] zero-width assertions (look-ahead/look-behind) are not supported. This makes it extremely difficult (impossible?) to state an exclusion. For example "every line that does not have "foo" on it", like this: ^((?!foo).)*$ Can the same thing be achieved without using look-around at all (complexity and p...

SQL Server won't perform regular expression validation on XML column

Hi I have an XML column in my table which contains this xsd snippet: <xsd:element name="Postcode" minOccurs="0"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:pattern value="^[0-9]{4}$" /> </xsd:restriction> </xsd:simpleType> </xsd:element> The regular expression should require a string...

Why does this Perl regexp fail?

I have the following Perl code: my $progName = shift ; open(IPLAYERLIST, "iplayer-list.html") or die "Cannot open iplayer index file iplayer-list.html\n" ; while (<IPLAYERLIST>) { if ( /($progName)/is ) { #if ( /Just A Minute/is ) { <-- This works! my $iplayerID = $1 ; print "IPlayer program id for $progName is $iplayerID\n" ; ...

Is there a Perl equivalent of Python's re.findall/re.finditer (iterative regex results)?

In Python compiled regex patterns have a findall method that does the following: Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this ...
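
For reference, the `findall` behaviour described above can be illustrated with a short sketch. The pattern and sample text are made up for illustration; only `re.compile`, `findall`, and `finditer` come from the standard library:

```python
import re

text = "a=1, b=22, c=333"

# Without groups, findall returns the matched strings themselves.
numbers = re.findall(r"\d+", text)

# With groups, findall returns one tuple of groups per match.
pattern = re.compile(r"(\w+)=(\d+)")
pairs = pattern.findall(text)

# finditer yields match objects lazily, left to right, so positions
# and individual groups remain available for each match.
spans = [m.span() for m in pattern.finditer(text)]

print(numbers)
print(pairs)
print(spans)
```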

JavaScript Decimal Point Restriction With RegEx

My knowledge of regular expressions is next to none, but I need some client-side validation on a text box that only allows numbers with up to two decimal places and no other input. This script was a basis for entering numeric values only, however it needs to be adapted so it can take a decimal point followed by only up to two d...

C# regex replace unexpected behavior

Given $displayHeight = "800";, replace whatever number is at 800 with int value y_res. resultString = Regex.Replace( im_cfg_contents, @"\$displayHeight[\s]*=[\s]*""(.*)"";", Convert.ToString(y_res)); In Python I'd use re.sub and it would work. In .NET it replaces the whole line, not the matched group. What is a quick...

RegEx(?) -- how to parse zip codes out of text?

Hey all, I have a file that contains a mish-mash of cities, states, and zip codes. Example: Munson 11010 Shelter Island Heights. . . .. 11965 Brentwood 11717 Halesite 11743 I need to grab all of the zip codes out of that text. They are only 5 digit (no 5+4), and there are no other numbers besides the zips. It seems like a pre...
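
Under the constraints stated in the excerpt (five-digit codes only, no other numbers in the text), a hedged sketch of the extraction might look like the following. The sample string is copied from the excerpt; the pattern `\b\d{5}\b` is one reasonable choice, not the only one:

```python
import re

text = (
    "Munson 11010 Shelter Island Heights. . . .. 11965 "
    "Brentwood 11717 Halesite 11743"
)

# Exactly five digits bounded by non-word characters on both sides,
# so longer digit runs would not be matched in part.
zip_codes = re.findall(r"\b\d{5}\b", text)

print(zip_codes)
```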

Finding values within certain tags using regex

I have a sample string: <num>1.</num> <Ref>véase anomalía de Ebstein</Ref> <num>2.</num> <Ref>-> vascularización</Ref> I wish to make a comma-separated string with the values inside ref tags. I have tried the following: Regex r = new Regex("<ref>(?<match>.*?)</ref>"); Match m = r.Match(csv[4].ToLower()); ...