regex

Regexkit lite and iPhone parsing

I've taken the suggestion of some posts here that recommend regexkit lite for a problem I'm having: extracting a particular URL from a string. The problem is that I'm very lost with its syntax, and I'm hoping someone who has used it can give me a hand. The string I'm trying to parse looks something like this: <a> bl...

Get HTML page <input> values and names using regex on PHP

Ok, so as the title says, I have an HTML page that I fetch using libcurl (cURL inside PHP). That page has one <form> from which I need to extract the <input> names and values, and I would like to do that using regex. I'm using regex because I think that's the easiest way. If you think I shouldn't use regex, but something like xpath, s...

How can I convert a complex binary Perl regular expression to C# or PowerShell?

Hello, This Perl binary regex found at http://www.w3.org/International/questions/qa-forms-utf-8.en.php matches UTF-8 documents without the UTF-8 BOM header: $field =~ m/\A( [\x09\x0A\x0D\x20-\x7E] # ASCII | [\xC2-\xDF][\x80-\xBF] # non-overlong 2-byte | \xE0[\xA0-\xBF][\x80-\xBF] # excluding overlongs ...

Regex to replace part of the string with spaces

It seems simple, but I can't get it to work. I have a string which looks like 'NNDDDDDAAAA', where 'N' is a non-digit, 'D' is a digit, and 'A' is anything. I need to replace each A with a space character. The number of 'N's, 'D's, and 'A's in an input string varies. I know how to do it with two expressions. I can split a string in to...
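One possible single-pattern approach, sketched in Python since the question does not name a language: capture the leading non-digits and digits as a head, then blank out whatever follows. The function name `blank_tail` is hypothetical, and the sketch assumes the 'A' part does not itself begin with a digit, because the pattern could not tell such a digit apart from the 'D' run.

```python
import re

def blank_tail(s: str) -> str:
    """Keep the leading non-digits and digits; replace every later character with a space."""
    # (\D*\d*) is the N and D part, (.*) is the A part.
    # DOTALL lets the tail include newlines. The pattern can match the
    # empty string, so re.match never returns None here.
    m = re.match(r"(\D*\d*)(.*)", s, re.DOTALL)
    head, tail = m.group(1), m.group(2)
    return head + " " * len(tail)

print(repr(blank_tail("ab12345x9y!")))
```

The length of the result always equals the length of the input, which is usually what matters when the string is a fixed-width record.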

Replacing items in an array using two regular expressions

Can you use two regex in preg_replace to match and replace items in an array? So for example: Assume you have: Array ( [0] => mailto:[email protected] [1] => mailto:[email protected] [2] => mailto:[email protected] [3] => mailto:[email protected] [4] => mailto:[email protected] ...

Practicing regex

I'd like to learn regex better so that it becomes a more natural option for me. Often I don't even consider using it for problems it could solve easily. Can someone direct me to a resource that gives challenging regex problems like the one in the python challenge that goes something like this but in more of a riddle like fash...

Regex to extract fields and data types from sql statement

I have this SQL statement: CREATE TABLE [dbo].[User]( [UserId] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, [MiddleName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_A What I want is a regex that I can use to get all the fields and their data types. So it will return something li...

JavaScript regex exec takes too long to execute

Hi, I've got a simple JavaScript regex check (written by another developer) that works perfectly on thousands of different strings. However, I've just discovered one particular string value that causes the regex to take as long as 10 minutes to execute in Firefox/IE, which is unacceptable. I've extracted the actual regex call into a small code ...

Mod_rewrite rewriting things it's explicitly told not to

Hi, I have the following rewrites in my .htaccess: Options +FollowSymLinks RewriteEngine On RewriteRule \.(css|jpe?g|gif|png)$ - [L] RewriteRule ^index/error/([^/\.]+)/?$ index.php?error=$1 [L] As you can tell, it's supposed to not rewrite any .css/.jpg/.jpeg/.gif/.png files. Despite that, it's doing so. What's really odd is that ...

How to change this regular expression to be case insensitive (looking for src tag)

Regular expression: <img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*> This works fine when 'src' is in lowercase and manages both single and double quotes. I would like this expression to return matches for the following test data 1. <html><img src ="kk.gif" alt="text"/></html> 2. <html><img Src ="kk.gif" alt="text"/></html> 3. <html><img sRC ="...
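A minimal sketch of the usual fix, written in Python because the question does not say which engine is in use: compile the same pattern with a case-insensitive flag. The first two sample strings come from the question; the third is a hypothetical extra, and the names `IMG_SRC`, `samples`, and `sources` are made up for illustration.

```python
import re

# The pattern from the question, compiled with IGNORECASE so that
# 'src', 'Src', 'SRC' (and 'IMG') all match.
IMG_SRC = re.compile(
    r"""<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>""",
    re.IGNORECASE,
)

samples = [
    '<html><img src ="kk.gif" alt="text"/></html>',
    '<html><img Src ="kk.gif" alt="text"/></html>',
    "<img SRC='pic.png'>",  # hypothetical sample, not from the question
]

# Group 1 holds the attribute value between the quotes.
sources = [IMG_SRC.search(s).group(1) for s in samples]
print(sources)
```

In Python the same effect is available inline by putting `(?i)` at the very start of the pattern; whether the asker's engine accepts that syntax is an assumption to check.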

A "smart" (forgiving) date parser?

I have to migrate a very large dataset from one system to another. One of the "source" columns contains a date but is really a string with no constraint, while the destination system mandates a date in the format yyyy-mm-dd. Many, but not all, of the source dates are formatted as yyyymmdd. So to coerce them to the expected format, I do (...

Non greedy regex matching in sed?

I'm trying to use sed to clean up lines of URLs to extract just the domain, e.g., from: http://www.suepearson.co.uk/product/174/71/3816/ I want: http://www.suepearson.co.uk/ (either with or without the trailing slash, it doesn't matter) I have tried: sed 's|\(http:\/\/.*?\/\).*|\1|' and (escaping the non greedy quantifier) ...
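POSIX basic and extended regular expressions have no lazy quantifiers, so the common workaround is a negated character class that simply cannot run past the delimiter. The sketch below shows that idea in Python rather than sed; `domain_prefix` is a hypothetical name, and the pattern assumes plain `http://` URLs as in the question.

```python
import re

def domain_prefix(url: str) -> str:
    """Return the scheme and host with a trailing slash, dropping the path."""
    # [^/]* cannot cross a slash, so the group ends at the first '/'
    # after the host; the trailing .* swallows the rest of the line.
    return re.sub(r"(http://[^/]*/).*", r"\1", url)

print(domain_prefix("http://www.suepearson.co.uk/product/174/71/3816/"))
```

Because the negated class does not rely on any lazy-matching extension, the same `[^/]*` idea should carry over to a sed expression.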

Codeigniter route help

I need a codeigniter route so all of the following urls: admin/users/page/:num admin/accounts/page/:num members/results/page/:num products/page/:num are forwarded to admin/users/index admin/accounts/index members/results/index products/index respectively. I'd like just one regexp which could do the trick rather than me setting the ...

REGEX - Find td with specific class, including nested tables

Hi, I have to parse a piece of HTML. It looks a bit like: <table> <tr> <td class="blabla"> <table><tr><td><table><tr><td></td></tr></table></td></tr></table> </td> </tr> <tr> <td class="blabla"> <table><tr><td></td></tr></table> </td> </tr> </table> I need to extract each td with class blabla, but ea...

Pattern based merge help

Is there a generic approach to data-merging the content of an XML file (a template) with embedded XPath expressions into an XmlDocument? As an example (please note this is just a simple example, I am looking for a generic approach), File: <root xmlns:dt="urn:schemas-microsoft-com:datatypes"> <session email='' alias=''> <state> <action> <att...

Error when embedding regular expression (phone number validation) in XML schema (XSD)

Hi all, I don't understand why this regular expression for validating international phone numbers gives an error when embedded in an XML schema: <xs:simpleType name="phoneType"> <xs:restriction base="xs:string"> <xs:pattern value="^\+(?:[0-9] ?){6,14}[0-9]$" /> </xs:restriction> </xs:simpleType> What's wrong with it? Does suppo...

How to Match The Inner Possible Result With Regular Expressions

I have a regular expression to match anything between { and } in my string. "/{.*}/" Couldn't be simpler. The problem arises when I have a single line with multiple matches. So if I have a line like this: this is my {string}, it doesn't {work} correctly The regex will match {string}, it doesn't {work} rather than ...
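The behaviour described is greedy matching: `.*` takes as much as it can, so the match runs from the first `{` to the last `}`. Two common ways around it are sketched below in Python (the asker's language is not stated): a lazy quantifier, or a negated character class that cannot cross a brace. The variable names are hypothetical.

```python
import re

line = "this is my {string}, it doesn't {work} correctly"

# Greedy: one match spanning from the first '{' to the last '}'.
greedy = re.findall(r"\{.*\}", line)

# Lazy: '.*?' stops at the first '}' it can.
lazy = re.findall(r"\{.*?\}", line)

# Negated class: '[^{}]*' cannot cross a brace, so it also yields
# the innermost pair when braces are nested.
inner = re.findall(r"\{[^{}]*\}", line)

print(greedy)
print(lazy)
print(inner)
```

The negated-class form tends to be the more predictable of the two, since it does not depend on how the engine backtracks.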

Regex for removing trailing zeros

I am looking for a regular expression (.NET) to remove trailing zeros: 11645766.560000001000 -> 11645766.560000001 10190045.740000000000 -> 10190045.74 1455720.820000000100 -> 1455720.8200000001 etc... I am using regex, over String.Trim(), because the numbers are in one string, actual example: !BEGIN !>>C85.18 POS_LEVEL...
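A rough sketch of one way to do this, written in Python rather than .NET: keep the fractional part up to its last non-zero digit and drop the zeros that follow, provided no further digit comes after them. The name `strip_trailing_zeros` is hypothetical, and the sketch deliberately leaves an all-zero fraction such as `5.000` untouched, since the question does not say what should happen there.

```python
import re

# (\.\d*?[1-9])  the dot and fraction digits up to a non-zero digit
# 0+             the zeros to remove
# (?!\d)         only when no digit follows, i.e. the zeros are trailing
TRAILING_ZEROS = re.compile(r"(\.\d*?[1-9])0+(?!\d)")

def strip_trailing_zeros(text: str) -> str:
    """Remove trailing zeros from every decimal number found in text."""
    return TRAILING_ZEROS.sub(r"\1", text)

for sample in ("11645766.560000001000",
               "10190045.740000000000",
               "1455720.820000000100"):
    print(strip_trailing_zeros(sample))
```

Because the substitution runs over the whole string, it handles several numbers embedded in one line, which is the situation the question describes.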

Help with getting values through regex (php)

So yea, I suck with regular expressions. Needs to be done with php. Thanks. I need to be able to pull out "xx" (will always be 2 lowercase alphabetic chars) and "a12" (can be anything but will always be .php). String: http://foo.bar.com/some_directory/xx/a12.php?whatever=youwant ...
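The question asks for PHP; the sketch below only illustrates a candidate pattern in Python, and the same expression would need to be adapted for `preg_match`. It assumes the two-letter directory is immediately followed by the `.php` file name, as in the sample URL. The names `PATH_PARTS` and `extract_parts` are hypothetical.

```python
import re

# /([a-z]{2})/   a path segment of exactly two lowercase letters
# ([^/?]+)\.php  the file name without its .php extension
PATH_PARTS = re.compile(r"/([a-z]{2})/([^/?]+)\.php")

def extract_parts(url: str):
    """Return (two_letter_dir, file_stem), or None if the URL does not fit."""
    m = PATH_PARTS.search(url)
    return m.groups() if m else None

print(extract_parts("http://foo.bar.com/some_directory/xx/a12.php?whatever=youwant"))
```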

What is the point behind character class intersections in Java's Regex?

Java's java.util.regex.Pattern supports the following character class: [a-z&&[def]] which matches "d, e, or f" and is called an intersection. Functionally this is no different from: [def] which is simpler to read and understand in a big RE. So my question is, what use are intersections, other than specifying complete support for CSG-like o...