regex

error while using regex_replace function from <tr1/regex>

#include <string>
#include <tr1/regex>
#include "TextProcessing.h"
const std::string URL_PATTERN("((http://)[-a-zA-Z0-9@:%_\\+.~#?&amp;//=]+)");
const std::string REPLACEMENT("<a href=\"$&\"\">$&</a>");
std::string textprocessing::processLinks(const std::string & text) {
    // essentially the same regex as in the previous example, b...

Java regex to match odd file format: {<<"user_1">>, [<<"user_2">>,<<"user_3">>,<<"user_04">>]}.

This is the (non-escaped) regex I'm using so far: \{<<"(\w+)">>, \[(<<"(\w+)">>,?)+\]\}. To match this: {<<"user_1">>, [<<"user_2">>,<<"user_3">>,<<"user_04">>]}. And these are the groups I'm getting: 1: user_1 2: <<"user_04">> 3: user_04. Any thoughts on why it isn't giving the multiple users? If you were wondering the file form...
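One likely explanation, sketched here in Python rather than Java: a capturing group that sits inside a repeated group keeps only its last iteration, and java.util.regex behaves the same way. The sample line and pattern are copied from the question; the variable names are illustrative.

```python
import re

line = '{<<"user_1">>, [<<"user_2">>,<<"user_3">>,<<"user_04">>]}'

# The pattern from the question: group 3 sits inside a repeated group.
pattern = re.compile(r'\{<<"(\w+)">>, \[(<<"(\w+)">>,?)+\]\}')
match = pattern.fullmatch(line)

# A repeated group only remembers its final iteration.
last_user = match.group(3)

# Scanning for every quoted name instead collects all of them.
all_users = re.findall(r'<<"(\w+)">>', line)
```

A common workaround is therefore two passes: one pattern to validate the overall shape of the line, and a second scan to pull out each name.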

Get first few elements of an HTML fragment with XPath in Ruby

For a blog-like project, I want to get the first few paragraphs, headers, lists, or whatever falls within a range of characters from a Markdown-generated HTML fragment, to display as a summary. So if I have <h1>hello world</h1> <p>Lets say these are 100 chars</p> <ul> <li>some bla bla, 40 chars</li> </ul> <p>some other text</p> And assum...

rename files with python - regex

Hello, I want to rename 1k files using Python. They are all in the format somejunkDATE.doc. Basically, I would like to delete all the junk and leave only the date. I am unsure how to match this for all files in a directory. Thanks ...
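A minimal sketch of one way to approach this. The question does not say what DATE looks like, so the code assumes it is the run of digits immediately before ".doc"; the function names `date_only_name` and `rename_all` are made up for this example.

```python
import os
import re

# Assumption: the date is the run of digits right before ".doc".
DATE_AT_END = re.compile(r"(\d+)\.doc$")

def date_only_name(filename):
    """Return '<date>.doc' for 'somejunk<date>.doc', or None if no match."""
    m = DATE_AT_END.search(filename)
    return m.group(1) + ".doc" if m else None

def rename_all(directory):
    """Rename every matching file in `directory`, skipping name clashes."""
    for name in os.listdir(directory):
        new_name = date_only_name(name)
        if new_name is None or new_name == name:
            continue
        target = os.path.join(directory, new_name)
        if os.path.exists(target):
            continue  # do not overwrite an existing file
        os.rename(os.path.join(directory, name), target)
```

If the junk part can itself end in digits, the pattern would need a stricter date shape (for example a fixed number of digits), so it is worth trying `date_only_name` on a few real filenames before running `rename_all`.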

Get found items as list from regex

Given the following code: var myList = new List<string> { "red", "blue", "green" }; Regex r = new Regex("\\b(" + string.Join("|", myList.ToArray()) + ")\\b"); MatchCollection m = r.Matches("Alfred has a red and blue tie and blue pants."); Is there a way to derive a List<string> of the "found" items ("red", "blue", "blue")? ...
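The same idea sketched in Python rather than C#, as an illustration only: when the pattern has a single capturing group, `re.findall` returns the matched words directly as a list. The word list and sentence are taken from the question.

```python
import re

words = ["red", "blue", "green"]

# Build \b(red|blue|green)\b, escaping each word in case it contains
# regex metacharacters.
pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")

found = pattern.findall("Alfred has a red and blue tie and blue pants.")
```

The word boundaries keep the "red" inside "Alfred" from being reported.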

PHP REGEX: Find a dom node based on innerHTML

I am well aware that PHPDom can solve half of my problem, but I need a way (not necessarily regex) to find a certain DOM element based on a given innerHTML. Say, for example, I have this code: <tr> <td class="ranking_rank" style="vertical-align:middle;">48697</td> <td class="ranking_ign" style="vertical-align:middle;...

How can I compare words using regular expression but ignore certain ones?

I have made a search engine and I am comparing a search string against a list of words. However, I want to omit words like "How,do,i". So if the user searches for "How do I find my IP?" and I have 3 other items beginning with "How do I", it wouldn't really be able to return good relevancy. Right now it's set up as (a-z0-9)+userinput+(a-z0-9) Wan...

Regular expression to remove CSS comments

Dear all, I want to write a regular expression in PHP that matches text within double and single quotes. Actually, I am writing code to remove comment lines from a CSS file. Like: "/* I don't want to remove this line */" but /* I want to remove this line */ Eg: - valid code /* comment */ next valid code "/* not a comme...
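One common technique for this kind of problem, sketched in Python rather than PHP: match quoted strings and comments with a single alternation, keep the strings, and drop the comments. The name `strip_css_comments` is invented for this example, and the sketch assumes quotes in the stylesheet are balanced.

```python
import re

# A double- or single-quoted string, allowing backslash escapes inside.
STRING = r'"(?:\\.|[^"\\])*"' + "|" + r"'(?:\\.|[^'\\])*'"
# A /* ... */ comment, as short as possible, possibly spanning lines.
COMMENT = r"/\*.*?\*/"

# Strings are tried first and captured in group 1; comments are not captured.
TOKEN = re.compile("(" + STRING + ")|" + COMMENT, re.DOTALL)

def strip_css_comments(css):
    # Put quoted strings back unchanged; replace comments with nothing.
    return TOKEN.sub(lambda m: m.group(1) or "", css)
```

Because the scan runs left to right, a `/*` that appears inside a quoted string is consumed as part of the string and never seen as the start of a comment.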

Python: use regular expression to remove the whitespace from each line

^(\s+) only removes the whitespace from the first line. How do I remove the leading whitespace from all the lines? ...
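A short sketch of the usual fix: by default `^` matches only at the very start of the string, and the `re.MULTILINE` flag makes it match at the start of every line. The function name is illustrative.

```python
import re

def strip_leading_whitespace(text):
    # re.MULTILINE makes ^ match at the start of every line, not just the first.
    # [ \t]+ is used instead of \s+ so that blank lines are not swallowed
    # (\s also matches the newline character).
    return re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)
```

Applied to "  one\n\ttwo", this yields "one\ntwo".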

preg_replace in my PHP script doesn't work

I have a user form where I take a phone number as input in one of my fields. I have two separate RegEx statements checking the input. The first one is: preg_match('/^([\(]{1}[0-9]{3}[\)]{1}[\.| |\-]{0,1}|^[0-9]{3}[\.|\-| ]?)?[0-9]{3}(\.|\-| )?[0-9]{4}$/', $phone); and it works great. It can identify many different formats, e.g. 222-333-44...

The best way I know to write readable regex in C and C++

The question is: http://stackoverflow.com/questions/3978351/how-to-avoid-backslash-escape-when-writing-regular-expression-in-c-c Stack Overflow does not allow me to answer my own question, so I am posting it as a fake "question". While I was reading [C: A Reference Manual], Chapter 3: Preprocessors, an idea emerged: #define STR(a) #a #define R(var, r...

PHP code to generate safe URL?

We need to generate a unique URL from the title of a book, where the title can contain any character. How can we search-replace all the 'invalid' characters so that a valid and neat-looking URL is generated? For instance: "The Great Book of PHP" www.mysite.com/book/12345/the-great-book-of-php "The Greatest !@#$ Book of PHP" www.my...
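A rough sketch of the search-replace step, in Python rather than PHP. It assumes that "invalid" means anything other than ASCII letters and digits, and that uniqueness comes from the numeric ID in the path; `slugify` and `book_url` are hypothetical names.

```python
import re

def slugify(title):
    # Lowercase, then collapse every run of non-alphanumeric characters
    # into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")

def book_url(book_id, title):
    # The URL shape follows the first example in the question.
    return "www.mysite.com/book/{}/{}".format(book_id, slugify(title))
```

Titles with accented or non-Latin characters would lose those characters under this rule, so a transliteration step may be needed first.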

Parsing strings by using RegEx

First let me post you some example strings: string_position = ("\"%s\";\"%s\";\"%s\";\"\";\"%s\"\r\n\"%s\";\"%s\";\"%s\";\"%s - %s\";\"%s\";\"%.0f\";\"FR\";\"%.2f\";\"%.2f\";\"%.2f\";\"%s\";\"%s\";\"%s\";\"%s\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"B\"\r\n",POSNR_NR_ID,POSNR_NR_ID,POSNR,POSNR_NR_ID,ARTN...

Matching exactly one occurrence in a string with a regular expression

The other day I was stuck on a regular expression problem. Eventually I solved it a different way, without regular expressions, but I would still like to know how you do it :) I was running svn update via an automated script, and I wanted to detect conflicts. Doing this with or without regex is trivial, but it got me...

Why does "Year 2010" =~ /([0-4]*)/ result in an empty string in $1?

If I run "Year 2010" =~ /([0-4]*)/; print $1; I get an empty string. But "Year 2010" =~ /([0-4]+)/; print $1; outputs "2010". Why? ...
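The behaviour can be reproduced in Python, which follows the same leftmost-match rule as Perl here: `*` is allowed to match zero characters, so the pattern succeeds immediately at position 0 (before the "Y") with an empty capture, while `+` needs at least one digit and so keeps scanning until it reaches "2010".

```python
import re

text = "Year 2010"

# * accepts zero repetitions, so the first attempt (at index 0) already succeeds.
star = re.search(r"([0-4]*)", text)

# + requires at least one digit, so the search moves on to index 5.
plus = re.search(r"([0-4]+)", text)

star_capture, star_span = star.group(1), star.span()
plus_capture, plus_start = plus.group(1), plus.start()
```

Here `star_span` is `(0, 0)`: a successful but empty match at the very start of the string.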

How can I split a line when some fields contain spaces?

I have a text file that I extracted from a PDF file. It's arranged in a tabular format; this is part of it:
DATE SESS PROF1 PROF2 COURSE SEC GRADE COUNT
2007/09 1 RODRIGUEZ TANIA DACSB 06500 001 A 3
2007/09 1 RODRIGUEZ TANIA DACSB 06500 001 A- 2
2007/09 1 RODRIGUEZ TANIA DACSB 06500 001 B 4
2007/09 1 RODRIGUEZ TANIA DACSB 0...

How to validate a phone number so that it does not allow all identical digits like 99999999999 or 11111111111 in Java

How do I validate a phone number so that it does not allow all identical digits, like 99999999999 or 11111111111, in Java? Thanks, Sunny Mate ...
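One way to detect the all-identical case is a backreference: capture the first digit and require every remaining character to repeat it. The sketch below is in Python, but java.util.regex supports the same `\1` backreference syntax. The function names and the digits-only rule are assumptions, since the question gives no other format requirements.

```python
import re

# Only ASCII digits, at least one.
DIGITS_ONLY = re.compile(r"[0-9]+")
# One digit followed by zero or more copies of that same digit.
ALL_SAME_DIGIT = re.compile(r"([0-9])\1*")

def is_repeated_digit(number):
    return ALL_SAME_DIGIT.fullmatch(number) is not None

def is_acceptable_phone(number):
    # Accept digit-only input unless every digit is identical.
    return DIGITS_ONLY.fullmatch(number) is not None and not is_repeated_digit(number)
```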

Regex to check whether a substring is all numeric (in JavaScript).

Hi, I am using the following code to check whether the first four characters are alphabetic: var pattern = new RegExp('^[A-Z]{4}'); if (enteredID.match(pattern)) isValidAlphabates = true; else { isValidAlphabates = false; } This is working fine. Now I need to check that the next 6 characters (of the entered text) are only numeric (0 to ...

How to find all the words starting with a '$' sign and ending with a space in a long string

In C#, how do I find all the words starting with '$' sign and ending with space, in a long string, using Regular expressions? ...
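A sketch of one possible pattern, shown in Python rather than C#. It treats a "word" as a `$` followed by one or more non-space characters, and uses a lookahead so the trailing space is required but not included in the result; both of those choices are assumptions about what the question intends.

```python
import re

# "$" + one or more non-whitespace characters, only if a space follows.
DOLLAR_WORD = re.compile(r"\$\S+(?= )")

def dollar_words(text):
    return DOLLAR_WORD.findall(text)

sample = "pay $alice and $bob now, not $carol"
result = dollar_words(sample)
```

In the sample, `$carol` is skipped because it sits at the end of the string with no space after it.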

How to NOT match a word in mod_rewrite

Please help, I'm going crazy! RewriteRule ^([a-z0-9_-]+)?/?search/?$ search.php?id=$1&%{QUERY_STRING} [NC,L] This is my current code. Sometimes people will visit mysite.com/search, other times they will visit mysite.com/boris/search and I detect a user with an empty($_GET['id']) check. However I am creating another search, mysite.com...