regex

Regex to Strip Special Characters

I am trying to use Regex.Replace to strip out unwanted characters, but I need to account for spaces: string asdf = "doésn't work?"; string regie = @"([{}\(\)\^$&._%#!@=<>:;,~`'\’ \*\?\/\+\|\[\\\\]|\]|\-)"; Response.Write(Regex.Replace(asdf,regie,"").Replace(" ","-")); returns doésntwork instead of doésnt-work. Ideas? Thanks! ...
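One possible reading of the excerpt above is that the character class contains a literal space, so spaces are already stripped before the final replace of " " with "-" runs. The following is a minimal Python sketch of a different, whitelist-style approach, not the asker's C# code; the function name `strip_special` is hypothetical. It removes everything that is neither a word character nor whitespace, then turns spaces into hyphens.

```python
import re


def strip_special(text: str) -> str:
    """Remove punctuation, keep word characters and whitespace, then hyphenate spaces."""
    # [^\w\s] matches any character that is neither a word character nor whitespace.
    # For str patterns in Python 3, \w is Unicode-aware, so "é" is kept.
    # Assumption: underscores are acceptable here (\w includes "_"),
    # unlike the blacklist in the excerpt, which strips them.
    stripped = re.sub(r"[^\w\s]", "", text)
    return stripped.replace(" ", "-")


print(strip_special("doésn't work?"))
```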

Parse and extract required text from text files using C#

I have some text files with useful data wrapped between HTML tags like <td>, <span>, etc. I want to write a program that extracts the data between the tags. The text files contain other junk data too. I would also like to store the extracted data in a SQL table. Can anyone guide me in the right direction? ...

Java doesn't work with regex \s, says: invalid escape sequence

Hi, I want to replace all whitespace characters in a string with a "+" and all "ß" with "ss". It works well for "ß", but somehow Eclipse won't let me use \s for whitespace. I tried "\t" instead, but it doesn't work either. I get the following error: Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ ) this is my...

regex to get current page or directory name?

I am trying to get the page or last directory name from a URL. For example, if the URL is http://www.example.com/dir/ I want it to return dir, or if the passed URL is http://www.example.com/page.php I want it to return page. Notice I do not want the trailing slash or file extension. I tried this: $regex = "/.*\.(com|gov|org|net|mil|edu)/...
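As an illustration of the goal described above, here is a minimal Python sketch that avoids matching on top-level domains and instead works on the URL path. It is an alternative technique, not the asker's PHP regex; the function name `last_name` is hypothetical.

```python
from urllib.parse import urlparse


def last_name(url: str) -> str:
    """Return the last path segment without trailing slash or file extension."""
    path = urlparse(url).path                      # e.g. "/dir/" or "/page.php"
    segment = path.rstrip("/").rsplit("/", 1)[-1]  # last non-empty segment
    # Assumption: everything after the first dot counts as the extension.
    return segment.split(".", 1)[0]


print(last_name("http://www.example.com/dir/"))
print(last_name("http://www.example.com/page.php"))
```

For a URL with no path at all, this sketch returns an empty string.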

Detect WebKit Version 525 and Below With RegEx

I'm no good at Regular Expressions, really! I would like to specifically detect WebKit browsers below version 525. I have a regular expression [/WebKit\/[\d.]+/.exec(navigator.appVersion)] that correctly returns WebKit/5….…, really, I'd like it to return only the version number, but if the browser isn't WebKit, return null, or better s...
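A capture group is one way to pull out only the version number. The sketch below is in Python rather than the asker's JavaScript, and the function names and the sample user-agent strings are illustrative assumptions. The title says "525 and below" while the excerpt says "below version 525", so the comparison is left as a parameter choice.

```python
import re
from typing import Optional


def webkit_major_version(user_agent: str) -> Optional[int]:
    """Return the major WebKit version, or None if the string is not WebKit."""
    match = re.search(r"WebKit/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def is_old_webkit(user_agent: str, threshold: int = 525) -> bool:
    """True for WebKit versions at or below the threshold (assumed inclusive)."""
    version = webkit_major_version(user_agent)
    return version is not None and version <= threshold


# Hypothetical user-agent strings, for illustration only.
print(webkit_major_version("Mozilla/5.0 AppleWebKit/523.10.3 (KHTML, like Gecko)"))
print(webkit_major_version("Mozilla/5.0 Gecko/20100101 Firefox/115.0"))
```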

How do I do an "OR" for my python regex?

re.compile("abc") I would like to do "abc" OR "xyz". ...
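In Python's `re` module, alternation is written with `|`. A minimal sketch:

```python
import re

# "abc|xyz" matches either the literal "abc" or the literal "xyz".
pattern = re.compile(r"abc|xyz")

print(bool(pattern.search("hello xyz")))
print(bool(pattern.search("hello abc")))
print(bool(pattern.search("hello")))
```

When the alternation is only part of a larger pattern, wrapping it in a non-capturing group such as `(?:abc|xyz)` keeps the `|` from splitting the whole expression.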

Match e-mail addresses not contained in HTML tag

I need to highlight email addresses in text, but not highlight them if they are contained in HTML tags, content, or attributes. For example, the string [email protected] must be converted to <a href="mailto:[email protected]">[email protected]</a> But email addresses in the string <a href="mailto:[email protected]">[email protected]...

.NET RegEx "Memory Leak" investigation

I recently looked into some .NET "memory leaks" (i.e. unexpected, lingering GC rooted objects) in a WinForms app. After loading and then closing a huge report, the memory usage did not drop as expected even after a couple of gen2 collections. Assuming that the reporting control was being kept alive by a stray event handler I cracked op...

Backreferences in lookbehind

Can you use backreferences in a lookbehind? Let's say I want to split wherever behind me a character is repeated twice. String REGEX1 = "(?<=(.)\\1)"; // DOESN'T WORK! String REGEX2 = "(?<=(?=(.)\\1)..)"; // WORKS! System.out.println(java.util.Arrays.toString( "Bazooka killed the poor aardvark (yummy!)" .sp...

Why doesn't String.replaceAll() work on this String?

//This source is a line read from a file String src = "23570006,music,**,wu(),1,exam,\"Monday9,10(H2-301)\",1-10,score,"; //This should be from a matcher.group() when Pattern.compile("\".*?\"") String group = "\"Monday9,10(H2-301)\""; src = src.replaceAll("\"", ""); group = group.replaceAll("\"", ""); Stri...

What is $1 in Perl?

What is the $1? Is that the match found for (\d+)? $line =~ /^(\d+)\s/; next if(!defined($1) ) ; $paperAnnot{$1} = $line; ...

PHP regex extract date

I have a $date variable, 2009-04-29, which is in Y-m-d format. Can anybody give me an idea of how to extract it into $d, $m, $y using the simplest method possible? A regex is preferable. Any other suggestion with a simple method will be chosen. :) ...
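The extraction described above can be sketched with three capture groups. This example is in Python rather than PHP, and the function name `split_date` is hypothetical.

```python
import re


def split_date(date: str) -> tuple:
    """Split a Y-m-d string such as "2009-04-29" into (year, month, day) strings."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", date)
    if match is None:
        raise ValueError(f"not a Y-m-d date: {date!r}")
    return match.groups()


y, m, d = split_date("2009-04-29")
print(y, m, d)
```

A plain `date.split("-")` would also work when the input format is trusted; the regex adds a shape check on the digits.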

How to preg_replace {number}) with \n{number})

How can I replace {number}) with \n{number})? Say I have something like this: 1) test string testing new string. 2) that is also a new string no new line. 3) here also no new lines. The output should be something like this:
1) test string testing new string.
2) that is also a new string no new line.
3) here also no new lines.
How c...
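A minimal Python sketch of this replacement, shown instead of the PHP `preg_replace` call the title asks about. The function name is hypothetical, and the sketch assumes that every run of digits followed by ")" marks a new item, which would also catch text such as "(see 2)".

```python
import re


def items_on_new_lines(text: str) -> str:
    """Put each "{number})" marker at the start of its own line."""
    # \s* swallows the whitespace before the marker so lines do not end in a space.
    # \1 in the replacement re-inserts the captured "{number})".
    result = re.sub(r"\s*(\d+\))", r"\n\1", text)
    # The first marker also gets a newline in front of it; drop that one.
    return result.lstrip("\n")


sample = ("1) test string testing new string. "
          "2) that is also a new string no new line. "
          "3) here also no new lines.")
print(items_on_new_lines(sample))
```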

How to match characters from all languages (English, Greek, Chinese, etc.) except special characters

I have a display name field which I have to validate using a JavaScript regex. We have to match characters from all languages, such as Chinese, German, and Spanish, in addition to English characters, but not special characters like *(). I am stuck on how to match those non-Latin characters. Any help appreciated. ...
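As a rough illustration in Python (not JavaScript, where the regex syntax for Unicode letters differs), the class `[^\W\d_]` approximates "any letter": it takes the Unicode-aware `\w` and subtracts digits and the underscore. The function name and the rule that words are separated by single spaces are assumptions made for this sketch; scripts that rely on combining marks may need a broader class.

```python
import re

# One or more letter-only words separated by single spaces (assumed rule).
NAME_RE = re.compile(r"[^\W\d_]+(?: [^\W\d_]+)*")


def is_valid_display_name(name: str) -> bool:
    """True if the whole name consists of letters and single spaces only."""
    return NAME_RE.fullmatch(name) is not None


print(is_valid_display_name("John Smith"))
print(is_valid_display_name("foo*()"))
```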

javascript regex validation

Is there any way to find which input character fails the regex pattern? For example, suppose only [A-Za-z\s.&] is allowed, but the user enters something like "test/string", where '/' invalidates the input. How can I find which character fails the regex (in our case '/')? ...
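One way to locate the offending characters is to negate the allowed character class and search for matches of the negation. A minimal sketch in Python rather than JavaScript; the function name is hypothetical.

```python
import re

# Negation of the allowed class [A-Za-z\s.&] from the question.
INVALID_RE = re.compile(r"[^A-Za-z\s.&]")


def find_invalid_chars(value: str) -> list:
    """Return (index, character) pairs for every character outside the allowed set."""
    return [(m.start(), m.group()) for m in INVALID_RE.finditer(value)]


print(find_invalid_chars("test/string"))  # [(4, '/')]
```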

How to use regex to match ASTERISK in awk

I'm still pretty new to regular expressions and just started learning to use awk. What I am trying to accomplish is writing a ksh script to read in lines from text, and for every line that matches the following: *RECORD 0000001 [some_serial_#] to replace $2 (i.e. 000001) with a different number. So essentially the script read ...

Conditional Regular Expressions

I'm using Python and I want to use regular expressions to check if something "is part of an include list" but "is not part of an exclude list". My include list is represented by a regex, for example: And.* Everything which starts with And. Also the exclude list is represented by a regex, for example: (?!Andrea) Everything, but no...
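A minimal sketch of the include/exclude idea using the two patterns from the excerpt. Placing the negative lookahead in front of the include pattern makes the combined regex reject anything that starts with "Andrea" while still requiring the "And" prefix. The function name is hypothetical.

```python
import re

# (?!Andrea) fails the match if the text at this position starts with "Andrea";
# And.* then requires the name to start with "And".
COMBINED = re.compile(r"(?!Andrea)And.*")


def is_selected(name: str) -> bool:
    """True if name is on the include list and not on the exclude list."""
    return COMBINED.fullmatch(name) is not None


for candidate in ("Andrew", "Andrea", "Bob"):
    print(candidate, is_selected(candidate))
```

Keeping the two lists as separate patterns and checking them with two `re.match` calls is an equally valid design and may be easier to maintain when the lists change independently.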

Flip numbers in a string

I have an issue with some Arabic text where I need to flip numbers inside a string. So this: "Some text written in 1982 by someone with m0123456 or 12-to-13" Should become: "Some text written in 2891 by someone with m6543210 or 21-to-31" A regex solution will be great. The more optimized for large strings the better. Any hints? ...
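One way to sketch this in Python is `re.sub` with a callable replacement that reverses each run of digits in a single pass over the string. The function name is hypothetical.

```python
import re


def flip_numbers(text: str) -> str:
    """Reverse every run of consecutive digits, leaving all other text untouched."""
    # The lambda receives each match and returns its digits in reverse order.
    return re.sub(r"\d+", lambda m: m.group(0)[::-1], text)


print(flip_numbers("Some text written in 1982 by someone with m0123456 or 12-to-13"))
```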

Removing contiguous duplicate lines in vi without sorting

This question already addresses how to remove duplicate lines, but enforces that the list is sorted first. I would like to perform the remove contiguous duplicate lines step (i.e. uniq) without first sorting them. Example before: Foo Foo Bar Bar Example after: Foo Bar ...
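The underlying idea, a backreference that matches a line followed by one or more copies of itself, can be sketched in Python as follows. This is not a vi command; the function name is hypothetical and the sketch assumes "\n" line endings.

```python
import re


def squeeze_repeated_lines(text: str) -> str:
    """Collapse runs of identical adjacent lines into one line (like uniq)."""
    # ^(.*)     captures one whole line
    # (?:\n\1)+ matches one or more immediate repeats of that line
    # $         ensures the last repeat is a complete line, not a prefix
    return re.sub(r"^(.*)(?:\n\1)+$", r"\1", text, flags=re.MULTILINE)


print(squeeze_repeated_lines("Foo\nFoo\nBar\nBar"))
```

Non-adjacent duplicates are left alone, so no sorting is required.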

Why "\d+{1,4}(?:[.,]\d{1,4})?" in RegularExpressionValidator throws Exception: "Nested quantifier {"

I have an asp:RegularExpressionValidator with ValidationExpression="\d+{1,4}(?:[.,]\d{1,4})?" but it doesn't work; the parser throws ArgumentException: parsing "\d+{1,4}(?:[.,]\d{1,4})?" - Nested quantifier {. Where is my mistake? I want to allow strings like xxxx,xxxx - from 1 to 4 digits and decimal digits are not required, e.g.: 100...
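In the expression above, `\d` is followed by two quantifiers in a row, `+` and `{1,4}`, which is consistent with the "Nested quantifier" message. A likely fix is to drop the `+`. The sketch below checks the adjusted pattern in Python rather than .NET; the function name is hypothetical, and `fullmatch` is used so that the whole input has to match.

```python
import re

# 1 to 4 digits, optionally followed by "." or "," and 1 to 4 more digits.
NUMBER_RE = re.compile(r"\d{1,4}(?:[.,]\d{1,4})?")


def is_valid_number(value: str) -> bool:
    """True if the entire string matches the adjusted pattern."""
    return NUMBER_RE.fullmatch(value) is not None


for sample in ("100", "1234,5678", "12345"):
    print(sample, is_valid_number(sample))
```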