regex

How to do multiple regular expressions, each time refining the results?

Why can't I output my regex to a variable, and then run regex on it a second time? I'm writing a greasemonkey javascript that grabs some raw data, runs some regex on it, then runs some more regex on it to refine the results: // I tried this on :: http://stackoverflow.com/ var tagsraw = (document.getElementById("subheader").innerHTML)...

Regex Do Not Match

I can't figure out how to match comments but not html hex in regex. For example I want the script to match #I'm a comment, yes I am but not #FF33AF ...

Regex with < and >

OK, I have a file that may or may not contain newlines or carriage returns; frankly, I need to ignore that. I need to search the document, find all the < and matching > tags, and remove everything inside them. I've been trying to get this to work for a bit. My current regex is: private Regex BracketBlockRegex = new Regex("<.*>", RegexOptions.Singleli...
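One likely culprit in a pattern like `<.*>` is the greedy `.*`: once the dot is allowed to cross line breaks, it runs from the first `<` to the last `>` in the input. A minimal sketch in Python (not the C# the question uses), assuming the truncated option is single-line mode, for which `re.DOTALL` is the Python analogue; `sample` is made-up input:

```python
import re

# Hypothetical input: two bracketed blocks, one of which spans a line break.
sample = "keep <drop me>\nkeep too <and\nme> end"

# Greedy dot in DOTALL mode swallows everything from the first '<' to the last '>'.
greedy = re.findall(r"<.*>", sample, flags=re.DOTALL)

# A negated class stops at the first '>' and matches newlines without any flag.
per_tag = re.findall(r"<[^>]*>", sample)

# Removing each bracketed block, delimiters included.
stripped = re.sub(r"<[^>]*>", "", sample)
```

The negated class treats every `<...>` run as one unit; it does not handle a literal `>` inside the brackets.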

What does the regular expression "\d{1,6}" check for?

What does the regular expression "\d{1,6}" (used in an ASP.NET MVC route as parameter constraint) check for/allow? ...
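As a rough illustration, `\d{1,6}` accepts between one and six consecutive digits. The sketch below uses Python rather than ASP.NET and assumes the constraint is applied to the whole route value (hence `fullmatch`); `satisfies_constraint` is a hypothetical helper name.

```python
import re

# re.ASCII restricts \d to 0-9; without it Python's \d also accepts
# other Unicode decimal digits.
ID_CONSTRAINT = re.compile(r"\d{1,6}", re.ASCII)

def satisfies_constraint(segment: str) -> bool:
    # Assumption: the pattern must cover the entire segment.
    return ID_CONSTRAINT.fullmatch(segment) is not None
```

Under that assumption `7` and `123456` pass, while an empty value, `1234567`, and `12a` do not.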

How can I extract the aliases for shell configuration files?

I need to write a script that shows me all the aliases I've set in certain config files, such as .tcshrc and some private scripts, in case I forget the meaning of aliases like "a", "b", etc. The format of the aliases in the config files is either alias a "content of alias a" or alias b 'content of alias b'. The code I wrote is below: ...

Check String for Alphabetical Characters

If I have the strings "hello8459" and "1234", how would I go about detecting which one had the alphabetical characters in? I've been trying: //Checking for numerics in an if... Pattern.matches("0-9", string1); However it doesn't work at all. Can anyone advise? ...
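Two things probably work against the attempt above: Java's `Pattern.matches` tests the whole input, and `0-9` outside square brackets is the literal three-character text `0-9`, not a range. A minimal sketch of the "contains a letter" check, written in Python instead of Java; `has_letters` is a hypothetical helper:

```python
import re

def has_letters(text: str) -> bool:
    # search() looks anywhere in the string; [A-Za-z] covers ASCII letters only.
    return re.search(r"[A-Za-z]", text) is not None
```

With the two strings from the question, `has_letters("hello8459")` is true and `has_letters("1234")` is false.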

Beginner Regex: Multiple Replaces

I have a string: $mystring = "My cat likes to eat tomatoes."; I want to do two replacements on this string with regex. I want to do s/cat/dog/ and s/tomatoes/pasta/. However, I don't know how to properly format the regular expression to do the multiple replacements in one expression, on one line, in one declaration. Right now, all I h...
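One way to get both substitutions in a single pass is an alternation plus a lookup table. This is a sketch in Python, not Perl, and `replacements` and `replace_all` are illustrative names:

```python
import re

# Map each literal word to its replacement.
replacements = {"cat": "dog", "tomatoes": "pasta"}

# Build one alternation from the keys; re.escape guards against metacharacters.
pattern = re.compile("|".join(re.escape(word) for word in replacements))

def replace_all(text: str) -> str:
    # The callback looks up whichever alternative actually matched.
    return pattern.sub(lambda m: replacements[m.group(0)], text)

result = replace_all("My cat likes to eat tomatoes.")
```

Here `result` is `"My dog likes to eat pasta."`. Two plain substitutions run one after the other would give the same result for this input and may be easier to read.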

regular expression check for closing tag XHTML

The end of my expression is the only part causing me problems. I'm trying to match > but not />, with something like this: \s*[^\/]> However, I don't want to match any other characters before the >. Here is an example; I want this to match any img tags that are not closed. <img((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\s*[^\/]> ...
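A negative lookbehind can express "a `>` that is not preceded by `/`" without consuming a character before the `>`. The sketch below is Python, uses a deliberately simplified attribute pattern (`[^>]*`) rather than the full expression above, and `UNCLOSED_IMG` is a hypothetical name:

```python
import re

# (?<!/) asserts that the character just before '>' is not '/'.
UNCLOSED_IMG = re.compile(r"<img\b[^>]*(?<!/)>")

html = '<img src="a.png"> <img src="b.png" /> <img src="c.png"/>'
unclosed = UNCLOSED_IMG.findall(html)
```

Only the first tag is reported. The simplification breaks if an attribute value contains a literal `>`.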

ruby regular expression begin method a bit confusing

m = /(.)(.)(\d+)(\d)/.match("THX1138.") puts m[0] c = m.captures #=> HX1138 puts c[0] #=> H puts m.begin(0) #=> 1 puts c[1] #=> X puts m.begin(1) #=> 1 puts c[2] #=> 113 puts m.begin(2) #=> 2 I was expecting m.begin(1) to return 2 since X is two elements after the beginning of string. I am reading the book well grounded...
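The numbering may be the source of the surprise: `begin(n)` reports where group `n` starts, and group 0 is the whole match, so `begin(1)` belongs to the first capture (`H`) and the `X` is group 2. The same offsets can be reproduced with Python's `Match.start(n)`, shown here as an analogue rather than Ruby code:

```python
import re

m = re.search(r"(.)(.)(\d+)(\d)", "THX1138.")

whole = m.group(0)       # the entire match
captures = m.groups()    # the four capture groups, in order

# Start offset of group 0 (whole match) through group 4.
offsets = [m.start(i) for i in range(5)]
```

Here `offsets` is `[1, 1, 2, 3, 6]`: the match and its first group both start at index 1, and the second group (`X`) starts at index 2.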

How can I efficiently match many different regex patterns in Perl?

I have a growing list of regular expressions that I am using to parse through log files searching for "interesting" error and debug statements. I'm currently breaking them into 5 buckets, with most of them falling into 3 large buckets. I have over 140 patterns so far, and the list is continuing to grow. Most of the regular express...

To do RegEx, what are the advantages/disadvantages to use UTF-8 string instead of unicode?

Usually, the best practice in Python, when using international languages, is to use unicode and to convert early any input to unicode and to convert late to a string encoding (UTF-8 most of the time). But when I need to do RegEx on unicode I don't find the process really friendly. For example, if I need to find the 'é' character follow...

register_printf_function in PHP

I need to let the user specify a custom format for a function which uses vsprintf, and since PHP doesn't have glibc's register_printf_function(), I'll have to do it with PCRE. My question is, what would be the best REGEXP to match % followed by any character and not having % before it, in a usable manner for programmatic use afterwards?...

Regex to find bad URLs in a database field

We had an issue with the text editor on our website that was doubling up the URL. So for example, the text field may contain: This is a description for a media item, and here in <a href="http://www.example.com/apage.htmlhttp://www.example.com/apage.html">a link</a>. So pretty much I need a regex to detect any string that begi...
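A backreference is one way to detect a URL that is immediately followed by a copy of itself. This is a Python sketch rather than a database query; `DOUBLED_URL` and `field` are illustrative, and the character class is an assumption about what may appear inside such URLs:

```python
import re

# Group 1 grows lazily until the text right after it repeats it exactly (\1).
DOUBLED_URL = re.compile(r'(https?://[^\s"<>]+?)\1')

field = ('here in <a href="http://www.example.com/apage.html'
         'http://www.example.com/apage.html">a link</a>.')

match = DOUBLED_URL.search(field)

# Collapse each doubled URL back to a single copy.
fixed = DOUBLED_URL.sub(r"\1", field)
```

`match.group(1)` holds the single URL, and `fixed` contains it once. Whether the database's own regex dialect supports backreferences is a separate question.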

PHP preg_match - what's wrong with this RegEx???

The values will be in this format 123-123-123-12345 that I would like the preg_match to work for. Can you see anything wrong with this regEx? foreach($elem as $key=>$value) { // Have tried this With and without the + before the $ as well if(preg_match("/^[0-9]{3}\-[0-9]{3}\-[0-9]{3}\-[0-9]{5}+$/", $value)) { echo "Yeah matc...

Searching an array of different strings inside a single string in PHP.

I have an array of strings that I want to try and match to the end of a normal string. I'm not sure the best way to do this in PHP. This is sorta what I am trying to do: Example: Input: abcde Search array: er, wr, de Match: de My first thought was to write a loop that goes through the array and crafts a regular expr...
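Instead of crafting one expression per array element in a loop, the candidates can be joined into a single alternation anchored at the end of the string. The sketch is in Python, not PHP, and `matching_suffix` is a hypothetical helper:

```python
import re
from typing import Optional, Sequence

def matching_suffix(text: str, suffixes: Sequence[str]) -> Optional[str]:
    if not suffixes:
        return None
    # Escape each candidate, join with '|', and anchor at the very end (\Z).
    pattern = re.compile("(?:" + "|".join(re.escape(s) for s in suffixes) + r")\Z")
    m = pattern.search(text)
    return m.group(0) if m else None

found = matching_suffix("abcde", ["er", "wr", "de"])
```

For the example in the question, `found` is `"de"`. A plain string-suffix comparison per candidate would also work and avoids regex entirely.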

URLRewriter.net regex for querystrings

I have this url: http://www.site.com/products-book_1.aspx?promo=free Here is what I have in my web.config for the UrlRewriter rules: <rewrite url="~/products-(.+)_(\d+).aspx" to="~/product.aspx?pid=$2" /> What can I add to this expression to also retrieve the promo value? The end result would be http://www.site.com/products.a...

Regular expression removing all words shorter than n

Well, I'm looking for a regexp in Java that deletes all words shorter than 3 characters. I thought something like \s\w{1,2}\s would grab all the 1 and 2 letter words (a whitespace, one to two word characters and another whitespace), but it just doesn't work. Where am I wrong? ...
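A pattern like `\s\w{1,2}\s` needs whitespace on both sides, so it misses short words at the start or end of the text, and two short words in a row compete for the single space between them. Word boundaries avoid both problems. A minimal sketch in Python rather than Java; `drop_short_words` is a hypothetical helper:

```python
import re

# \b is zero-width, so neighbouring words do not compete for whitespace;
# the trailing \s* removes the gap left behind by a deleted word.
SHORT_WORD = re.compile(r"\b\w{1,2}\b\s*")

def drop_short_words(text: str) -> str:
    return SHORT_WORD.sub("", text).strip()

cleaned = drop_short_words("a cat is on the mat")
```

Here `cleaned` is `"cat the mat"`. Contractions such as `I'm` are split at the apostrophe by `\b` and would need extra handling.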

E-mail format REGEX PHP- there are many like it but this one is mine

Uh, and it's broken: I had a perfectly working regex that allowed all the numbers, letters and only e-mail relevant punctuation (._-@) to sanitize my email fields, and then I thought it would be nice adding a proper email regex, checking for the correct pattern. This is what I have now: function check_chars_email($str) { $str_replace =...

Java Regular Expression value.split("\\."), The Backslash Dot divides by character?

From what I understand, the backslash dot ("\.") means one character of any character, right? So because backslash is an escape, it should be backslash backslash dot ("\\."). What does this do to a string? I just saw this in existing code I am working on. From what I understand it will split the string into individual characters. Why do...
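The distinction is between an escaped and an unescaped dot: in a regex, `\.` matches a literal period, while a bare `.` matches any character. In Java source the backslash itself has to be doubled, which gives `"\\."`. A small Python sketch of the same two patterns, using a raw string in place of the doubled backslash (the sample values are made up):

```python
import re

value = "www.example.com"

# Escaped dot: split only where a literal '.' occurs.
on_literal_dot = re.split(r"\.", value)

# Unescaped dot: every character is a separator, so only empty pieces remain.
on_any_char = re.split(r".", "a.b")
```

`on_literal_dot` is `['www', 'example', 'com']`, while `on_any_char` is four empty strings.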

Regular Expression in Java: How to refer to "matched patterns" ?

I was reading the Java Regular Expression tutorial, and it seems only to teach to test whether a pattern matched or not, but does not tell me how to refer to a matched pattern. For example, I have a string "My name is xxxxx". And I want to print xxxx. How would I do that with Java regular expressions? Thanks. ...
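Capture groups are the usual answer: wrap the part of interest in parentheses and read it back from the match object. In Java the counterpart is `Matcher.group(int)`; the sketch below shows the same idea in Python, with `\w+` as an assumed definition of what a name looks like:

```python
import re

text = "My name is xxxxx"

# The parentheses capture whatever follows the fixed prefix.
m = re.search(r"My name is (\w+)", text)

# group(0) is the whole match, group(1) is the first capture.
name = m.group(1) if m else None
```

Here `name` is `"xxxxx"`.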