regex

PHP - how to extract link anchor text with a certain keyword from a string

I want to extract the URL of all links in a string with certain anchor text. I saw a previous post on doing this in JavaScript - can anyone help me do this in PHP? http://stackoverflow.com/questions/369147/javascript-regex-to-extract-anchor-text-and-url-from-anchor-tags ...

javascript strip words from url

I've got a little bookmarklet: javascript:window.location='http://alexa.com/siteinfo/'+window.location.host; When clicked, it takes you to alexa.com/siteinfo/www.thesiteyouwereon.com. I don't know a lot about JS regex. Is there a way to get rid of the www. from the beginning of the site, so I can get alexa.com/siteinfo/nowwwsite.com? thank...
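
A minimal sketch of the host clean-up the question asks about, written in Python for illustration. The helper names `strip_www` and `alexa_url` are hypothetical; only the URL prefix comes from the bookmarklet above.

```python
import re


def strip_www(host: str) -> str:
    """Remove one leading 'www.' from a host name, if present."""
    # ^ anchors the match to the start, so 'www.' elsewhere is left alone.
    return re.sub(r"^www\.", "", host, flags=re.IGNORECASE)


def alexa_url(host: str) -> str:
    # URL prefix taken from the bookmarklet in the question.
    return "http://alexa.com/siteinfo/" + strip_www(host)


print(alexa_url("www.thesiteyouwereon.com"))
```

The same pattern should carry over to the bookmarklet itself, for example `window.location.host.replace(/^www\./, '')`.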

Why will this recursive regex only match when a character repeats 2^n - 1 times?

After reading polygenelubricants's series of articles on advanced regular expression techniques (particularly How does this Java regex detect palindromes?), I decided to attempt to create my own PCRE regex to parse a palindrome, using recursion (in PHP). What I came up with was: ^(([a-z])(?1)\2|[a-z]?)$ My understanding of this expr...

Verifying a CSV file is really a CSV file

I want to make sure a CSV file uploaded by one of our clients is really a CSV file in PHP. I'm handling the upload itself just fine. I'm not worried about malicious users, but I am worried about the ones that will try to upload Excel workbooks instead. Unless I'm mistaken, an Excel workbook and a CSV can still have the same MIME, so ch...

Help me write a Regex RewriteRule for htaccess

I need help writing a simple regex for RewriteRule for mod_rewrite in htaccess. So, here is what I am trying to accomplish: books/2010-the-world-by-hopkins-139_PPS-1234567 should go to index.php?pagename=mypage&PPS=1234567&description=2010-the-world-by-hopkins-139 So, in pseudocode, the regex has to split the part after books by _ a...

regex retrieve part of a string encapsulated in ()

I simply want to retrieve two substrings of a string which take the following form. The first is a numerical value (there is only one) which is encased between parentheses, such as (12345) - this can be any number of digits (I haven't really kept any stats on length) but between 1 and 10 digits should cover it. The second substring...
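
The excerpt is cut off before the second substring is described, so the following Python sketch covers only the first part: a run of 1 to 10 digits enclosed in parentheses. The name `extract_paren_number` is hypothetical.

```python
import re
from typing import Optional

# Literal parentheses are escaped; the inner group captures 1-10 digits.
PAREN_NUMBER = re.compile(r"\((\d{1,10})\)")


def extract_paren_number(text: str) -> Optional[str]:
    """Return the digits of the first '(digits)' group, or None."""
    match = PAREN_NUMBER.search(text)
    return match.group(1) if match else None


print(extract_paren_number("order (12345) shipped"))
```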

What Does This RegEx Do???

I've just inherited a project with a lot of client side validation. In this I have found a regex checker, as expected without a comment in sight. I'll admit regex is definitely one of my failing points. (In JavaScript!) var myRegxp = /(([0]*[1-9]{1})|([1]{1}[0-2]{1}))(\/{1})((19[0-9]{2})|([2-9]{1}[0-9]{3}))$/; if (!myRegxp.test(args....
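
One way to probe an unfamiliar pattern is to run it against sample inputs. The sketch below ports the pattern to Python unchanged; it appears to check for a month/year value at the end of the string. `STRICT` is an assumed tightening of that apparent intent, not something stated in the question.

```python
import re

# Pattern copied from the question: month (1-9 with optional leading
# zeros, or 10-12), a slash, then a year 1900-1999 or 2000-9999.
ORIGINAL = re.compile(
    r"(([0]*[1-9]{1})|([1]{1}[0-2]{1}))(\/{1})((19[0-9]{2})|([2-9]{1}[0-9]{3}))$"
)

# Hypothetical stricter variant, assuming the intent is "MM/YYYY only".
STRICT = re.compile(r"(0?[1-9]|1[0-2])/(19\d{2}|[2-9]\d{3})")


def original_accepts(value: str) -> bool:
    # search() mirrors JavaScript's test(): the match may start anywhere.
    return ORIGINAL.search(value) is not None


def strict_accepts(value: str) -> bool:
    # fullmatch() requires the whole string to be a month/year.
    return STRICT.fullmatch(value) is not None


for sample in ["10/2010", "01/1999", "13/2010", "00/2010", "1/1899"]:
    print(sample, original_accepts(sample), strict_accepts(sample))
```

Because the original has no `^` anchor, "13/2010" is accepted: the match simply starts at the "3".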

How do you remove parentheses words within a string using NSRegularExpression?

I am not too familiar with regex, so I have been having some trouble getting this to work with Apple's NSRegularExpression. I am trying to remove words in parentheses or brackets... For example: NSString *str = @"How do you (remove parentheses words) within a string using" resulting string should be: @"How do you within a string using" Than...
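
A sketch of the substitution in Python, assuming groups are never nested. The function name `remove_bracketed` is hypothetical. The pattern uses only basic syntax (character classes, alternation, `\s`), so it should translate to NSRegularExpression, though that port is untested here.

```python
import re

# Optional leading whitespace, then either (...) or [...] with no
# closing delimiter inside. Nested groups are not handled.
BRACKETED = re.compile(r"\s*(\([^)]*\)|\[[^\]]*\])")


def remove_bracketed(text: str) -> str:
    """Delete parenthesised or bracketed runs and the space before them."""
    return BRACKETED.sub("", text)


print(remove_bracketed(
    "How do you (remove parentheses words) within a string using"
))
```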

Regex: Is it possible to combine the list operator [a-z] with " "{2, } in one expression? Just letters a-z and not more than one whitespace consecutively

I'm searching for a regular expression that lets me replace all chars except letters and digits and a single consecutive whitespace. For example: string = "a b c e f g 1 2 3 !" should be replaced in Ruby with "a b c e f g 1 2 3 " Matching letters and digits is not a problem with [a-zA-Z0-9] and the list operator. but how to combi...

preg_replace of "Word" in a sentence and "Word." on the end of a sentence

Hi, I want to preg_replace "Word" in PHP. $ret = 'I gave my Word to you.'; $pattern = '/\bWord\b/i'; $ret = preg_replace($pattern,"Heart",$ret); // echo $ret; = "I gave my Heart to you"; This works so far. But if the sentence is "I gave you my Word." or "I gave you my Word!" it doesn't change the "Word." into "Heart....
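
For comparison, here is the same substitution in Python. A word boundary `\b` also sits between a letter and trailing punctuation, so the pattern as written would be expected to match "Word." and "Word!" too. The excerpt is truncated, so whatever caused the reported failure may lie in the part not shown. `replace_word` is a hypothetical helper.

```python
import re


def replace_word(sentence: str, old: str = "Word", new: str = "Heart") -> str:
    """Replace whole-word, case-insensitive occurrences of `old`."""
    # re.escape guards against regex metacharacters in `old`.
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, sentence, flags=re.IGNORECASE)


for s in ["I gave my Word to you.", "I gave you my Word.", "I gave you my Word!"]:
    print(replace_word(s))
```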

How can I remove all font tags that have nothing but whitespace in them, using Perl?

I'm trying to do a match in Perl, using the following regex: s/<font(.*?)>[\t\f ]*<\/font>//gi; What I want it to do is to remove all font tags that don't have anything inside. Unfortunately, it doesn't stop after <font at the first > it will go until the > from before </font>. Any pointers on what is wrong with the regex? my $text...
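
A Python sketch of the behaviour described. A lazy `.*?` can still run past the first `>` when that is the only way for the rest of the pattern to match. Replacing it with `[^>]*` keeps the match inside a single opening tag. The sample string and function name are made up for illustration.

```python
import re

# Pattern from the question: .*? may cross '>' to make the match succeed.
LAZY = re.compile(r"<font(.*?)>[\t\f ]*</font>", re.IGNORECASE)

# Assumed fix: [^>]* cannot leave the opening tag.
EMPTY_FONT = re.compile(r"<font[^>]*>[\t\f ]*</font>", re.IGNORECASE)


def strip_empty_font_tags(html: str) -> str:
    """Remove <font ...></font> pairs containing only blanks."""
    return EMPTY_FONT.sub("", html)


sample = '<font color="red">text</font><font size="2"> </font>'
print(repr(LAZY.sub("", sample)))          # the non-empty tag is lost too
print(repr(strip_empty_font_tags(sample)))
```

In the Perl substitution the equivalent change would be `s/<font[^>]*>[\t\f ]*<\/font>//gi;`.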

Etags: generate tag for Objective-C methods declaration

How to make etags generate tags for both the declaration (i.e. inside the @interface block) and the definition (i.e. inside the @implementation block)? The default behavior is only to generate tags for the definition. I've already tried to invoke etags with --declarations but that didn't solve the issue. A way would be to pass a custom ...

counterintuitive slow Python regex search performance when providing a start of a String char

I've written a Python utility to scan log files for known error patterns. I was trying to speed up the search by providing the regex engine with additional pattern info. For example, not only that I'm looking for lines with "gold", I require that such line must start with an underscore, so: "^_.*gold" instead of "gold". As 99% of the lin...

Regular Expressions Algorithm

Given a sub-string, is there a way to generate all the possible regular expressions (most restrictive to least restrictive) that would match that sub-string for a given string? For example, say you have a sub-string "orange" and a string "apple banana orange grape". How would I get a list of regexes that match "orange" (I know there wil...

regular expression word boundary

I'm using a simple regular expression to match on the start of words, using the word boundary matcher, like /(\b)rice/ will match on "years of rice and salt" but not "maurice ravel" and so on. However, I'm finding a ! at the start of the string is negating the word boundary matcher. So the string "!!" is matching on "some text!!". A...
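
A short Python check of how `\b` behaves around punctuation. `\b` only matches between a word character and a non-word character, so before `!` it matches after "text" but not at the start of a string. The lookbehind `(?<!\S)` ("not preceded by a non-space") is one assumed alternative for "start of a token"; `matches` is a hypothetical helper.

```python
import re


def matches(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None


# \b before a letter: matches at the start of a word only.
print(matches(r"\brice", "years of rice and salt"))   # start of a word
print(matches(r"\brice", "maurice ravel"))            # inside a word

# \b before '!': the boundary is between 't' and '!', not at string start.
print(matches(r"\b!!", "some text!!"))
print(matches(r"\b!!", "!! at start"))

# Assumed alternative: require start of string or preceding whitespace.
print(matches(r"(?<!\S)!!", "some text!!"))
print(matches(r"(?<!\S)!!", "!! at start"))
```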

RegEx to match pair of Rectangular Brackets using javascript

Hi, Can someone help me with a Javascript regular expression? I need to match pairs of brackets. For example, it should match "[abc123]", "[123abc]" in the following string: "this is a test [abc123]], another test [[123abc]. This is an left alone closing" Thanks in advance. ...
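
A sketch in Python using the example string from the question. The idea is to match an opening bracket, any run of non-bracket characters, then a closing bracket, which skips stray unpaired brackets. It does not handle nesting. `find_bracket_pairs` is a hypothetical name.

```python
import re
from typing import List

# '[' + characters that are neither '[' nor ']' + ']'
BRACKET_PAIR = re.compile(r"\[[^\[\]]*\]")


def find_bracket_pairs(text: str) -> List[str]:
    """Return innermost [..] pairs, ignoring unmatched brackets."""
    return BRACKET_PAIR.findall(text)


text = ("this is a test [abc123]], another test [[123abc]. "
        "This is an left alone closing")
print(find_bracket_pairs(text))
```

The pattern syntax is basic enough that `/\[[^\[\]]*\]/g` with `String.prototype.match` should behave the same way in JavaScript.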

Idiomatic Scala

I'm trying to get my Scala code to be a bit more idiomatic. Right now it just looks like Java code. I'm trying to do a simple boolean regex matching function in Scala, since I cannot seem to find it in the standard library(?) I don't think the result is particularly nice with the try-catch and all. Also, a precondition is that 'patt' h...

preg_match help please

preg_match_all('/[\s]{1}(AA|BB|CC)+[\s]{1}/',' AA BB ',$matches); The result is AA, but I need AA and BB... please help, sorry for bad English. ...
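
A Python sketch of why only AA is found. The first match consumes the space after AA, so no leading space is left for BB. Lookarounds check for the whitespace without consuming it. The variable and function names are illustrative only.

```python
import re
from typing import List

subject = " AA BB "

# Pattern from the question: both spaces are part of the match.
consuming = re.findall(r"[\s]{1}(AA|BB|CC)+[\s]{1}", subject)

# Assumed fix: assert the surrounding whitespace instead of matching it.
TOKEN = re.compile(r"(?<=\s)(?:AA|BB|CC)+(?=\s)")


def find_tokens(text: str) -> List[str]:
    return TOKEN.findall(text)


print(consuming)
print(find_tokens(subject))
```

PCRE supports the same lookbehind and lookahead syntax, so the pattern should be usable with preg_match_all as well.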

Regular Expression to match <a> tags without http://

How can I match HTML "a" tags, only the ones without http, using a regular expression? i.e. match: blahblah... < a href=\"somthing\" > ...blahblah but not blahblah... < a href=\"http://someting\" > ...blahblah ...
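
A sketch using a negative lookahead, in Python. It assumes well-formed tags of the form `<a href="...">` with double-quoted attributes, and it also excludes `https://`, which goes beyond what the question states. A real HTML parser is usually more robust than a regex here. `relative_hrefs` is a hypothetical name.

```python
import re
from typing import List

# (?!https?://) rejects hrefs that start with an absolute http(s) URL.
RELATIVE_LINK = re.compile(
    r'<a\s+[^>]*href="(?!https?://)([^"]*)"', re.IGNORECASE
)


def relative_hrefs(html: str) -> List[str]:
    """Return href values of <a> tags that do not start with http(s)://."""
    return RELATIVE_LINK.findall(html)


html = 'x <a href="something">a</a> y <a href="http://someting">b</a>'
print(relative_hrefs(html))
```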

max age with nginx/passenger/memcached/rails2.3.5

I notice that in my production environment (where I have memcached implemented) I see a cache-control - max-age header in Firebug any time I am looking at an index page (posts for example). Cache-Control max-age=315360000 In my dev environment that header looks like the following. Cache-Control private, max-age=0, must-revalidate As...