regex

How can I prevent a specific string pattern from being replaced by Regex.Replace()?

I have a string like: Pakistan, officially the <a href="Page.aspx?Link=Islamic Republic of Pakistan">Islamic Republic of Pakistan</a>. Now I am using System.Text.RegularExpressions.Regex.Replace(inputText, "(\\bPakistan\\b)", "something"); to replace Pakistan outside the tags. But I don't want to replace Pakistan occurrin...

Is there a way to optimise finding text items on a page (not regex)

After seeing several threads rubbishing the regexp method of finding a term to match within an HTML document, I've used the Simple HTML DOM PHP parser (http://simplehtmldom.sourceforge.net/) to get the bits of text I'm after, but I want to know if my code is optimal. It feels like I'm looping too many times. Is there a way to optimise th...

Confusion in JavaScript RegExp ? Quantifier

Hi, why does the following code output 1,10,10 rather than 10,10? <script type="text/javascript"> var str="1, 100 or 1000?"; var patt1=/10?/g; document.write(str.match(patt1)); </script> ...
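The observed output follows from `?` applying only to the character right before it: `10?` means a `1` followed by an optional `0`, so a lone `1` also matches. A minimal sketch of the same pattern in Python (the question uses JavaScript; `re.findall` stands in here for `str.match` with the `g` flag, and the variable names are illustrative):

```python
import re

text = "1, 100 or 1000?"

# "10?" is "1" followed by an optional "0", so a bare "1" matches too.
optional_zero = re.findall(r"10?", text)

# Dropping the "?" makes the "0" mandatory.
required_zero = re.findall(r"10", text)

print(optional_zero)  # ['1', '10', '10']
print(required_zero)  # ['10', '10']
```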

ColdFusion - pass regex backreference to function call

Hi, I'm using ColdFusion's reReplace() function for regular expression pattern replacement. I'd like to use a function call for the replacement string, and pass a matched backreference to it. Something like this: <cfset s = "STARTDATE_2010-05-07 00:05:00.0_ENDDATE" /> <cfset s = reReplace(s, "STARTDATE_([\s-.:0-9]*)_ENDDATE", dateAd...

match word '90%' using regular expression

Hi All, I want the word '90%' to be matched in my String "I have 90% shares of this company". How can I write a regular expression for this? I tried something like this: Pattern p = Pattern.compile("\\b90\\%\\b", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE); Matcher m = p.matcher("I have 90% shares of this company"); while (m.fi...
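One likely reason the pattern above finds nothing is the trailing `\b`: a word boundary needs a word character on one side, and here `%` is followed by a space, so both sides are non-word characters. The sketch below reproduces this in Python rather than Java, and the lookaround version is only one assumed way to express "not glued to a word character":

```python
import re

sentence = "I have 90% shares of this company"

# \b needs a word character on one side; "%" and " " are both non-word
# characters, so the trailing \b can never match here.
with_boundary = re.search(r"\b90%\b", sentence)

# Assumed alternative: lookarounds that only require that no word
# character touches the token on either side.
with_lookarounds = re.search(r"(?<!\w)90%(?!\w)", sentence)

print(with_boundary)             # None
print(with_lookarounds.group())  # 90%
```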

Refactor Regex Pattern - Java

Hello All, I have the following string to match, aaaa_bb_cc, and have written a regex pattern like \\w{4}+\\_\\w{2}\\_\\w{2} and it works. Is there a simpler regex that does the same? ...
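As a rough illustration of one possible simplification (shown in Python, not Java, and not taken from the thread): the underscore needs no escaping, and the two `_xx` parts can be folded into one repeated group.

```python
import re

# Hypothetical simplification: underscores need no escaping, and the
# repeated "_xx" part is written once and repeated twice.
pattern = re.compile(r"\w{4}(?:_\w{2}){2}")

print(bool(pattern.fullmatch("aaaa_bb_cc")))  # True
print(bool(pattern.fullmatch("aaa_bb_cc")))   # False
```

Note that `\w` itself matches an underscore, so a stricter class such as `[A-Za-z0-9]` may be preferable if the segments must not contain one.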

Yet another URL prefix regex question (to be used in C#).

Hi, I have seen many regular expressions for URL validation. In my case I want the URL to be simpler, so the regex should be tighter. Valid URL prefixes look like: http[s]://[www.]addressOrIp[.something]/PageName.aspx[?] This describes a prefix. I will be appending ?x=a&y=b&z=c later. I just want to check if the web page is live befo...

Extracting email addresses in an html block in ruby/rails

I am creating a parser that guards against spamming and harvesting of emails from a block of text that comes from tinyMCE (so it may or may not have HTML tags in it). I've tried regexes, and so far this one has been successful: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i The problem is, I need to ignore all email addresses with mailto hre...

Match Regex across newlines?

I have a regex ( "(<lof<).*?(>>)" ) that works and matches perfectly on single line input. However, if the input contains newlines between the two () parts it does not match at all. What's the best way to ignore any newlines at all in that case? ...
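The usual cause is that `.` does not match a newline unless a "dot matches all" mode is enabled. A minimal sketch in Python, assuming the pattern is `(<lof<).*?(>>)` once the HTML entities are decoded; the sample text is made up, and the question does not say which regex engine is in use:

```python
import re

# Assumed decoded form of the pattern from the question.
pattern = r"(<lof<).*?(>>)"
text = "<lof<first line\nsecond line>>"  # illustrative input

# By default "." stops at "\n", so the closing ">>" is never reached.
single_line = re.search(pattern, text)

# With re.DOTALL, "." also matches "\n".
multi_line = re.search(pattern, text, re.DOTALL)

print(single_line)            # None
print(multi_line.group(0) == text)
```

Other engines expose the same switch under different names, for example an inline `(?s)` flag.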

Regular expression problem (PHP)

Hello all. I have a little problem with a regular expression that I use in PHP. My code identifies all tags in my content and adds a link to each image. It works when I use it dynamically, without any defined image. When I try it with an image path, the code does not work. How can I solve this problem? Working code: $content = preg_...

Excluding a specific substring from a regex

I'm attempting to mangle a SQL query via regex. My goal is essentially to grab what is between FROM and ORDER BY, if ORDER BY exists. So, for example, for the query: SELECT * FROM TableA WHERE ColumnA=42 ORDER BY ColumnB it should capture TableA WHERE ColumnA=42, and it should also capture if the ORDER BY expression isn't there. The closes...
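One way to express the stated goal is a lazy capture that stops at either `ORDER BY` or the end of the string. This is a sketch only, in Python; the names `FROM_CLAUSE` and `from_clause` are hypothetical, and it assumes the keywords never appear inside string literals or subqueries:

```python
import re

# Lazy capture after FROM, ending at ORDER BY if present, else at the end.
FROM_CLAUSE = re.compile(
    r"\bFROM\s+(.*?)(?:\s+ORDER\s+BY\b|$)",
    re.IGNORECASE | re.DOTALL,
)

def from_clause(query: str):
    """Return the text between FROM and ORDER BY (or end of query)."""
    match = FROM_CLAUSE.search(query)
    return match.group(1) if match else None

print(from_clause("SELECT * FROM TableA WHERE ColumnA=42 ORDER BY ColumnB"))
print(from_clause("SELECT * FROM TableA WHERE ColumnA=42"))
```

Both calls print `TableA WHERE ColumnA=42`.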

Strip Javascript on(whatever) events from Code using PHP

Hi, I want to strip out all JavaScript from a small snippet (4-6 lines) of HTML. I've read here before that it's best not to use regex on HTML, so if anybody knows a better way, please advise. So for example I have the following code: <a href="go/to/my/link" onclick="fetchMeSomeData(this)">My Link</a> <p onfocus="doSomethingAmazing...

Parsing in groovy between two tags ?

I would like to parse this Gstring with groovy : Format type : Key, Value. def txt = """ <Lane_Attributes> ID,1 FovCount,600 FovCounted,598 ... </Lane_Attributes> """ And get a map like : Map = [ID:1, FovCount:600, FovCounted:598] How can ...

Regex to replace 'li' with 'option' without losing class and id attributes

I am looking for a solution using preg_replace or similar method to change: <li id="id1" class="authorlist" /> <li id="id2" class="authorlist" /> <li id="id3" class="authorlist" /> to <option id="id1" class="authorlist" /> <option id="id2" class="authorlist" /> <option id="id3" class="authorlist" /> I think I have the pattern corre...

Use Javascript RegEx to extract column names from SQLite Create Table SQL

I'm trying to extract column names from a SQLite result set from sqlite_master's sql column. I get hosed up in the regular expressions in the match() and split() functions. t1.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" and name!="__WebKitDatabaseInfoTable__";', [], function(t1, result) { for(i = 0;i < result...

How to split but ignore separators in quoted strings, in python?

I need to split a string like this, on semicolons. But I don't want to split on semicolons that are inside of a string (' or "). I'm not parsing a file; just a simple string with no line breaks. part 1;"this is ; part 2;";'this is ; part 3';part 4;this "is ; part" 5 Result should be: part 1 "this is ; part 2" 'this is ; part 3' part ...
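Rather than splitting on the separator, one common approach is to match the fields themselves: a field is a run of non-separator characters and whole quoted strings. The sketch below assumes quotes are balanced and contain no escaped quotes; `FIELD` and `split_fields` are illustrative names, not from the thread:

```python
import re

# A field is one or more of: a character that is not ; " or ',
# a complete double-quoted string, or a complete single-quoted string.
FIELD = re.compile(r"""(?:[^;"']|"[^"]*"|'[^']*')+""")

def split_fields(text: str) -> list:
    """Split on semicolons that are not inside quoted strings."""
    return FIELD.findall(text)

sample = """part 1;"this is ; part 2;";'this is ; part 3';part 4;this "is ; part" 5"""
for field in split_fields(sample):
    print(field)
```

With this sample the function returns five fields, with the quoted semicolons left intact. Empty fields (two adjacent semicolons) are silently dropped by this pattern, which may or may not be wanted.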

Prohibit ampersand in Rails form

NOT a Rails 3 issue In a Contact model I have a company_name attribute. For reasons that don't matter to this question, I want to prohibit an ampersand character. We have a lot of clients with ampersands in their company name, and users forget they aren't allowed to use this character. This isn't an html sanitize issue. I don't care ab...

simple regex to splice out text in ruby

I'm using ruby and I want to splice out a piece of a string that matches a regex (I think this is relatively easy, but I'm having difficulty) I have several thousand strings that look like this (to varying degrees) my_string = "adfa <b>weru</b> orua fklajdfqwieru ofaslkdfj alrjeowur woer woeriuwe <img src=\"/images/abcde_111-222-333/1...

how to match all group and subgroup in pcre

Given an IP or other string, like "11.22.33.44" or "aa.bb.cc.dd", I think the pattern itself is very easy: (([\d\w]+)+\.)+[\d\w]+ but the problem is which group these submatches end up in. Unlike an IP, some strings consist of lots of words plus separators. In PCRE, I don't know how to extract all of the words -- "aa bb cc dd ..." ...

How to censor IP addresses in a file with Python?

Hello everyone. I have a log file containing some Whois entries with relative IP addresses which I want to censor like: 81.190.123.123 in 81.190.xxx.xxx. Is there a way to make such a conversion and rewrite the file contents without modifying the rest? Thank you for the help! ...
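A substitution with one capture group covers the conversion described: keep the first two octets and replace the last two. This is a minimal sketch; the names `IPV4` and `censor_ips` are hypothetical, and the pattern does not check that each octet is in the 0-255 range:

```python
import re

# Capture the first two octets; the last two are matched but discarded.
IPV4 = re.compile(r"\b(\d{1,3}\.\d{1,3})\.\d{1,3}\.\d{1,3}\b")

def censor_ips(text: str) -> str:
    """Replace the last two octets of every IPv4-looking token."""
    return IPV4.sub(r"\1.xxx.xxx", text)

print(censor_ips("inetnum: 81.190.123.123 (example line)"))
# inetnum: 81.190.xxx.xxx (example line)
```

To rewrite a file, the same function could be applied to the file's contents after reading them and before writing them back; everything that does not look like an IPv4 address is left untouched.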