questions about regex | ansaurus

regex

xpath expression for regex-like matching?

I want to find div elements in an HTML document whose id matches a certain pattern. I want to match this regex pattern using XPath: foo_([[:digit:]]{1,8}). What is the XPath equivalent of the above pattern? I'm stuck at //div[@id="foo_ and then what? Could someone complete it into a legal expression? EDIT: Sorry, I think I have to elaborate more...

Using Java to find substring of a bigger string using Regular Expression

If I have a string like this: FOO[BAR] I need a generic way to get the "BAR" string out of the string so that no matter what string is between the square brackets it would be able to get the string. e.g. FOO[DOG] = DOG FOO[CAT] = CAT ...
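A minimal sketch of one way to do this, written in Python rather than Java. The helper name `extract_bracketed` is hypothetical; the pattern itself uses only basic syntax and should carry over to Java's regex engine with little change.

```python
import re

# Capture whatever sits between the first "[" and the next "]".
# [^\]]* means "any run of characters that are not a closing bracket".
BRACKETED = re.compile(r"\[([^\]]*)\]")


def extract_bracketed(text):
    """Return the text inside the first [...] pair, or None if there is none."""
    match = BRACKETED.search(text)
    return match.group(1) if match else None


print(extract_bracketed("FOO[BAR]"))  # BAR
print(extract_bracketed("FOO[DOG]"))  # DOG
```

Using a negated character class instead of `.*` keeps the match from running past the first closing bracket when the input contains several bracketed parts.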


How can I capture a string with a regex only when it contains alphabetic characters?

Example strings: 785*()&!~`a ##$%$~2343 455frt&*&* I want to capture the first and the third but not the second, since it doesn't contain any alphabetic character. Please help ...
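One possible approach, sketched in Python: rather than writing one pattern that captures the whole string, search each candidate for a single letter and keep the ones that have one. This assumes "alphabetic" means ASCII letters; the name `with_letters` is hypothetical.

```python
import re

# A string qualifies as soon as one ASCII letter is found anywhere in it.
HAS_LETTER = re.compile(r"[A-Za-z]")


def with_letters(candidates):
    """Keep only the strings that contain at least one ASCII letter."""
    return [s for s in candidates if HAS_LETTER.search(s)]


samples = ["785*()&!~`a", "##$%$~2343", "455frt&*&*"]
print(with_letters(samples))  # ['785*()&!~`a', '455frt&*&*']
```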

How can I find the count of semicolon separated values?

I have a list of all email ids, which I copied from the 'To' field of an email I received in MS Outlook. These values (email ids) are separated by semicolons. I have copied this big list of email ids into Excel. Now I want to find the number of email ids in this list, basically by counting the number of semicolons. One way I ca...
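Outside Excel, the same count can be sketched in a few lines of Python. The addresses below are made-up placeholders and `count_values` is a hypothetical helper. It counts non-empty fields rather than raw semicolons, so a trailing semicolon does not inflate the result.

```python
def count_values(line, sep=";"):
    """Count the non-empty values in a separator-delimited string."""
    parts = [part.strip() for part in line.split(sep)]
    return sum(1 for part in parts if part)


# Placeholder addresses, not taken from the question.
recipients = "a@example.com; b@example.com; c@example.com;"
print(count_values(recipients))  # 3
```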

What does the "?:^" regular expression mean?

I am looking at this sub-expression (this is in JavaScript): (?:^|.....) I know that ? means "zero or one times" when it follows a character, but not sure what it means in this context. ...
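For illustration: `(?:` opens a non-capturing group, and `^` is just the first alternative inside it (start of input). The sketch below uses Python, where this syntax behaves the same way. The pattern is a made-up example, not the one from the question.

```python
import re

# (?:^|,) means "start of input OR a comma", grouped without capturing.
# Only the (\w+) part is a capturing group.
pattern = re.compile(r"(?:^|,)(\w+)")

print(pattern.groups)            # 1
print(pattern.findall("a,b,c"))  # ['a', 'b', 'c']

# With an ordinary group the separator is captured too.
capturing = re.compile(r"(^|,)(\w+)")
print(capturing.groups)          # 2
```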

Finding anchor text when there are tags there

I want to find the text between a pair of <a> tags that link to a given site. Here's the re string that I'm using to find the content: r'''(<a([^<>]*)href=("|')(http://)?(www\.)?%s([^'"]*)("|')([^<>]*)>([^<]*))</a>''' % our_url The result will be something like this: r'''(<a([^<>]*)href=("|')(http://)?(www\.)?stacko...

RewriteEngine in .htaccess to catch files not ending in html

I'd like to use mod_rewrite to convert web page addresses like /directory to /directory/index.html, in a standard LAMP hosting situation. What I have works for addresses that end in a slash. I can't find a way to handle addresses that don't end in a slash. What seems like it should work is: rewriterule ^(.*)/$ $1/index.html [L] /* addr...

How do I avoid the implicit "^" and "$" in Java regular expression matching?

I've been struggling with doing some relatively straightforward regular expression matching in Java 1.4.2. I'm much more comfortable with the Perl way of doing things. Here's what's going on: I am attempting to match /^<foo>/ from "<foo><bar>" I try: Pattern myPattern= Pattern.compile("^<foo>"); Matcher myMatcher= myPattern.matcher(...

Interesting test of Javascript RegExp

I wrote a JavaScript RegExp test to detect a date string format. I added a redundant "g" flag by mistake and found something interesting. var s = "2009/03/10"; var regex=/^\d{4}[/]\d{2}[/]\d{2}$/g; alert(regex.test(s)); alert(regex.test(s)); alert(regex.test(s)); alert(regex.test(s)); I got a 'true' followed by a 'false', then another...

Compiled replace regular expression

Hi! I'd like to build a regular expression assembly of the common regexes I must use in my project. I use these regular expressions to match a pattern and to replace it. I use this piece of code, which builds the assembly. AssemblyName an = new AssemblyName("MyRegExp"); RegexCompilationInfo[] rciList = { new RegexCompilationInfo(@"\<b\>(....

How to retrieve the text between two HTML markup tags with C#?

How do I retrieve the text between two HTML markup tags with C#? Edit: This is the only purpose of my question, "how to retrieve the string within two HTML markup tags using C#", that's all. ...

Regex to find the code blocks in C#

I have to find the code blocks from the given code using Regex in C#. e.g. I have to find the For loop block from the following code For A in 1..10 Loop stmt1; For C in cur_op Loop stmt2; end loop; end loop; For T in 4..8 loop stmt3; end loop; I want to retrieve the code blocks as For A in 1..10 Loop stmt1; For C in cur_op Loop st...

Python regex trouble

I have the following code: what = re.match("get|post|put|head\s+(\S+) ",data,re.IGNORECASE) and in the data variable, let's say, I have this line: GET some-site.com HTTP/1.0 ... If I stop the script in the debugger and inspect the what variable, I can see it only matched GET. Why doesn't it match some-site.com? ...
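A likely explanation, sketched below: alternation binds more loosely than concatenation, so the pattern reads as `get` OR `post` OR `put` OR `head\s+(\S+) `. Only the last branch contains the capture group. Wrapping the verbs in a non-capturing group is one way to apply the rest of the pattern to every verb.

```python
import re

data = "GET some-site.com HTTP/1.0"

# Original pattern: the first branch ("get") matches on its own and stops.
loose = re.match(r"get|post|put|head\s+(\S+) ", data, re.IGNORECASE)
print(loose.group(0))  # GET
print(loose.group(1))  # None

# Grouped verbs: \s+(\S+) now follows whichever verb matched.
grouped = re.match(r"(?:get|post|put|head)\s+(\S+) ", data, re.IGNORECASE)
print(grouped.group(1))  # some-site.com
```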

Multiple Regex.Replace or or'ed pattern?

Hi! Again a regex question. What's more efficient: cascading many Regex.Replace calls, each with one specific pattern to search for, or a single Regex.Replace with an or'ed pattern (pattern1|pattern2|...)? Thanks in advance, Fabian ...

Replace a set of characters with another set of chars (in pairs): "&", "&amp;" "<", "&lt;" etc. in regex

I have to encode the 5 XML reserved chars (& < > " and ') properly as follows: "&", "&amp;" "<", "&lt;" ">", "&gt;" "\"", "&quot;" "\'", "&apos;" I can do them one by one, but is it possible in regexp to do something like ("[&|<|>|\"|\']", "&amp;|&lt;"); ... etc., so that it is not executed as 5 operations one after another but all together s...
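A minimal single-pass sketch in Python, assuming the standard XML entity names for the five reserved characters. One character class finds any reserved character, and a replacement function looks up its entity. The names `XML_ESCAPES` and `escape_xml` are hypothetical.

```python
import re

# Standard XML entities for the five reserved characters.
XML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
}

# Inside a character class no "|" is needed; each character is an alternative.
RESERVED = re.compile(r"[&<>\"']")


def escape_xml(text):
    """Replace every reserved character in one pass over the input."""
    return RESERVED.sub(lambda m: XML_ESCAPES[m.group(0)], text)


print(escape_xml('a < b & "c"'))  # a &lt; b &amp; &quot;c&quot;
```

Doing it in one pass also avoids the ordering problem of chained replacements, where an `&` introduced by an earlier step would be escaped again by a later one.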

How can I debug a regular expression in python?

Is there a way to debug a regular expression in Python? And I'm not referring to the process of trying and trying till it works :) EDIT: here is how regexes can be debugged in Perl: use re 'debug'; my $str = "GET http://some-site.com HTTP/1.1"; if($str =~/get\s+(\S+)/i) { print "MATCH:$1\n"; } The code above produces the foll...
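One option in Python's standard library is the `re.DEBUG` flag, sketched below with the same pattern and input as the Perl snippet. It prints the parsed form of the pattern when it is compiled. It does not trace the matching steps, so it is only a partial counterpart to Perl's `use re 'debug'`.

```python
import re

# re.DEBUG makes re.compile print the parsed pattern to stdout.
pattern = re.compile(r"get\s+(\S+)", re.IGNORECASE | re.DEBUG)

match = pattern.search("GET http://some-site.com HTTP/1.1")
if match:
    print("MATCH:", match.group(1))
```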

Escaping -> and => when parsing HTML using regular expression

Hi All, I need to parse and return the tagname and the attributes in our PHP code files: <ct:tagname attr="attr1" attr="attr2"> For this purpose the following regular expression has been constructed: (\<ct:([^\s\>]*)([^\>]*)\>) This expression works as expected but it breaks when the following code is parsed <ct:form/input ty...

Python parsing

I'm trying to parse the title tag in an RSS 2.0 feed into three different variables for each entry in that feed. Using ElementTree I've already parsed the RSS so that I can print each title [minus the trailing )] with the code below: feed = getfeed("http://www.tourfilter.com/dallas/rss/by_concert_date") for item in feed: print rep...

Replace URLs

Hi there, I have a huge text file (and EditPad Pro) with a list of URLs, some of which point to images in the root folder. http://www.othersite.com/image01.jpg http://www.mysite.com/image01.jpg http://www.mysite.com/category/image01.jpg How can I change only the ones that have images in the root, using a regexp? http://www.othersite.com/image01.jpg http://www.NEW_...

Regex replace, but only between two patterns

Ok, I have a multi-line string I'm trying to do some clean-up on. Each line may or may not be part of a big block of quoted text. Example: This line is not quoted. This part of the line is not quoted “but this is.” This one is not quoted either. “This entire line is quoted” Not quoted. “This line is quoted and so is this one and so is ...

