regex

In a Java regex with quantifiers, how can I obtain the matched groups?

I am processing text using Java Regexes (1.6) which contain quantifiers and I wish to return the number and values of matched groups. A simple example is: A BC DEF 1 23 456 7 XY Z which is matched by: ([A-Z]+){0,9} (\d+){0,9} ([A-Z]+){0,9} How can I find the number of each capture (here 3 4 2) and the values ("A", "BC", "DEF", "1",...

How to concatenate regular expressions in JavaScript?

This is not what I'm asking for: http://stackoverflow.com/questions/680446/concatenate-multiple-regexes-into-one-regex/680454 Is there a way to append one regex to another (in JavaScript)? The reason for doing so is to simplify code maintenance (e.g. if the included string is long or easier to maintain by a non-programme...

Highlight whole words, omit HTML

I am writing some C# code to parse RSS feeds and highlight specific whole words in the content; however, I need to highlight only words that are outside HTML. So far I have: string contentToReplace = "This is <a href=\"test.aspx\" alt=\"This is test content\">test</a> content"; string pattern = "\b(this|the|test|content)\b"; string o...

I have two problems, one of them is a regex

I am updating some code that I didn't write and part of it is a regex as follows: \[url(?:\s*)\]www\.(.*?)\[/url(?:\s*)\] I understand that .*? does a non-greedy match of everything in the second register. What does ?:\s* in the first and third registers do? Update: As requested, language is C# on .NET 3.5 ...

Why won't this regex match?

It's supposed to match the text inside any h1, h2, or h3 tags. preg_match("<[hH][1-3][^>]*>(.*?)<[hH][1-3]>", $text, $matches); echo $matches[0]; But it never catches any. ...

Regex default value if not found

Hi, I would like to supply my regular expression with a 'default' value, so if the thing I was looking for is not found, it will return the default value as if it had found it. Is this possible to do using regex? ...

Java regex help

I'm trying to write a regular expression for Java's String.matches(regex) method to match a file extension. I tried .*.ext, but this matches files ending in just ext rather than only .ext. I then tried .*\.ext, which worked in a regular expression tester, but in Eclipse I am getting an invalid escape sequence error. Can anyone help me with this? Tha...

Postgres: regex and nested queries something like Unix pipes

The command should give 1 as output if the pattern "*@he.com" is in a row, excluding the headings:
user_id | username | email | passhash_md5 | logged_in | has_been_sent_a_moderator_message | was_last_checked_by_moderator_at_time | a_moderator
---------+----------+-----------+----------------------------------...

How to group in regex

I have this input string (an OID): 1.2.3.4.5.66.77.88.99.10.52 I want to group the numbers in threes, like this: Group 1: 1.2.3 Group 2: 4.5.66 Group 3: 77.88.99 Group 4: 10.52 It should be fully dynamic, depending on the input: if the input has 30 numbers, it should return 10 groups. I have tested using this regex: (\d+.\d+.\d+) But th...
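A minimal sketch, in Python, of one way to get the grouping described above. The function name `group_oid` and its `size` parameter are hypothetical and not from the question. It assumes the input holds only digits separated by single periods, and it escapes the dot so that it matches a literal period rather than any character.

```python
import re

def group_oid(oid, size=3):
    """Split a dotted numeric string into chunks of up to `size` numbers."""
    # One number, then up to size-1 further ".number" parts.
    # The trailing chunk may be shorter when the count is not a multiple of size.
    pattern = r"\d+(?:\.\d+){0,%d}" % (size - 1)
    return re.findall(pattern, oid)

print(group_oid("1.2.3.4.5.66.77.88.99.10.52"))
# ['1.2.3', '4.5.66', '77.88.99', '10.52']
```

Using `re.findall` with a non-capturing inner group returns the whole chunks directly, so the number of groups follows from the input length.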

Why doesn't this regex work?

The regex: ^ *x *\=.*$ means "match a literal x preceded by an arbitrary count of spaces, followed by an arbitrary count of spaces, then an equal sign and then anything up to the end of line." Sed was invoked as: sed -r -e 's|^ *x *\=.*$||g' file However it doesn't find a single match, although it should. What's wrong with the reg...
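The intended meaning of the pattern can be checked outside sed. The sketch below uses Python's `re` module, and the helper name `blank_assignment` is hypothetical. It only illustrates the semantics stated above and makes no claim about why the sed invocation finds no match.

```python
import re

# Same pattern as in the question, with "=" left unescaped
# (it is not a special character in Python's re module).
PATTERN = re.compile(r"^ *x *=.*$")

def blank_assignment(line):
    """Return an empty string if the line assigns to x, else the line unchanged."""
    return PATTERN.sub("", line)

for sample in ["  x = 5", "x=1", "y = 5", "xy = 1"]:
    print(repr(sample), "->", repr(blank_assignment(sample)))
```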

Please help with regex statement for names like o'brian or macdonald

I admit regex is a strange world and I have not been able to really get my head wrapped around it. But I have this problem that I believe belongs in the regex world. I would like to change last names like "o'brian" to "O'Brian" or "macdonald" to "MacDonald" or "who-knew" to "Who-Knew" or "who knew" to "Who Knew". So far all I have is ......

Python time format check

Hello. In Python, I want to check whether the input string is in "HH:MM" format, such as 01:16, 23:16, or 24:00, giving true or false as the result. How can I achieve this using a regular expression? ...
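One possible shape for such a check, as a hedged sketch: the function name `is_hhmm` is hypothetical, and the pattern assumes that hours run from 00 to 23 with two digits, minutes from 00 to 59, and that 24:00 is the only accepted value with hour 24 (as in the examples above).

```python
import re

# 00:00 through 23:59, or exactly 24:00.
_HHMM = re.compile(r"(?:[01][0-9]|2[0-3]):[0-5][0-9]|24:00")

def is_hhmm(text):
    """Return True if text is a two-digit HH:MM time, False otherwise."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return _HHMM.fullmatch(text) is not None

print(is_hhmm("01:16"), is_hhmm("24:00"), is_hhmm("24:01"))
# True True False
```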

Parse 'family' names into people + last name with regex

Given the following string, I'd like to parse into a list of first names + a last name: Peter-Paul, Mary & Joël Van der Winkel (and the simpler versions) I'm trying to work out if I can do this with a regex. I've got this far (?:([^, &]+))[, &]*(?:([^, &]+)) But the problem here is that I'd like the last name to be captured in ...

Why is my matcher failing?

I am passing a string into my song parser method and it is failing and I can't figure out why. Everything is returning null or 0. My parser method is public static Song parseSong(String songString){ Map<String, String> songMap = new HashMap<String, String>(); Pattern pattern = Pattern.compile(".*<key>(.+)</key><(.+)>(.+)</.+>.*...

Pitfalls of automated file versioning?

I'm working on a file management system and would like to include automated versioning, such as Bates numbering, if a file with the same name exists. I thought of inserting a "-v0001" between the filename and extension and counting the number of versions as they come in. $basename = pathinfo($filename, PATHINFO_BASENAME); $fname = pat...

How to match a URL in C#?

I have found many examples of how to match particular types of URLs in PHP and other languages. I need to match any URL from my C# application. How do I do this? By URL I mean links to any sites, or to files on sites, subdirectories, and so on. I have a text like this: "Go to my awsome website http:\www.google.pl\s...

In Python, how to check if a string only contains certain characters?

In Python, how to check if a string only contains certain characters? I need to check that a string contains only a..z, 0..9, and . (period) and no other character. I could iterate over each character and check that it is a..z, 0..9, or . but that would be slow. I am not clear how to do it with a regular expression. Is this ...
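A minimal sketch of the regular-expression approach asked about above. The name `only_allowed` is hypothetical, and the sketch assumes that an empty string should be rejected (change `+` to `*` to accept it) and that only lowercase a..z counts.

```python
import re

# Inside a character class the period is literal, so it needs no escaping.
_ALLOWED = re.compile(r"[a-z0-9.]+")

def only_allowed(text):
    """Return True if text consists solely of a-z, 0-9, and periods."""
    # fullmatch succeeds only when the entire string matches the pattern.
    return _ALLOWED.fullmatch(text) is not None

print(only_allowed("file.name.123"), only_allowed("File Name"))
# True False
```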

Is there a way to have a capture repeat an arbitrary number of times in a regex?

I'm using the C++ tr1::regex with the ECMA regex grammar. What I'm trying to do is parse a header and return values associated with each item in the header. Header: -Testing some text -Numbers 1 2 5 -MoreStuff some more text -Numbers 1 10 What I would like to do is find all of the "-Numbers" lines and put each number into its own re...

How to extract citations from a text (PHP)?

Hello! I would like to extract all citations from a text. Additionally, the name of the cited person should be extracted. DayLife does this very well. Example: “They think it’s ‘game over,’ ” one senior administration official said. The phrase They think it's 'game over' and the cited person one senior administration official sho...

Why doesn't the .* consume the entire string in this Perl regex?

Why doesn't the first print statement output what I expect: first = This is a test string, sec = This is a test string Since both * and + are greedy, why does the inner * (i.e. the one inside the "((" in the first match) not consume the entire string? use strict; use warnings; my $string = "This is a test string"; $string =~ /((.*)*)/; ...