regex

python regex question

What is the best way to search for matching words inside a string? Right now I do something like the following: if re.search('([h][e][l][l][o])',file_name_tmp, re.IGNORECASE): This works, but it's slow, as I have probably around 100 different regex statements searching for full words, so I'd like to combine several using a | separator o...
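One way to combine many whole-word searches into a single pass is to join the words into one alternation and compile it once. The sketch below assumes a hypothetical word list (the question does not show its roughly 100 patterns) and wraps the alternation in \b so only full words match.

```python
import re

# Hypothetical word list; stands in for the ~100 patterns the question mentions.
WORDS = ["hello", "world", "goodbye"]

# One compiled pattern: \b restricts each alternative to whole words,
# and re.escape guards against regex metacharacters inside a word.
WORD_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, WORDS)) + r")\b",
    re.IGNORECASE,
)

def contains_word(text):
    """Return True if any listed word occurs as a whole word in text."""
    return WORD_RE.search(text) is not None
```

Note that \b treats the underscore as a word character, so "hello_world" does not count as containing the whole word "hello". If file names use underscores as separators, the boundary would need to be spelled out differently.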

How can I validate a culture code with a regular expression?

Hi, I really don't understand regex, and I can't find any regex rule to validate culture codes such as en-GB, en-UK, az-AZ-Cyrl, and others ( http://sharpertutorials.com/list-of-culture-codes/ ). Could someone give me some help? Thank you. ...
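A minimal sketch of a shape check for the codes listed in the question, assuming the form is a two-letter lowercase language, a two-letter uppercase region, and an optional four-letter script suffix such as Cyrl. This only checks the format; it does not confirm that a code actually exists, and the function name is hypothetical.

```python
import re

# Assumed shape: ll-RR or ll-RR-Ssss (e.g. en-GB, az-AZ-Cyrl).
CULTURE_RE = re.compile(r"[a-z]{2}-[A-Z]{2}(?:-[A-Z][a-z]{3})?")

def looks_like_culture_code(code):
    """Return True if code has the assumed culture-code shape."""
    return CULTURE_RE.fullmatch(code) is not None
```

Checking against a known list of codes would be stricter than any pattern, since a format check also accepts well-formed codes that are not real cultures.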

Better to use regex or Stringtokenizer to find author and book title in this: William Faulkner - 'Light In August'

Is it better to use regex or Stringtokenizer to separate the author and title in this string: William Faulkner - 'Light In August' Is this the simplest regex that would work? Pattern pattern = Pattern.compile("^\\s*([^-]+)-.*$"); Matcher matcher = pattern.matcher("William Faulkner - 'Light In August'"); String author = matcher.group...

Clojure: get list of regex matches

Perhaps I'm going about this all wrong, but I'm trying to get all the matches in a string for a particular regex pattern. I'm using re-matcher to get a Match object, which I pass to re-find, giving me (full-string-match, grouped-text) pairs. How would I get a sequence of all the matches produced by the Match object? In Clojuresque Pytho...

Does this regex have one or two groups? "^\\s*(.*?)\\s+-\\s+'(.*)'\\s*$"

Does this regex have one or two groups? I'm trying to access the bookTitle using the second group but getting an error: Pattern pattern = Pattern.compile("^\\s*(.*?)\\s+-\\s+'(.*)'\\s*$"); Matcher matcher = pattern.matcher("William Faulkner - 'Light In August'"); String author = matcher.group(1).trim(); String bookTitle = matcher.group...
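The pattern in the question has two capturing groups. A quick way to confirm this is to compile the same pattern and inspect it; the sketch below uses Python's re module rather than Java, so the escaping differs (a raw string instead of doubled backslashes).

```python
import re

# Same pattern as in the question, written as a Python raw string.
BOOK_RE = re.compile(r"^\s*(.*?)\s+-\s+'(.*)'\s*$")

def split_author_title(line):
    """Return (author, title), or None if the line does not match."""
    match = BOOK_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Number of capturing groups in the compiled pattern.
GROUP_COUNT = BOOK_RE.groups
```

In both languages a group can only be read after a match attempt has succeeded, so checking the match result before calling group() is worth doing.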

extract part of file name

If I have a string in the following format: location-cityName.xml how do I extract only the cityName, i.e. a word between - (dash) and . (period)? ...
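A short sketch of one possible approach: capture the run of characters between a dash and the following period. The function name is hypothetical, and the sketch assumes the city name itself contains no dash or period.

```python
import re

# Capture everything between "-" and the next ".", excluding both.
CITY_RE = re.compile(r"-([^-.]+)\.")

def extract_city(file_name):
    """Return the text between the dash and the period, or None."""
    match = CITY_RE.search(file_name)
    return match.group(1) if match else None
```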

How can I change my regex to handle data outliers?

Ideally all of my data would look like this: William Faulkner - 'Light In August' William Faulkner - 'Sanctuary' William Faulkner - 'The Sound and the Fury' In that case, this regex would seem to work fine: Pattern pattern = Pattern.compile("^\\s*(.*)\\s+-\\s+'(.*)'\\s*$"); Matcher matcher = pattern.matcher("William Faulkner - 'Light...

How can I ensure that my Python regular expression outputs a dictionary?

I'm using Beej's Python Flickr API to ask Flickr for JSON. The unparsed string Flickr returns looks like this: jsonFlickrApi({'photos': 'example'}) I want to access the returned data as a dictionary, so I have: photos = "jsonFlickrApi({'photos': 'test'})" # to match {'photos': 'example'} response_parser = re.compile(r'jsonFlickrApi\...
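One possible approach is to strip the jsonFlickrApi(...) wrapper with a regex and hand the rest to a JSON parser. The sketch assumes the real payload is valid JSON with double-quoted strings; the single-quoted sample shown in the question would not parse with json.loads. The function name is hypothetical.

```python
import json
import re

# Matches the wrapper and captures the payload; DOTALL lets it span lines.
WRAPPER_RE = re.compile(r"^\s*jsonFlickrApi\((.*)\)\s*$", re.DOTALL)

def parse_flickr_response(raw):
    """Return the wrapped JSON payload as a Python object."""
    match = WRAPPER_RE.match(raw)
    if match is None:
        raise ValueError("not a jsonFlickrApi(...) response")
    return json.loads(match.group(1))
```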

Negating Alternation In Regular Expressions

I can use "Alternation" in a regular expression to match any occurrence of "cat" or "dog" thusly: (cat|dog) Is it possible to NEGATE this alternation, and match anything that is NOT "cat" or "dog"? If so, how? For Example: Let's say I'm trying to match END OF SENTENCE in English, in an approximate way. To Wit: (\.)(\s+[A-Z][^.]|\s...
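A common way to express "not cat or dog" is a negative lookahead placed in front of a more general pattern. The sketch below applies it to whole words only; it is an illustration of the lookahead idea, not a solution to the sentence-ending pattern, which is truncated in the question.

```python
import re

# Match any word, unless the word is exactly "cat" or "dog".
NOT_CAT_DOG_RE = re.compile(r"\b(?!cat\b|dog\b)\w+")

def words_except_cat_dog(text):
    """Return all words in text other than 'cat' and 'dog'."""
    return NOT_CAT_DOG_RE.findall(text)
```

The \b inside the lookahead keeps words that merely start with "cat" or "dog", such as "catalog", from being excluded.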

validate this format - "HH:MM"

Hi All, I'm trying to use preg_match to validate that a time input is in this format - "HH:MM" ...
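A minimal sketch of the check, written in Python rather than PHP. It assumes a 24-hour clock with a mandatory leading zero (00:00 through 23:59); the question does not say which convention is wanted, and the same pattern should carry over to preg_match with ^ and $ anchors.

```python
import re

# Hours 00-23, a colon, minutes 00-59.
TIME_RE = re.compile(r"(?:[01][0-9]|2[0-3]):[0-5][0-9]")

def is_hh_mm(value):
    """Return True if value is a 24-hour time in HH:MM form."""
    return TIME_RE.fullmatch(value) is not None
```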

C# Retrieving First Instance of Ambiguous Pattern

I have a string that I want to parse using regex. It has the following format: "random text [id value] more text [id value] other stuff" I would like to find the pattern that will match [id value], brackets included. Do I have to do anything special to return two matches instead of one match? My concern is that I will only return th...
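The usual cause of getting one long match instead of two is a greedy .* between the brackets. A sketch of one way around it, shown in Python rather than C#, is to match "anything except a closing bracket" so each match stops at the first ]:

```python
import re

# A bracketed run that cannot cross a closing bracket.
BRACKET_RE = re.compile(r"\[[^\]]*\]")

def find_bracketed(text):
    """Return every [..] segment, brackets included."""
    return BRACKET_RE.findall(text)
```

A lazy quantifier, \[.*?\], is an alternative with the same result on this input.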

C# Regex Ignoring Escaped Character

I have a string like this that is delimited by | and can contain any character in between: "one two|three four five|six \| seven eight|nine" I'd like to find a regex that returns: one two three four five six | seven eight nine I can think about how I want to do this, but I don't know regex well enough. I basically want to match until...
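One possible approach, sketched in Python rather than C#, is to split on a pipe that is not preceded by a backslash (a negative lookbehind) and then unescape the remaining \| sequences. The sketch assumes a backslash is only ever used to escape a pipe; an escaped backslash directly before a delimiter is not handled.

```python
import re

# A pipe that is not immediately preceded by a backslash.
UNESCAPED_PIPE_RE = re.compile(r"(?<!\\)\|")

def split_fields(line):
    """Split on unescaped pipes, then turn each \\| back into |."""
    return [part.replace("\\|", "|") for part in UNESCAPED_PIPE_RE.split(line)]
```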

Cucumber Step Definition Regex Exclude Help

I need some help with a regular expression in a Cucumber step definition file. Many of my steps are of the type: Given I am on the search page I use this general pattern for most of my step definitions, and use the default Webrat regex to pick it up that looks like this: Given /^(?:|I )am on (.+)$/ do |page_name| visit path_to(p...

PHP RegEx Match to end of string

Hi all, Still learning PHP Regex and have a question. If my string is Size : 93743 bytes Time elapsed (hh:mm:ss.ms): 00:00:00.156 How do I match the value that appears after the (hh:mm:ss.ms):? 00:00:00.156 I know how to match if there are more characters following the value, but there aren't any more characters after it and I d...
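A sketch of one way to capture a value that runs to the end of the string, written in Python rather than PHP: anchor on the literal label, then capture non-whitespace up to $. The function name is hypothetical.

```python
import re

# Literal label, optional spaces, then the value up to the end of the string.
ELAPSED_RE = re.compile(r"\(hh:mm:ss\.ms\):\s*(\S+)\s*$")

def elapsed_time(line):
    """Return the value after '(hh:mm:ss.ms):', or None if absent."""
    match = ELAPSED_RE.search(line)
    return match.group(1) if match else None
```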

Making Regular Expression more efficient

I'm attempting to determine the end of an English sentence (only approximately), by looking for "!", "?" or ".", but in the case of "." only when not preceded by common abbreviations such as Mr. or Dr. Is there any way to make the following Regular Expression even marginally more efficient? Perhaps by sorting the negative lookbehinds ...

Regex for finding number of times substring "hello hello" occurs in string "hello hello hello"

I want to count the occurrences of a substring within a string. My string is "hello hello hello". I want to get the number of times "hello hello" occurs in it, which in the above case is 2. Can someone please help me find a regex for it? ...
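Ordinary matching consumes the text it matches, so it would find "hello hello" only once here. A sketch of one way to count overlapping occurrences is to wrap the substring in a lookahead, which matches at a position without consuming anything:

```python
import re

def count_overlapping(substring, text):
    """Count occurrences of substring in text, overlaps included."""
    # The lookahead matches the empty string at each start position.
    pattern = "(?=" + re.escape(substring) + ")"
    return len(re.findall(pattern, text))
```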

C# Regex, Either Or

I have a string that I parse with regex: "one [two] three [four] five" I have a regex that extracts the bracketed text into <bracket>, but now I want to add the other stuff (one, three, five) into <text>, but I want there to be separate matches. So either it is a match for <text> or a match for <bracket>. Is this possible using regex? ...
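This is possible with an alternation of two named groups, so that each match fills exactly one of them. The sketch below is in Python rather than C#, and it assumes the bracket group should hold the text inside the brackets and that surrounding whitespace can be trimmed; the function name is hypothetical.

```python
import re

# Each match is either a bracketed segment or a run of non-bracket text.
TOKEN_RE = re.compile(r"\[(?P<bracket>[^\]]*)\]|(?P<text>[^\[\]]+)")

def tokenize(line):
    """Return (group_name, value) pairs in order of appearance."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind = match.lastgroup  # name of the group that matched
        value = match.group(kind).strip()
        if value:
            tokens.append((kind, value))
    return tokens
```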

Using Python's re to swap case.

I'm using the re library to normalize some text. One of the things I want to do is replace all uppercase letters in a string with their lower case equivalents. What is the easiest way to do this? ...
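For plain lowercasing, the built-in str.lower() is likely the simplest route and needs no regex. If the change has to go through re, for example as one step in a chain of substitutions, re.sub accepts a function as the replacement. A sketch of both, with hypothetical function names:

```python
import re

def lower_plain(text):
    """Lowercase with the string method; no regex involved."""
    return text.lower()

def lower_with_re(text):
    """Lowercase ASCII capitals via re.sub with a replacement function."""
    return re.sub(r"[A-Z]", lambda m: m.group().lower(), text)
```

The regex version as written only touches A through Z, while str.lower() also handles non-ASCII letters.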

vim replace character to \n

I need to replace every ; with \n, but :%s/;/\n/gc does not work ...

Extract content from each first TD in a Table

I've got some HTML that looks like this: <tr class="row-even"> <td align="center">abcde</td> <td align="center"><a href="deluserconfirm.html?user=abcde"><img src="../images/delete_x.gif" alt="Delete User" border="none" /></a></td> </tr> <tr class="row-odd"> <td align="center">efgh</td> <td align="center"><a href="deluser...