regex

Improve performance of a regexp

My software allows users to use regexps to prepare files. I am in the process of adding a default regexp library with common expressions that can be reused to prepare a variety of formats. One common task is to remove CRLF in specific parts of the files, but not in others. For instance, this: <TU>Lorem Ipsum</TU> <SOURCE>T...

Extract all email addresses from some .txt documents using Ruby

Hi all, I have to extract all email addresses from some .txt documents. These emails may have these formats: [email protected], {a, b, c}@abc.edu, and some other formats including some @ signs. I chose Ruby as my first language to write this program, but I don't know how to write the regex. Would someone help me? Thank you! ...

RegExp for matching three letters, but not text "BUY"

Hi all! I have two buttons on a form: one contains a currency code (EUR, USD, GBP, CHF, ...) and the other contains a trade direction (BUY or SELL). A utility recognizes the buttons by their text. To recognize the button with currencies, I use the regular expression ":[A-Z]{3}", but it doesn't work properly when the second button contains the text "BUY"...
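
One common way to express "three uppercase letters, but not BUY" is a negative lookahead. The sketch below is in Python and only illustrates the idea; the function name `is_currency_code` is hypothetical, the leading colon from the asker's pattern is left out because the utility's syntax is not shown, and it assumes the whole button text is being tested.

```python
import re

# Three uppercase letters, rejected up front if they spell "BUY".
# Assumption: the entire button text is tested, hence fullmatch().
CURRENCY_CODE = re.compile(r"(?!BUY)[A-Z]{3}")


def is_currency_code(text: str) -> bool:
    """Return True for three-letter codes such as EUR, False for BUY or SELL."""
    return CURRENCY_CODE.fullmatch(text) is not None
```

Because the whole text must match, "SELL" is rejected on length alone, and the lookahead only has to exclude "BUY".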

Regular Expression help

I apologize, but I'm terrible at understanding regular expressions. Could anyone help me with a simple problem? I need one regex to match everything before and one to match everything after a certain character, like a colon. foo:bar Something to match 'foo' and something to match 'bar'. ...
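
A minimal Python sketch of the two expressions, assuming the separator is the first colon on a single line. The names `BEFORE`, `AFTER`, and `split_on_colon` are made up for illustration.

```python
import re

# Everything before the first colon: a run of non-colon characters from the start.
BEFORE = re.compile(r"^[^:]*")
# Everything after the first colon: a lookbehind finds the colon without consuming it.
AFTER = re.compile(r"(?<=:).*$")


def split_on_colon(text):
    """Return (before, after); 'after' is None when the text has no colon."""
    before = BEFORE.search(text).group(0)
    after_match = AFTER.search(text)
    return before, (after_match.group(0) if after_match else None)
```

For the input `foo:bar` this yields `foo` and `bar`. When the text contains no colon, the first pattern matches the whole string.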

Validate class/method names with regex

Heya guys. I'm currently working on an MVC-style framework for a company, and for security reasons I need to make sure that the controller/method passed via the query string contains only valid characters according to the RFC (which I can't find). I need to be able to validate/sanitize class names according to what's allowed by the PHP interpreter Fo...

Regular expression generator/reducer?

A colleague posed an interesting question about an operational pain point we currently have, and I am curious whether there's anything out there (utility/library/algorithm) that might help automate this. Say you have a list of literal values (in our case, they are URLs). What we want to do is, based on this list, come up with a sing...

Modifying captured tokens from a regular expression?

Using a general regular expression replacement (I'm doing this through TextMate), is it possible to modify a captured token? I've essentially got a handful of enums that I want to modify... CONSTANT get { return 1; } CONSTANT get { return 2; } CONSTANT get { return 3; } What I'd like to do is capture the "return x"... return ...

JavaScript/jQuery string replace

Hi, I have the following string, which is a CSS selector: #downloads > ul > li:nth-of-type(1) > ul > li:nth-of-type(3) > a This CSS selector works fine in Firefox, Chrome and Safari, but IE 6 does not support the nth-of-type selector. The CSS selectors I am working with are generated by Nokogiri and I cannot change them. Through testi...

Newbie to RegEx..

I have this sample string: &Lt;! [If Gte Mso 9]>&Lt;Xml> &Lt;Br /> &Lt;O:Office Document Settings> &Lt;Br /> &Lt;O:Allow Png/> &Lt;Br /> &Lt;/O:Off... And I would like to target specifically anything that begins with an "" and ends with a ">", and replace it with an empty string "". I've been using Rubular, but I'm having a tricky time learning ...

How to use regex to negate a string

Hi, can anybody tell me how to use a regex to negate a string? I want to find all lines that start with "public class", followed by anything except "first" or "second", and finally anything else. For example, in the results I expect to see public class base but not public class myfirst:base. Can anybody help me, please? ...

Pattern matching problem in C#

Hi there. I have a string like "AAA 101 B202 C 303 " and I want to get rid of the space between a letter and a number, if there is one. So after the operation, the string should be like "AAA101 B202 C303 ". But I am not sure whether a regex could do this. Any help? Thanks in advance. ...
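
A regex can do this with a lookbehind and a lookahead around the spaces. The sketch below is in Python rather than C#, purely to illustrate the pattern; the function name is hypothetical, and it assumes "char" means an ASCII letter.

```python
import re

# Spaces that sit between a letter (behind) and a digit (ahead).
# Neither neighbor is consumed, so only the spaces are replaced.
LETTER_SPACE_DIGIT = re.compile(r"(?<=[A-Za-z]) +(?=\d)")


def remove_letter_digit_space(text: str) -> str:
    """Delete spaces between a letter and a following number."""
    return LETTER_SPACE_DIGIT.sub("", text)
```

Spaces that follow a digit, such as the one between "101" and "B202", are left alone because the lookbehind requires a letter.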

Regular expression not matching what I think it should...

In python, I'm compiling a regular expression pattern like so: rule_remark_pattern = re.compile('access-list shc-[(in)(out)] [(remark)(extended)].*') I would expect it to match any of the following lines: access-list shc-in remark C883101 Permit http from UPHC outside to Printers inside access-list shc-in extended permit tcp object-g...
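
In Python's `re` syntax, square brackets define a character class, so `[(in)(out)]` matches exactly one character from the set `(`, `)`, `i`, `n`, `o`, `u`, `t`. A sketch of the likely intent, using non-capturing groups with alternation, and assuming the two alternatives are the whole words in/out and remark/extended:

```python
import re

# Pattern as written in the question: each [...] matches a single character.
original_pattern = re.compile('access-list shc-[(in)(out)] [(remark)(extended)].*')

# Alternation inside non-capturing groups matches the whole words instead.
rule_remark_pattern = re.compile(r'access-list shc-(?:in|out) (?:remark|extended).*')

# First sample line quoted in the question.
line = ('access-list shc-in remark C883101 Permit http '
        'from UPHC outside to Printers inside')

original_matches = original_pattern.match(line) is not None
fixed_matches = rule_remark_pattern.match(line) is not None
```

With the original pattern, the class consumes only the `i` of `in`, and the following literal space then fails against `n`.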

How to match and capture a regular expression in C#

I've got a regular expression like this: something (.*) someotherthing How do I use a Regex object to match the entire expression and get the value of the capture? ...

Allow literal . in regular expression

^[a-zA-Z]:{1}/(\w+/)+$ I want to allow . as well in the \w+ part of the expression. How can I do this? ...
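
One way to allow a literal dot is to put `\w` and `.` in the same character class, where the dot needs no escaping. A Python sketch under that assumption (the helper name is hypothetical, and `{1}` is dropped because it is redundant):

```python
import re

# [\w.]+ accepts word characters and literal dots in each path segment.
PATH = re.compile(r"^[a-zA-Z]:/([\w.]+/)+$")


def is_valid_path(text: str) -> bool:
    """Check a drive-letter path whose segments may contain dots."""
    return PATH.match(text) is not None
```

Note that this also accepts segments made only of dots, such as `../`, which may or may not be wanted.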

Regular expression to search for a string1 that is never followed by string2

How to construct a regular expression search pattern to find string1 that is not followed by string2 (immediately or not)? For instance, if string1="MAN" and string2="PN", example search results would be: "M": Not found "MA": Not found "MAN": Found "BLAH_MAN_BLEH": Found "MAN_PN": Not found "BLAH_MAN_BLEH_PN": Not found "BLAH_MAN_B...
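
A negative lookahead containing `.*` can express "not followed, immediately or later". The Python sketch below is one possible reading of the requirement; the function name and its defaults are illustrative only.

```python
import re


def found_not_followed(text: str, string1: str = "MAN", string2: str = "PN") -> bool:
    """True if some occurrence of string1 has no string2 anywhere after it."""
    # re.escape keeps both strings literal; DOTALL lets .* cross line breaks.
    pattern = re.compile(
        re.escape(string1) + r"(?!.*" + re.escape(string2) + r")",
        re.DOTALL,
    )
    return pattern.search(text) is not None
```

Under this reading, a text in which string1 appears again after the last string2 counts as found, which may or may not match the intended behavior.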

Optimize "tagging" regex

I currently use this piece of code to reduce a given text to a valid "tagging" format (only lowercase, a-z and minus allowed) by removing/replacing invalid characters $zip_filename = strtolower($original); $zip_filename = preg_replace("/[^a-zA-Z\-]/g", '-', $zip_filename); //replace invalid chars $zip_filename = ...

VB.Net Regex Help

I've got 3 or 4 patterns that I'm comparing user input against and I need to figure out if the user input matches one of the patterns and to return the match if it does. Since the input is multiline, I'm passing each individual line like so: Dim strRawInput() As String = Split(txtInput.Text, vbCrLf) Dim strInput As String t...

Matching a string that starts with a specific character.

I am trying to get some Regex to match any string that begins with ~ and ends with a space or the end of the line. It's part of a Wiki Converter I'm cobbling together... I need to wrap anything that starts with ~ up to the next space (or EOL) in tags. Example strings are: "~Test" // matches Test "~----" // matches ----...

C# regular expression to strip all but alphabetical and numerical characters from a string?

I've been scratching my head trying to figure out how to use Regex.Replace to take an arbitrary string and return a string that consists of only the alpha-numeric characters of the original string (all white space and punctuation removed). Any ideas? ...
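
The usual approach is a negated character class that matches everything that is not a letter or digit and replaces it with an empty string. The sketch below shows the pattern in Python rather than C#, assuming "alpha-numeric" means ASCII letters and digits; the function name is hypothetical.

```python
import re

# Any single character that is not an ASCII letter or digit.
NON_ALPHANUMERIC = re.compile(r"[^A-Za-z0-9]")


def strip_non_alphanumeric(text: str) -> str:
    """Return only the letters and digits of the input, in order."""
    return NON_ALPHANUMERIC.sub("", text)
```

The same character class can be passed to Regex.Replace in .NET together with an empty replacement string.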

Regex to match Domain.CCTLD

Does anyone know a regular expression to match Domain.CCTLD? I don't want subdomains, only the "atomic domain". For example, docs.google.com doesn't get matched, but google.com does. However, this gets complicated with stuff like .co.uk, CCTLDs. Does anyone know a solution? Thanks in advance. EDIT: I've realized I also have to deal with...