regex

A more efficient Regex or alternative?

I have a file with a little over a million lines. {<uri::rdfserver#null> <uri::d41d8cd98f00b204e9800998ecf8427e> <uri::TickerDailyPriceVolume> "693702"^^<xsd:long>} {<uri::rdfserver#null> <uri::d41d8cd98f00b204e9800998ecf8427e> <uri::TickerDailyPriceId> <uri::20fb8f7d-30ef-dd11-a78d-001f29e570a8>} Each line is a statement. struct S...

Alternatives to Regular Expressions

I have a set of strings with numbers embedded in them. They look something like /cal/long/3/4/145:999 or /pa/metrics/CosmicRay/24:4:bgp:EnergyKurtosis. I'd like to have an expression parser that is Easy to use. Given a few examples someone should be able to form a new expression. I want end users to be able to form new expressions ...

Parse filter expression using RegEx

I have a query filter written in human readable language. I need to parse it and convert to the SQL where clause. Examples are: CustomerName Starts With 'J' becomes CustomerName LIKE 'J%' and CustomerName Includes 'Smi' becomes CustomerName LIKE '%Smi%' The full expression to be parsed may be much more complicated such as C...
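
As a rough illustration of the two examples quoted above, here is a minimal Python sketch that rewrites a single clause. The names `LIKE_TEMPLATES`, `CLAUSE` and `to_like` are hypothetical, only the two operators shown in the question are handled, and the more complicated compound expressions the question mentions are out of scope.

```python
import re

# Hypothetical operator table: human-readable operator -> LIKE pattern template.
LIKE_TEMPLATES = {
    "starts with": "{}%",
    "includes": "%{}%",
}

# One clause: <column> <operator> '<value>'
CLAUSE = re.compile(
    r"^\s*(?P<column>\w+)\s+(?P<op>Starts With|Includes)\s+'(?P<value>[^']*)'\s*$",
    re.IGNORECASE,
)


def to_like(clause):
    """Translate one filter clause into a SQL LIKE comparison."""
    m = CLAUSE.match(clause)
    if m is None:
        raise ValueError("unrecognised clause: %r" % clause)
    template = LIKE_TEMPLATES[m.group("op").lower()]
    return "%s LIKE '%s'" % (m.group("column"), template.format(m.group("value")))


print(to_like("CustomerName Starts With 'J'"))
print(to_like("CustomerName Includes 'Smi'"))
```

The sketch does not escape `%`, `_` or quote characters in the value, so real code would want to bind the pattern as a query parameter rather than splice it into the SQL text.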

Need help fixing a regular expression

I'm using the Java matcher to try and match the following: @tag TYPE_WITH_POSSIBLE_SUBTYPE -PARNAME1=PARVALUE1 -PARNAME2=PARVALUE2: MESSAGE The TYPE_WITH_POSSIBLE_SUBTYPE consists of letters with periods. Every parameter has to consist of letters, and every value has to consist of numerics/letters. There can be 0 or more parameters. ...

Help building a regular expression in python using the re module

Hi guys, I'm writing a simple propositional logic formula parser in Python which uses the regular expressions re module and the lex/yacc module for lexing/parsing. Originally my code could pick out implication as ->, but adding logical equivalence (<->) caused issues with the compiled expressions IMPLICATION = re.compile('[\s]*\-\>[\s]*') EQ...
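
The excerpt is cut off before the second pattern, so the following is only a sketch under an assumption: that the trouble comes from `->` also matching the tail of `<->`. The name `EQUIVALENCE` and the helper `connectives` are hypothetical. Two common ways around that overlap are a negative lookbehind on the implication pattern, or a single alternation that lists the longer operator first.

```python
import re

# Negative lookbehind: "->" only counts when it is not preceded by "<".
IMPLICATION = re.compile(r"\s*(?<!<)->\s*")
# Hypothetical name; the original pattern is truncated in the excerpt.
EQUIVALENCE = re.compile(r"\s*<->\s*")

# Alternative: one token pattern with the longer operator listed first.
CONNECTIVE = re.compile(r"<->|->")


def connectives(formula):
    """Return the connectives in the order they appear in the formula."""
    return CONNECTIVE.findall(formula)


print(connectives("p <-> q -> r"))
print(IMPLICATION.search("p <-> q"))
```

With lex/yacc-style lexers the order in which token rules are tried also matters, so how this maps onto the actual lexer setup depends on the library in use.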

RegEx: HTML whitelist

Being weak on regular expressions, I've been working with them to improve. One concept I've been trying to do is to remove all HTML elements except for a list of allowed ones. I've managed to do the reverse -- remove a specified list of elements: <\/?(strong|em|a)[^>]*> However I want the opposite, and remove every element but. ...

mod_rewrite regex to match only if a certain string does NOT exist

Looking through my server logs, I see that a lot of pages on my site are requesting favicon.ico, favicon.jpg, favicon.png, etc in a variety of different directories. Instead of wading through each page to try to figure out where each incorrect request is coming from, I'm writing some apache redirect rules to change a request for any url...

easy to use Regex creator tool?

I tried some regex tools like Regulator, Regulazy & RegexBuddy. They don't do what I want and they expect the user to know regular expressions. I want a tool for dummies. You tell the tool I need a regex for something like "match anything that ends with the word 'yes' and it contains at least one occurrence of the phrase '/test/'" and i...

PHP Regex Problem - /i caseless not working and weird empty array elements.

Source texts (7): give 4 cars ga 5 cars GA 5 Cars @mustang six exhausts are necessary Give -1 Cars @mustang Give Cars @mustang Give 3 Cars @ford Give 5 Cars @cobra_gt The ones which should be successful are 1,2,3,6,7 preg_match('/Give (\d+) Cars @(\w+)|GA (\d+) Cars @(\w+)/i', $a->text, $output); print_r($output); produces: Arra...

Python regex: Turn "ThisFileName.txt" into "This File Name.txt"

I'm trying to add a space before every capital letter, except the first one. Here's what I have so far, and the output I'm getting: >>> tex = "ThisFileName.txt" >>> re.sub('[A-Z].', ' ', tex) ' his ile ame.txt' I want: 'This File Name.txt' (It'd be nice if I could also get rid of .txt, but I can do that in a separate operation.) ...
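
One possible approach, sketched here rather than offered as the definitive answer: the attempt above replaces the matched text itself, so the capitals are lost. Zero-width lookarounds match the position between a lowercase and an uppercase letter, so nothing is consumed. The helper name `space_out` is hypothetical, and `os.path.splitext` is used to keep the extension out of the substitution.

```python
import os
import re


def space_out(filename):
    """Insert a space at each lowercase-to-uppercase boundary in the stem."""
    stem, ext = os.path.splitext(filename)
    # (?<=[a-z]) and (?=[A-Z]) are zero-width: only the gap between letters matches.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", stem)
    return spaced + ext


print(space_out("ThisFileName.txt"))
```

Returning `spaced` alone instead of `spaced + ext` would drop the `.txt` in the same step. Runs of consecutive capitals (for example an acronym) are left untouched by this pattern.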

Simple PHP form Validation and the validation symbols

Hi, I have some forms that I want to use some basic PHP validation (regular expressions) on. How do you go about doing it? I have just general text input, usernames, passwords and dates to validate. I would also like to know how to check for empty input boxes. I have looked on the internet for this stuff but I haven't found any good tutoria...

JavaScript RegExp objects

Hi! I try to write a simple Markdown parser in JavaScript. Therefore I want to check for the [link content][link id] syntax. I use the following code: data = data.replace( /\[(.*?)\][ ]*\[([0-9]+)\]/g, '<a href="$2">$1</a>' ); It works well, but now I want to do this with a RegExp object. So I set up the following bit of code: var r...

Extracting nested function names from a JavaScript function

Given a function, I'm trying to find out the names of the nested functions in it (only one level deep). A simple regex against toString() worked until I started using functions with comments in them. It turns out that some browsers store parts of the raw source while others reconstruct the source from what's compiled; The output of toSt...

What is the optimal way to replace a series of characters in a string in JavaScript

I am working to improve performance of a function that takes an XML string and replaces certain characters (encoding) before returning the string. The function gets pounded, so it is important to run as quickly as possible. The USUAL case is that none of the characters are present - so I would like to especially optimize for that. As ...

Regular Expressions Tutorials

Does anyone know of a regular expression tutorial that doesn't use a designer? ...

What would be the best way to extract the host portion of a url with regexp?

I'm extracting the host from my url and am getting jammed up by making the last / optional. the regexp needs to be prepared to receive the following: http://a.b.com:8080/some/path/file.txt or ftp://a.b.com:8080/some/path or ftp://[email protected]/some/path or http://a.b.com or a.b.com/some/path and return a.b.com so... (ftp://|http://)...
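
A minimal Python sketch of one way to make every part around the host optional. The pattern name `HOST`, the helper `extract_host` and the `user@` test input are assumptions for illustration, not taken from the question. The trick is to stop the host at the first `/` or `:` instead of requiring a trailing slash.

```python
import re

HOST = re.compile(
    r"""
    ^(?:[a-z][a-z0-9+.-]*://)?   # optional scheme such as http:// or ftp://
    (?:[^/@]+@)?                 # optional user info ending in @
    (?P<host>[^/:]+)             # host: everything up to the first / or :
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_host(url):
    """Return the host portion of url, or None if nothing matches."""
    m = HOST.match(url)
    return m.group("host") if m else None


for url in (
    "http://a.b.com:8080/some/path/file.txt",
    "ftp://a.b.com:8080/some/path",
    "ftp://user@a.b.com/some/path",  # hypothetical user name
    "http://a.b.com",
    "a.b.com/some/path",
):
    print(extract_host(url))
```

Because the host group excludes `/` and `:`, the port and path never need to be described by the pattern at all.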

Return first match of Ruby regex

I'm looking for a way to perform a regex match on a string in Ruby and have it short-circuit on the first match. The string I'm processing is long and from what it looks like the standard way (match method) would process the whole thing, collect each match, and return a MatchData object containing all matches. match = string.match(/reg...

How important is knowing Regexs?

My personal experience is that regexs solve problems that can't be efficiently solved any other way, and are so frequently required in a world where strings are as important as they are that not having a firm grasp of the subject would be sufficient reason for me to consider not hiring you as a senior programmer (a junior is always allow...

What's the cleanest way to extract URLs from a string using Python?

Hi all. Although I know I could use some hugeass regex such as the one posted here, I'm wondering if there is some tweaky as hell way to do this either with a standard module or perhaps some third-party add-on? Simple question, but nothing jumped out on Google (or Stackoverflow). Look forward to seeing how y'all do this! Jamie ...

Is there a regular expression to find two different words in a sentence?

Is there a regular expression to find two different words in a sentence? Extra credit for an expression that works in MS Visual Studio 2008 :) For example: reg_ex_match(A, B, "A sentence with A and B") = true reg_ex_match(C, D, "A sentence with A and B") = false See also this related question ...
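
A sketch of the general technique in Python, assuming "words" means whole words in any order: two lookaheads anchored at the start each scan the sentence for one of the words. The function name follows the pseudo-call in the question. The Visual Studio 2008 find dialog uses its own regex dialect, which this example does not cover.

```python
import re


def reg_ex_match(first, second, sentence):
    """True if both words occur somewhere in the sentence, in any order."""
    pattern = r"^(?=.*\b%s\b)(?=.*\b%s\b)" % (re.escape(first), re.escape(second))
    # DOTALL lets ".*" cross newlines in multi-line input.
    return re.search(pattern, sentence, re.DOTALL) is not None


print(reg_ex_match("A", "B", "A sentence with A and B"))
print(reg_ex_match("C", "D", "A sentence with A and B"))
```

Each lookahead is zero-width, so both checks start from the same position and neither constrains where the other word appears.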