regex

Find Hyperlinks in Text using Python (twitter related)

How can I parse text and find all instances of hyperlinks within a string? The hyperlink will not be in HTML anchor format, just the bare URL, e.g. http://test.com. Secondly, I would like to then convert the original string and replace all instances of hyperlinks with clickable HTML hyperlinks. I found an example in this thread: http://stackoverflo...
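
One possible approach, sketched under assumptions: the pattern below only recognises bare `http://` and `https://` URLs that run up to the next whitespace, quote, or angle bracket, and it treats trailing sentence punctuation as not part of the link. The names `URL_RE`, `find_links`, and `linkify` are hypothetical, not taken from the question or the linked thread.

```python
import html
import re

# Assumed URL shape: scheme, then anything up to whitespace, quotes or angle brackets.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

# Punctuation that usually belongs to the sentence, not to the URL (an assumption).
TRAILING = ".,;:!?)"


def _iter_urls(text):
    """Yield (start, end, url) for each bare URL, without trailing punctuation."""
    for match in URL_RE.finditer(text):
        url = match.group(0).rstrip(TRAILING)
        yield match.start(), match.start() + len(url), url


def find_links(text):
    """Return every bare URL found in the text, in order."""
    return [url for _, _, url in _iter_urls(text)]


def linkify(text):
    """Escape the text for HTML and wrap each bare URL in an anchor tag."""
    parts = []
    last = 0
    for start, end, url in _iter_urls(text):
        parts.append(html.escape(text[last:start]))
        parts.append('<a href="{0}">{1}</a>'.format(
            html.escape(url, quote=True), html.escape(url)))
        last = end
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Escaping the surrounding text as well as the URL keeps user-supplied content (tweets, for instance) from injecting markup into the generated HTML.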

In VB.NET, Way to RegEx match a Number without a prefix tag?

1 <span class='Txt9Gray'>Decisions ( </span> I'm trying to grab the '1' from this string. Before the '1' is another span, but I can't use that as a marker because it can change from page to page. Is there any regex expression that can simply grab the '1'. The word 'Decisions' will always exist. That's my main way to find this lin...

Standardized email regex

Why is there not a standardized email regex? I was recently involved in a project where we had a hiccup: the email passed our email regex but failed when creating the MailMessage object. A small error, but it had rather big consequences. Is the MailMessage constructor using an email regex when checking if an em...

Insert commas into number string

Hey there, I'm trying to perform a backwards regular expression search on a string to divide it into groups of 3 digits. As far as I can see from the AS3 documentation, searching backwards is not possible in the regex engine. The point of this exercise is to insert triplet commas into a number like so: 10000000 => 10,000,000 I'm th...
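
A backwards search is not strictly needed for this: a lookahead can count digits to the end of the string instead. The sketch below is in Python rather than AS3, and assumes the input is a plain string of digits with no sign or decimal part; `add_commas` is a hypothetical name.

```python
import re

# Match the empty position that has a digit before it and a whole number of
# three-digit groups between it and the end of the string.
GROUP_BOUNDARY = re.compile(r"(?<=\d)(?=(?:\d{3})+$)")


def add_commas(digits):
    """Insert a comma at every three-digit boundary, counting from the right."""
    return GROUP_BOUNDARY.sub(",", digits)


print(add_commas("10000000"))  # 10,000,000
```

The lookahead part alone, `(?=(?:\d{3})+$)`, plus a check that the match is not at position 0, should carry over to engines that lack lookbehind. In Python specifically, `format(10000000, ",")` gives the same result without a regex.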

Replacing Tags with Includes in PHP with RegExps

I need to read a string, detect a {VAR}, and then do a file_get_contents('VAR.php') in place of {VAR}. The "VAR" can be named anything, like TEST, or CONTACT-FORM, etc. I don't want to know what VAR is -- not to do a hard-coded condition, but to just see an uppercase alphanumeric tag surrounded by curly braces and just do a file_get_cont...
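
The general technique here is a substitution with a callback: the pattern captures the tag name and a function decides what replaces it. A minimal Python sketch of that idea follows (the question itself is about PHP). It assumes tags consist of uppercase letters, digits, and hyphens; `expand_tags`, `load`, and `base_dir` are hypothetical names.

```python
import re
from pathlib import Path

# Assumed tag shape: {NAME} where NAME is uppercase letters, digits or hyphens.
TAG_RE = re.compile(r"\{([A-Z0-9-]+)\}")


def expand_tags(text, load=None, base_dir="."):
    """Replace each {NAME} tag with whatever load(NAME) returns.

    By default, load reads NAME.php from base_dir, mirroring the
    file_get_contents('VAR.php') call described in the question.
    """
    if load is None:
        def load(name):
            return (Path(base_dir) / (name + ".php")).read_text()
    # A function replacement is inserted literally, so backslashes in the
    # loaded content are not interpreted as group references.
    return TAG_RE.sub(lambda match: load(match.group(1)), text)


# Usage with an in-memory loader instead of real files:
print(expand_tags("Hi {TEST}, see {CONTACT-FORM}", load=lambda name: "<" + name + ">"))
# Hi <TEST>, see <CONTACT-FORM>
```

Restricting the character class also keeps a tag from escaping `base_dir`, since neither `/` nor `.` can appear in a name.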

ANTLR Grammar for Java Regular Expression syntax.

I'm currently working on a testing framework for regular expressions, and I need to be able to parse Java regular expressions into ASTs to be able to generate sample strings which match the given regex. I looked at the implementation of java.util.regex.Pattern but the code looks quite unwieldy (the emphasis was on speed over readability...

How to determine if a File Matches a File Mask?

I need to decide whether a file name fits a file mask. The file mask could contain * or ? characters. Is there any simple solution for this? bool bFits = Fits("myfile.txt", "my*.txt"); private bool Fits(string sFileName, string sFileMask) { ??? anything simple here ??? } ...
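
One simple way to do this is to translate the mask into a regular expression: escape every literal character, then map `*` to `.*` and `?` to `.`. The sketch below shows the idea in Python rather than C#; `fits` is a hypothetical name, and the match is case-sensitive by assumption.

```python
import re


def fits(file_name, file_mask):
    """Return True if file_name matches a mask containing * and ? wildcards."""
    pattern = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in file_mask
    )
    # fullmatch anchors the pattern at both ends of the file name.
    return re.fullmatch(pattern, file_name, re.DOTALL) is not None


print(fits("myfile.txt", "my*.txt"))  # True
print(fits("myfile.txt", "my?.txt"))  # False
```

Escaping matters because the `.` in `my*.txt` would otherwise match any character. Python's standard library also offers `fnmatch.fnmatchcase` for this kind of matching.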

Best way to chop a signature off an email body

Hello, I am parsing out some emails. Mobile Mail, iPhone and I assume iPod touch append a signature as a separate boundary, making it simple to remove. Not all mail clients do, and just use '--' as a signature delimiter. I need to chop off the '--' from a string, but only the last occurrence of it. Sample copy hello, this is some e...
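
A sketch of one way to cut at the last delimiter only, in Python. It assumes the delimiter sits on a line of its own, written as `--` with optional trailing whitespace; `strip_signature` is a hypothetical name.

```python
import re

# A line consisting only of "--", optionally followed by spaces or tabs.
SIG_RE = re.compile(r"^--[ \t]*$", re.MULTILINE)


def strip_signature(body):
    """Drop everything from the last signature delimiter line onwards."""
    matches = list(SIG_RE.finditer(body))
    if not matches:
        return body  # no delimiter found, leave the body untouched
    return body[:matches[-1].start()].rstrip()
```

Anchoring to a whole line keeps an inline `--` in the message text from being mistaken for the delimiter.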

Matching text in HTML without contents of the tag

I am looking for a regex for Javascript to search for text ("span" for example) in HTML. Example: <div>Lorem span Ipsum dor<a href="blabla">lablala</a> dsad <span>2</span> ... </div> BUT only the "span" after "Lorem" should be matched, not the <span> tag. For a second example, if we search for "bla", only the bold text should be ...

Regular Expressions in C

I'm curious, does anybody know a good way to do regular expression matching in C? The only way I can think of is through Flex. Is this the only way or is there a better way? Thanks! ...

Finding all *rendered* images in a HTML file

Hi all, I need a way to find only rendered IMG tags in a HTML snippet. So, I can't just regex the HTML snippet to find all IMG tags because I'd also get IMG tags that are shown as text in the HTML (not rendered). I'm using Python on AppEngine. Any ideas? Thanks, Ivan ...

OutOfMemoryException in Regex Matches when processing large files

I've got an exception log from one of our production code releases. System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown. at System.Text.RegularExpressions.Match..ctor(Regex regex, Int32 capcount, String text, Int32 begpos, Int32 len, Int32 startpos) at System.Text.RegularExpressions.RegexRunner.Init...

library for converting regular expressions to NFAs?

Is there a good library for converting Regular Expressions into NFAs? I see lots of academic papers on the subject, which are helpful, but not much in the way of working code. My question is due partially to curiosity, and partially to an actual need to speed up regular expression matching on a production system I'm working on. Althou...

PHP - Regex for prepending table names within SQL

I am looking for an unobtrusive way to find and replace table names based on their position in an SQL query. Example: $query = 'SELECT t1.id, t1.name, t2.country FROM users AS t1, country AS t2 INNER JOIN another_table AS t3 ON t3.user_id = t1.id'; I essentially need to prepend client name abbreviations to table names and then have m...

Finding methods in source code using regular expressions

I have a program which looks in source code, locates methods, and performs some calculations on the code inside of each method. I am trying to use regular expressions to do this, but this is my first time using them in C# and I am having difficulty testing the results. If I use this regular expression to find the method signature: ((pr...

Regular expression for hidden files under unix.

I'm looking for a regex to match every file beginning with a "." in a directory. I'm using CMake (from the CMake doc: "CMake expects regular expressions, not globs") and want to ignore every file beginning with a dot (hidden files), BUT "\..*" or "^\..*" doesn't work :( The strange thing: this works (thanks to rq's answer) and remove every ...

How do I create a recursive replacement in mod_rewrite?

I'm fairly new to mod_rewrite and I am attempting to convert a URL from http://example.com/foo/bar/blah/etc.html into http://example.com/stuff/foo_bar_blah_etc.html The assumption is that there is not a set number of directories between the domain and the file name therefore I cannot just write a single rewrite rule with 3 placehol...

How to get empty href query parameters?

Hi all, I need a function to get only the empty href query parameter names so I can replace them later with values from another array. After hours of failing at regular expressions, here is what i resorted to: /** * getEmptyQueryParams(URL) * Input: URL with href params * Returns an array containing all empty href query parameters. */ f...
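
For comparison, a query-string parser can do this without a hand-written regex. The sketch below is in Python, not the language of the question's own function, and uses the standard `urllib.parse` module. It assumes parameters are separated by a plain `&`; `empty_query_params` is a hypothetical name.

```python
from urllib.parse import parse_qsl, urlsplit


def empty_query_params(url):
    """Return the names of query parameters whose value is empty, in order."""
    query = urlsplit(url).query
    # keep_blank_values=True stops the parser from silently dropping "a=".
    pairs = parse_qsl(query, keep_blank_values=True)
    return [name for name, value in pairs if value == ""]


print(empty_query_params("http://example.com/page?a=&b=2&c="))  # ['a', 'c']
```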

Which Regular Expression Algorithm does Javascript use for Regex?

I was reading this article today on two different regular expression algorithms. According to the article, old Unix tools like ed, sed, grep, egrep, awk, and lex all use what's called the Thompson NFA algorithm in their regular expressions... However newer tools like Java, Perl, PHP, and Python all use a different algorithm for thei...

How can I get my text parsing regex to recognize a STOP token?

I have written a regular expression for a text parser as follows: ^\s*(?<Count>[0-9]+)\s*[x|X]\s*(?<Currency>[^\d\sN]?)(?<ParVal>\d+)(?<Type>[^Nn]?)\s.*$ to capture patterns such as: .... 1 x 1p 1 x 3p .... It works well, however I don't know how to detect whether a 1 x 1p pattern is above/below the STOP line in regular expres...