regex

Regex Help

I have text files which all contain the following fragment of code: Lines.Strings = ( '[email protected]' '[email protected]' '[email protected]' '[email protected]'). The e-mail addresses will change, as will the Lines part of Lines.Strings; it can be called anything, e.g. Test.Strings or ListBox.Strings. I want to match all and any text...

Regex to find strings contained between separators

In this text: text text text [[st: aaa bbb ccc ddd eee fff]] text text text text [[st: ggg hhh iii jjj kkk lll mmm nnn]] text text text, I'm trying to get the text that starts after [[st: and ends with ]]. My program should output: aaa bbb ccc ddd eee fff (first match) and ggg hhh iii jjj kkk \n lll mmm nnn (second match). But I can on...
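
A minimal Python sketch of one way to approach this, assuming the delimiters are exactly `[[st:` and `]]` and that a block may span a line break. The function name `extract_st_blocks` and the sample string are illustrative only and do not come from the original question.

```python
import re

def extract_st_blocks(text):
    """Return the text found between each '[[st:' and the next ']]'."""
    # re.DOTALL lets '.' match newlines, so a block may span several lines.
    # '.*?' is lazy, so each match stops at the first closing ']]'.
    pattern = re.compile(r"\[\[st:\s*(.*?)\s*\]\]", re.DOTALL)
    return pattern.findall(text)

sample = (
    "text [[st: aaa bbb ccc ddd eee fff]] text "
    "[[st: ggg hhh iii jjj kkk \n lll mmm nnn]] text"
)
blocks = extract_st_blocks(sample)
```

The lazy quantifier matters here: a greedy `.*` would run from the first `[[st:` to the last `]]` and return a single match.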

How would I modify this regex to extract the left and right hand parts of a UK postal code?

I have a regular expression which works for validating UK postal codes, but now I would like to extract the constituent parts of the code and I'm getting confused. For those who do not know, examples of UK postal codes are 'WC1 1AA', 'WC11 1AA' and 'M1 1AA'. The regular expression below (apologies for the formatting) handles the lack of a...

Printing Certain Pages in MS Word

Using MS Word, is there a way to print only those pages which contain a certain search string? For example, I have a few hundred pages of transaction summaries and there is a certain string that recurs throughout the transaction report. Can't I throw a regular expression into the pages-to-print dialog or something? ...

Who can crack this twitter regexp?

I would like to grab all the hashtags using PHP from http://search.twitter.com/search.atom?q=%23eu-jele%C4%A1%C4%A1i The hashtags are in the content, title nodes within the RSS feed. They are prefixed with # The problem I am having is with non-English letters (outside of the range a-zA-Z). If you look at the RSS feed and then view th...

RegEx for replacing and adding attributes to an HTML tag

Given the following code : <body> <img src="source.jpg" /> <p> <img src="source.jpg" id ="hello" alt="nothing" /> <img src="source.jpg" id ="world"/> </p> </body> What's the best way - using a regular expression (or better?) - to replace it so it becomes this: <body> <img src="source.jpg" id="img_0" /> <p> <img ...

Does VBscript have modules? I need to handle CSV

I have a need to read a CSV file, and the only language I can use is VBscript. I'm currently just opening the file and splitting on commas, and it's working OK because there aren't any quoted commas in fields. But I'm aware this is an incredibly fragile solution. So, is there such a thing as a VBscript module I can use? Somewhere to ge...

Regex for matching javadoc fragments

This question got me thinking about a regex for matching javadoc comments that include some specified text. For example, finding all javadoc fragments that include @deprecated: /** * Method1 * ..... * @deprecated * @return */ I managed to get to the expression /\*\*.*?@deprecated.*?\*/ but this fails in some cases like: /** * Method1 * ...

upper- to lower-case using sed

Hi, I'd like to change the following pattern: getFoo_Bar to: getFoo_bar (note the lower-case b). Knowing neither foo nor bar, what is the replacement pattern? I started writing sed 's/\(get[A-Z][A-Za-z0-9]*_\)\([A-Z]\)/\1 but I'm stuck: I want to write "\2 lower case", how do I do that? Maybe sed is not suited to this? ...
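
As a hedged illustration of the "\2 lower case" idea, here is a Python sketch that reuses the two groups from the sed attempt and lower-cases the second one in a replacement function. The helper name `lower_after_underscore` is hypothetical. GNU sed (not POSIX sed) also offers `\l` in the replacement text to lower-case the next character, which may be closer to what the question asks for.

```python
import re

def lower_after_underscore(name):
    """Lower-case the capital letter that follows 'getXxx_'."""
    # Group 1: 'get' + a capitalised word + underscore (same as the sed attempt).
    # Group 2: the upper-case letter to convert.
    pattern = r"(get[A-Z][A-Za-z0-9]*_)([A-Z])"
    return re.sub(pattern, lambda m: m.group(1) + m.group(2).lower(), name)

result = lower_after_underscore("getFoo_Bar")
```

Names that do not match the pattern are returned unchanged.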

How can I convert URLs to Markdown syntax, but NOT interfere with URLs already in Markdown syntax?

A system I am writing uses Markdown to modify links, but I also want to make plain links active, so that typing http://www.google.com would become an active link. To do this, I am using a regex replacement to find URLs and rewrite them in Markdown syntax. The problem is that I cannot get the regex to not also parse links already in Ma...

Using WebClient in C# is there a way to get the URL of a site after being redirected?

Using the WebClient class I can get the title of a website easily enough: WebClient x = new WebClient(); string source = x.DownloadString(s); string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>", RegexOptions.IgnoreCase).Groups["Title"].Value; I want to store the URL and the page title. Ho...

preg for validating URL

...

Scanner vs. StringTokenizer vs. String.Split

I just learned about Java's Scanner class and now I'm wondering how it compares/competes with the StringTokenizer and String.Split. I know that the StringTokenizer and String.Split only work on Strings, so why would I want to use the Scanner for a String? Is Scanner just intended to be one-stop shopping for splitting? ...

Compression HTTP Module that can escape inline scripts

I have an HTTP module that compresses HTTP requests. public override void Write(byte[] buffer, int offset, int count) { byte[] data = new byte[count]; Buffer.BlockCopy(buffer, offset, data, 0, count); string html = System.Text.Encoding.Default.GetString(buffer); Regex reg = new Regex(@"(?<=[^])\t{2,}|(?<=[>])\s{2,}(?=[<])|(...

Regular Expression to match only odd or even number

I have a list of textual entries that a user can enter into the database and I need to validate these inputs with regular expressions because some of them are complex. One of the fields must have gaps in the numbers (e.g., 10, 12, 14, 16...). My question is: is there a regex construct that would allow me to match only even or odd digit runs?...
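
One common way to frame this, sketched here in Python under the assumption that the input is a plain run of ASCII digits: a decimal number is even or odd according to its last digit alone, so the pattern only needs to constrain that final character. The names `EVEN`, `ODD`, `is_even_number` and `is_odd_number` are illustrative.

```python
import re

# Any run of digits whose final digit is even (or odd, respectively).
EVEN = re.compile(r"[0-9]*[02468]")
ODD = re.compile(r"[0-9]*[13579]")

def is_even_number(s):
    # fullmatch requires the whole string to fit the pattern.
    return EVEN.fullmatch(s) is not None

def is_odd_number(s):
    return ODD.fullmatch(s) is not None
```

This checks parity only; it does not check that successive entries step by two, which a regex applied to one value at a time cannot express.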

Seeking comparison table for different regexes

I use vim, sed, bash and Perl. Each has somewhat different regex syntax. I just spent time finding that I need to escape the curly braces in sed, but not in bash (when using them as counter elements). Grrr. Can anybody point me to a table that summarizes the differences between the regex parsers in these 4 environments? TIA...

Cut off the filename and extension of a given string.

I built a little script that parses a directory for files of a given filetype and stores the location (including the filename) in an array. It looks like this: def getFiles(directory) arr = Dir[directory + '/**/*.plt'] arr.each do |k| puts "#{k}" end end The output is the path and the files. But I want only the path. Inste...
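
The question is about Ruby, where `File.dirname` returns the directory part of a path. The same idea is sketched below in Python with `os.path.dirname`; the function name `directories_of` and the sample paths are made up for illustration.

```python
import os.path

def directories_of(paths):
    """Strip the filename and extension, keeping only the directory part."""
    return [os.path.dirname(p) for p in paths]

dirs = directories_of(["plots/a/one.plt", "plots/two.plt"])
```

Using the path library rather than a regex avoids having to handle separators and dots in directory names by hand.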

How do I extract the first element in a list using replaceregex in an Ant file?

Working with Ant's regular expression system seems to give me no end of trouble. With enough work I can usually get it to work (and understand what I was doing wrong earlier). But not this time. I have a simple target wherein I want to extract the first element out of a property that contains one or more comma-separated words, like this...

How to parse substring between last set of parentheses in string in ruby.

In my Ruby on Rails app, I am trying to build a parser to extract some metadata out of a string. Let's say the sample string is: The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20). I want to extract the substring out of the last occurrence of the ( ). So, I want to get "ralph, 20" no matter how many ( ) are ...
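
A rough sketch of one approach, written in Python rather than Ruby and assuming the parentheses are never nested: collect every parenthesized group and keep the last one. The helper name `last_parenthesized` is hypothetical.

```python
import re

def last_parenthesized(text):
    """Return the contents of the last '(...)' group, or None if there is none."""
    # '[^()]*' matches anything except parentheses, so groups cannot nest.
    matches = re.findall(r"\(([^()]*)\)", text)
    return matches[-1] if matches else None

sample = "The quick red fox (frank,10) jumped over the lazy brown dog (ralph, 20)."
last = last_parenthesized(sample)
```

Ruby's `String#scan` plays the same role as `re.findall` here, so the pattern should carry over with little change.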

How to get the offset of a Regex capture?

I'm trying to get the offset of a regex capture in .NET, just like .IndexOf() would return. Is there any way to do that? ...
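
In .NET, the `Capture.Index` property (inherited by `Group` and `Match`) reports the position of a capture in the input string. The sketch below shows the equivalent idea in Python, where a match object exposes `start(group)`. The function `capture_offset` and its `-1` convention for "no match" are assumptions chosen to mirror `IndexOf`.

```python
import re

def capture_offset(pattern, text, group=1):
    """Return the zero-based offset of a capture group, or -1 if there is no match."""
    m = re.search(pattern, text)
    # m.start(group) is the index where the group's text begins in 'text'.
    return m.start(group) if m else -1

offset = capture_offset(r"id=(\d+)", "user id=42")
```

Passing `group=0` gives the offset of the whole match instead of a sub-capture.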