regex

Parsing stored procedures from a file

I have the following regex: Regex defineProcedureRegex = new Regex(@"\s*(\bcreate\b\s+\bprocedure\b\s+(?:|dbo\.))(\w+)\s+(?:|(.+?))(as\s+(?:.+?)\s+\bgo\b)\s*", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.RightToLeft | RegexOptions.Singleline); that I am running against a SQL script file containing multiple "create p...

How can I check if the next line is blank in Perl?

I just asked a question about how to check if the current line is blank or not in Perl. That works for the current line, but how do I check to see if the next line is blank? Text file to parse (I need to parse the text file and create a new XML file): constant fixup GemEstabCommDelay = <U2 20> vid = 6 name = "ESTABLISHCOMMUNICATIO...

Very complicated regular expression

I've been stuck trying to write this regular expression I need. Basically, I have a long string comprised of two different types of data: [a-f0-9]{32} [a-zA-Z0-9=]{x} The thing is, x is only constant in the particular instance: if in one case, it happens to be 12, it will be 12 for that particular dataset, but next time I run the reg...

How to pass a parameter to preg_replace() with the 'e' modifier?

I have a question about the preg_replace() function. I'm using it with the 'e' modifier. Here is the code snippet: $batchId = 2345; $code = preg_replace("/[A-Za-z]{2,4}[\d\_]{1,5}[\.YRCc]{0,4}[\#\&\@\^]{0,2}/e", 'translate_indicator(\'$0\', {$batchId})', $code); I want to have access to the $batchId variable inside translate_indi...

Problem with regexp

My PHP script should validate the addresses of websites that users type into the form. An address should look like this: http://example.com/example/{some numbers}/ or http://example.com/example/{some numbers} I have thought about something like this, but it doesn't work: /^(http:\/\/)?example\.com\/example\/\d{1}(\/?)$/ Can you show me wh...
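A minimal sketch of one likely fix, written in Python rather than PHP and assuming "{some numbers}" means one or more digits: \d{1} accepts exactly one digit, while \d+ accepts any run of digits. The function name is_valid_address is hypothetical.

```python
import re

# Assumption: "{some numbers}" means one or more digits.
# \d{1} in the original pattern matches exactly one digit; [0-9]+ matches a run.
ADDRESS_PATTERN = re.compile(r"(?:http://)?example\.com/example/[0-9]+/?")


def is_valid_address(url: str) -> bool:
    """Return True if the whole string has the expected shape."""
    # fullmatch anchors the pattern at both ends of the string.
    return ADDRESS_PATTERN.fullmatch(url) is not None
```

The same character-level change (\d{1} to \d+) should carry over to the PCRE pattern in the question, since both engines treat these quantifiers the same way.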

Python regexp find two keywords in a line

I'm having a hard time understanding this regex stuff... I have a string like this: <wn20schema:NounSynset rdf:about="&dn;synset-56242" rdfs:label="{saddelmageri_1}"> I want to use findall() and groups to get this: ['56242','saddelmageri'] I can match the number with something like "synset-[0-9]" and the word with something like "...
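One way to get both values is a single pattern with two capture groups. This is only a sketch: it assumes the label always has the form {word_number} and that the synset id is followed by a closing quote, as in the sample line. The name extract_id_and_label is hypothetical.

```python
import re

# Group 1: the digits after "synset-"; group 2: the label text before "_<number>}".
SYNSET_PATTERN = re.compile(r'synset-(\d+)".*?rdfs:label="\{(.+?)_\d+\}"')


def extract_id_and_label(line: str) -> list:
    """Return [id, label] for a matching line, or [] if nothing matches."""
    match = SYNSET_PATTERN.search(line)
    return list(match.groups()) if match else []


sample = ('<wn20schema:NounSynset rdf:about="&dn;synset-56242" '
          'rdfs:label="{saddelmageri_1}">')
result = extract_id_and_label(sample)
```

With findall() the same pattern returns a list of tuples, one tuple per match, so the two values arrive together as a pair.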

Replacing a sequence of characters with regular expressions

I have the string ababa and I want to replace any "aba" sequence with "c". How do I do this? The regular expression "aba" to be replaced by "c" doesn't work as this comes out as "cba". I want this to come out as "cc". I'm guessing this is because the second "a" in the input string is being consumed by the first match. Any idea how t...
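The guess about consumption can be checked directly. The sketch below, in Python, shows that a plain substitution consumes the shared "a", while a zero-width lookahead reports both overlapping occurrences. How those occurrences should be turned into "cc" in general is not specified in the excerpt, so that step is left out.

```python
import re

text = "ababa"

# A normal substitution consumes the characters it matches, so the "a" at
# index 2 is used up by the first match and cannot start a second one.
plain = re.sub("aba", "c", text)

# A lookahead matches without consuming anything, so overlapping
# occurrences are all reported.
overlapping_starts = [m.start() for m in re.finditer(r"(?=aba)", text)]
```

Here plain is "cba" and overlapping_starts is [0, 2].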

How to match a 9 or 14 digits long number with regex?

Hi, I need to check with a regular expression whether 9 or 14 digits were typed. The expression "\d{9}|\d{14}" does not seem to work properly. What's wrong? ...
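The excerpt does not say how the match is performed, so the following is only one likely explanation: without anchors, the pattern also succeeds on any longer input that contains 9 digits somewhere. A sketch in Python, with a hypothetical helper name:

```python
import re


def is_9_or_14_digits(value: str) -> bool:
    """Return True only if the entire string is exactly 9 or 14 ASCII digits."""
    # fullmatch requires the pattern to cover the whole string,
    # which is equivalent to ^(?:...)$ around the alternation.
    return re.fullmatch(r"[0-9]{9}|[0-9]{14}", value) is not None


# Unanchored search: ten digits still "match" because nine of them do.
unanchored_hit = re.search(r"[0-9]{9}|[0-9]{14}", "1234567890") is not None
```

In engines without a full-match call, wrapping the alternation as ^(?:\d{9}|\d{14})$ gives the same effect.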

Existence of obvious maximum length of Look-behind group in Java

In this Java code: public class Main { public static void main(String[] args) { "".matches("(?<!((.{0,1}){0,1}))"); } } the compiler (I'm using JVM 1.6.0_17-b04) shouts "Exception ... Look-behind group does not have an obvious maximum length". I saw here that: Java takes things a step further by allowing finite repetition. ...

Regex for all strings not containing a string?

Ok, so this is something completely stupid, but it is something I simply never learned to do and it's a hassle. How do I specify a string that does not contain a sequence of other characters? For example, I want to match all lines that do NOT end in '.config'. I would think that I could just do .*[^(\.config)]$ but this doesn't work ...
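The attempt fails because [^(\.config)] is a character class: it matches one character that is not any of "(", ".", "c", "o", "n", "f", "i", "g", ")", rather than negating the whole sequence. A negative lookbehind expresses the intent instead. A minimal sketch in Python, with a hypothetical helper name:

```python
import re

# (?<!\.config) asserts that the text just before the current position
# is not ".config"; at the end of the string that means "does not end in .config".
NOT_CONFIG = re.compile(r".*(?<!\.config)")


def does_not_end_in_config(line: str) -> bool:
    """Return True for lines whose last characters are not '.config'."""
    return NOT_CONFIG.fullmatch(line) is not None
```

When no regex is required, a plain suffix test such as `not line.endswith(".config")` does the same job.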

When strip_tags() Burns a Haystack

I've got a list of websites for each US Congress member that I'm programmatically crawling to scrape addresses. Many of the sites vary in their underlying markup, but this wasn't initially a problem until I started seeing that hundreds of sites were not giving the expected results for the script I had written. After taking some more tim...

Searching Hpricot with Regex

I'm trying to use Hpricot to get the value within a span with a class name I don't know. I know that it follows the pattern "foo_[several digits]_bar". Right now, I'm getting the entire containing element as a string and using a regex to parse the string for the tag. That solution works, but it seems really ugly. doc = Hpricot(open("ht...

Regular expressions - same for all languages?

Is regexp syntax the same between languages? For example, if I want to use it in JavaScript, would I have to search for regexps for JavaScript specifically? I have some cheat sheets, but they just say "regular expression". I wonder if I could use them for all languages: PHP, JavaScript and so on. ...

Testing with multiple regexps at the same time (for use in syntactic analysis)

I am writing a simple syntax highlighter in JavaScript, and I need to find a way to test with multiple regular expressions at the same time. The idea is to find out which comes first, so I can determine the new set of expressions to look for. The expressions could be something like: /<%@/, /<%--/, /<!--/ and /<[a-z:-]/ First I tri...
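One common approach is to join the expressions into a single alternation with one named group per expression, then ask the match which group fired. The sketch below uses Python rather than JavaScript; the group names and the function first_token are hypothetical, and the four patterns are the ones listed in the question.

```python
import re

# Hypothetical labels paired with the patterns from the question.
TOKEN_PATTERNS = [
    ("directive", r"<%@"),
    ("server_comment", r"<%--"),
    ("html_comment", r"<!--"),
    ("tag", r"<[a-z:-]"),
]

# One combined regex: (?P<name>pattern)|(?P<name>pattern)|...
COMBINED = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def first_token(text: str, pos: int = 0):
    """Return (label, start_index) of the earliest match at or after pos, or None."""
    match = COMBINED.search(text, pos)
    if match is None:
        return None
    # lastgroup is the name of the alternative that matched.
    return match.lastgroup, match.start()
```

For example, first_token("abc <!-- x --> <div>") returns ("html_comment", 4). The search finds the leftmost match, and when two alternatives could match at the same position the one listed first wins, so the order of the list matters.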

Java String split regex not working as expected.

The following Java code will print "0". I would expect that this would print "4". According to the Java API String.split "Splits this string around matches of the given regular expression". And from the linked regular expression documentation: Predefined character classes . Any character (may or may not match line terminators) Therefor...
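The Java snippet is not included in the excerpt, so the sketch below uses Python and a hypothetical input string to show the underlying issue: an unescaped "." is a metacharacter that matches every character, so every character becomes a separator and only empty strings remain. Java's one-argument String.split additionally drops trailing empty strings, which is consistent with a resulting length of 0.

```python
import re

text = "a.b.c.d"  # hypothetical input; the original string is not shown

# Unescaped dot: each of the 7 characters is treated as a separator,
# leaving 8 empty pieces.
unescaped = re.split(r".", text)

# Escaped dot: only literal periods are separators.
escaped = re.split(r"\.", text)
```

Here escaped is ['a', 'b', 'c', 'd']. In Java, the escaped pattern would be written as "\\." in a string literal.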

Find a sequence with a regular expression, then find a second one on the next line or a few lines after it.

I can better explain my situation with an example. Consider an httpd.conf file in which I need to change the document root. First I need to find the ServerName, then change the document root, so I believe I need two regexps here, but I'm not sure how to do it. Can someone please help? Or do I just need to find the ServerName and make a...

Properly Matching an IDN URL

I need help building a regular expression that can properly match a URL inside free text. scheme One of the following: ftp, http, https (is ftps a protocol?) optional user (and optional pass) host (with support for IDNs) support for www and sub-domain(s) (with support for IDNs) basic filtering of TLDs ([a-zA-Z]{2,6} is enough I thi...

Java Regular Expression

Hey all, I need to remove some characters at the end of each item in a certain list. These characters are always the same (C, CD, PDF, CPDF, M) and with this regular expression I'm able to get rid of them: str.replaceAll("(C|CD|PDF|CPDF|M)$", ""); However, I'm not able to invert this expression: I'd like to be able to isolate (by remo...

Include one file in another

I'm looking for a very simple template script for building JS files. It should do only one thing: include one file in another. Template (main.js) /*> script.js */ var style = "/*> style.css */"; script.js var my_script; style.css html, body {margin:0; padding:0} .my-style {background: #fffacc} Output var my_script; var style =...

regEx replace in Excel with VB

Hello All, I really dislike making Excel macros :-( Here is the typical string I am working with: ·         Identify & document site-related constraints and assumptions. I would like to scrub that string to get rid of everything before "Identify"... I wrote a function to take the string and scrub it, here it is: Function d...