regex

How to efficiently search/replace on a large txt file?

I have a relatively large CSV/text data file (33 MB) in which I need to globally search for and replace the delimiting character. (The reason is that there doesn't seem to be a way to get SQL Server to escape/handle double quotes in the data during a table export, but that's another story...) I successfully accomplished a TextMate search an...

Regex ungreedy from the left side (i.e., narrowest match possible from both sides)

Let's say I'm trying to match /dog.*lab/ against this text: "I have a dog. My dog is a black lab. He was created in a laboratory." Greedily, it would match "dog. My dog is a black lab. He was created in a lab". I want to find the smallest possible match from both sides. If I use the ungreedy modifier like /dog.*?lab/ or /dog.*lab/U it...
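
One common way to narrow the match from the left as well is a "tempered" dot, which refuses to step over another occurrence of the starting word. The sketch below is only an illustration in JavaScript under that assumption; the variable names are made up, and it is not taken from any answer to the question.

```javascript
// Sample text from the question.
const text = "I have a dog. My dog is a black lab. He was created in a laboratory.";

// (?:(?!dog).)*? consumes one character at a time, but only where "dog"
// does not start, so a match can never contain a second "dog".
// The lazy quantifier then stops at the first "lab" that follows.
const narrowest = /dog(?:(?!dog).)*?lab/;

const found = text.match(narrowest);
const narrowestMatch = found ? found[0] : null;

console.log(narrowestMatch); // "dog is a black lab"
```

The attempt starting at the first "dog" fails because the pattern cannot cross the second "dog", so the engine moves on and matches from the later one.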

split: regex to ignore delimiters inside balanced parentheses

I've got the ORDER BY part of a statement and I have to split it into its logical parts: my $sql_part = <<EOH IF(some_table.some_field <> 3, 1, 0), some_table.other_field, CASE other_table.third_field WHEN IS NULL THEN 2 ELSE 1 END, other_table.third_field EOH and, you know, the original string doesn't contain newlines and IF's c...

Converting simple markdown(string) to html with xslt

I'm transforming my XSLT stylesheets into documentation, and I want a rich experience within the comment nodes for each code chunk; therefore I want to convert the following comment and output it as XHTML: String: # This is a title with __bold__ text and *italic* # This is just a normal line - list point with some __bold__ - list point wit...

Parsing tnsnames.ora using regex...

I am attempting to pull some information from my tnsnames file using regex. I started with the following pattern: MYSCHEMA *? = *?[\W\w\S\s]*\(HOST *?= *?(?<host>\w+\s?)\)\s?\(PORT *?= *?(?<port>\d+)\s?\)[\W\w\S\s]*\(SERVICE_NAME *?= *?(?<servicename>\w+)\s?\) which worked fine when MYSCHEMA was the only schema in the file, but when t...

Project Gutenberg Python problem?

Hello everyone, I am trying to process various texts with regex and Python's NLTK (see http://www.nltk.org/book). I am trying to create a random text generator and I am having a hard time with a problem. First, here is my algorithm: Enter a sentence as input (this is called the trigger string) Get the longest word in the trigger string Sear...

How can I extract the columns of data with Perl?

I have strings of this kind: NAME1 NAME2 DEPTNAME POSITION JONH MILLER ROBERT JIM CS ASST GENERAL MANAGER I want the output to be name1, name2 and position. How can I do it using split/regex/trim/etc. and without using CPAN modules? ...

regular expression in js not the same as in php

I have a regular expression to match usernames (which functions in PHP using preg_match): /[a-z]+(?(?=\-)[a-z]+|)\.[1-9][0-9]*/ This pattern matches usernames of the form abc.124, abc-abc.123 etc. However, when I take this to javascript: var re = new RegExp("/[a-z]+(?(?=\-)[a-z]+|)\.[1-9][0-9]*/"); I receive a syntax error: S...
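
Two things differ on the JavaScript side: conditional groups of the form `(?(?=...)...|...)` are not part of JavaScript's regex syntax, and a pattern passed to `new RegExp(...)` as a string should not include the surrounding slashes (and needs its backslashes doubled). A minimal sketch of the stated intent (usernames such as abc.124 or abc-abc.123) follows; the anchors and the helper name `isValidUsername` are assumptions added for illustration, not part of the original pattern.

```javascript
// The optional "-abc" part is written as an optional non-capturing group
// instead of a conditional. ^ and $ are an added assumption so that the
// whole string has to be a username.
const usernamePattern = /^[a-z]+(?:-[a-z]+)?\.[1-9][0-9]*$/;

// Hypothetical helper, for illustration only.
function isValidUsername(name) {
  return usernamePattern.test(name);
}

console.log(isValidUsername("abc.124"));     // true
console.log(isValidUsername("abc-abc.123")); // true
console.log(isValidUsername("abc.024"));     // false (number may not start with 0)
```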

regex: delete white characters

I try to delete more than one whitespace character from my string: $content = preg_replace('/\s+/', " ", $content); //in some cases it doesn't work but when I write $content = preg_replace('/\s\s+/', " ", $content); //works fine Could somebody explain why? Because when I write /\s+/ it must match all with one or more whitespace character,...

PHP: Needing regex to camelCase a dashed/underscored string & test result against native PHP functions

Hello Folks, I'm trying to streamline this method down. It basically takes in an action (string), makes dashed/underscored strings into camelCase and then tests if the result is equivalent to a native php function, if so it gets an underscore in front. I'm thinking all this could be one regex but I'm not sure how I'd test function_exist...

basic regex help

$text_expression = 'word1 word2 "phrase 1" "phrase 2" -word3 -word4 -"phrase \"hello\" 3" -"phrase 4"'; I want to search for strings that contain (word1 OR word2 OR 'phrase 1' OR 'phrase 2') AND don't contain (word3 OR word4 OR 'phrase "hello" 3' OR 'phrase 4'). What would be the regex that is equivalent to $text_expression a...

Problem with regexp trailing space

Hello guys I'm currently modifying Liquid Framework (http://github.com/tobi/liquid) in order to make it support literals. It's all nice and cool but I'm having a slight problem with the regexp I'm using. The following works great, except the fact that it captures the trailing space in $1 "{{{gnomeslab }}}" =~ /^(?:{{{\s?)(.*)(?:}}})$/...

access vba extract phone & fax in address column of table

I have a table with phone & fax data in an "address" column that I want to put into separate "phone" and "fax" columns. The phone comes in various forms: phone, T , ph, tel., tels., fono. The same issue occurs with fax, i.e. Fax & F. Ideally, I think the following is a good description of the data. Not all numbers pertain to p...

Regular Expression in Visual Studio Find & Replace - multiple spaces between search terms

I require a regular expression for the Visual Studio Search and Replace functionality, as follows: Search for the following term: sectorkey in ( There could be multiple spaces between each of the above 3 search terms, or even multiple line breaks/carriage returns. The search term is looking for SQL statements that have hard-coded Secto...

Is there a way to do some sort of regex loop in .htaccess

Hi, at the moment I have a list of rules in my .htaccess file, but if I need to add a new page I then need to edit this file again and add yet another rule. Here are a few: RewriteRule ^admin/(.*).html$ index.php?x=admin&y=$1 RewriteRule ^admin/project/(.*).html$ index.php?x=work&p=project&$1 RewriteRule ^work/(.*).html$ index.php?x=work&...

javascript : extract http urls from a string of multiple urls

Hi All, I have a huge string of multiple URLs in JavaScript in the following format: var urls = "http://..... , http://..... , http://......" I need to extract all URLs from the string into individual variables or into an array. I can't do a urls.split(",") as some URLs seem to have commas in them. Is there a good regex I c...
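
One possible approach is to split only on commas that are immediately followed by the start of the next URL. The sketch below assumes every URL begins with http:// or https:// and that no URL itself contains a comma followed by such a prefix; the sample URLs are invented for illustration.

```javascript
// Hypothetical sample data: two of the URLs contain commas of their own.
const urls = "http://example.com/a,b , http://example.org/x?y=1,2 , http://example.net/";

// Split on a comma (with optional surrounding whitespace) only when the
// lookahead sees the next URL's scheme, so commas inside a URL are kept.
const urlList = urls.split(/\s*,\s*(?=https?:\/\/)/);

console.log(urlList);
// [ "http://example.com/a,b", "http://example.org/x?y=1,2", "http://example.net/" ]
```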

JavaScript: How to replace 0-9 with a-j

Using JavaScript's replace and regex, how do I replace numbers 0-9 with the letters a-j ? example mapping: 0 = a, 1 = b, 2 = c, 3 = d, 4 = e and so on. so, before: x = 3; after: x = 'd'; ...
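
A minimal sketch of the mapping, using `replace` with a callback that offsets each digit from the character code of "a". The function name `digitsToLetters` is made up for illustration, and the sketch only swaps the characters; it does not add the quotes shown in the question's "after" example.

```javascript
// Replace every digit 0-9 with the letter at the same offset from "a".
function digitsToLetters(input) {
  const base = "a".charCodeAt(0);
  return input.replace(/[0-9]/g, (digit) => String.fromCharCode(base + Number(digit)));
}

console.log(digitsToLetters("0123456789")); // "abcdefghij"
console.log(digitsToLetters("x = 3;"));     // "x = d;"
```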

Regex: How to Find a Number Surrounded by Whitespace?

Hi, I need help with regex. I'm using Dreamweaver to do some text editing. I need to put quotation marks around each number and separate them by commas. (I'm doing this in order to put the values in my database.) In Dreamweaver's Find it's possible to use regexp. I need a regex that finds each number. For example I have the series: 2010 ...

How do I make an arbitrary Perl regex wholly non-capturing? (Answer: You Can't)

How can I remove capturing from arbitrarily nested sub-groups in a Perl regex string? I'd like to nest any regex into an enveloping expression that captures the sub-regex as a whole entity as well as statically known subsequent groups. Do I need to transform the regex string manually into using all non-capturing (?:) groups (and hope...

Mod_rewrite syntax with query strings

Embarrassing as this may be, I've hit a wall with mod_rewrite trying to come up with what seems to be a simple rule. I'd like to accomplish the following mapping: /cat/subcat which may have a "?PageId=123" afterwards should become /cat.php?cid=148 or (/cat.php?cid=148&PageId=123) So for example, the following 2 mappings would occur:...