regex

How to find the url using the referer and the href in Python?

Suppose I have window_location = 'http://stackoverflow.com/questions/ask' and href = '/users/48465/jader-dias'. I want to obtain link = 'http://stackoverflow.com/users/48465/jader-dias'. How do I do it in Python? It has to work just as it works in the browser ...
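
One way to approach this, sketched below, is the standard library's `urllib.parse.urljoin`, which resolves a relative reference against a base URL in much the same way a browser does. The variable names are taken from the question; nothing else is assumed.

```python
from urllib.parse import urljoin

window_location = 'http://stackoverflow.com/questions/ask'
href = '/users/48465/jader-dias'

# urljoin resolves href relative to the base URL; a leading "/" replaces
# the whole path of the base, while a bare name would replace only the
# last path segment.
link = urljoin(window_location, href)
print(link)
```

An `href` that is already absolute (for example one starting with `http://`) is returned unchanged, so the same call should cover both cases.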

Help with a regex that strips out leading white space.

I am modifying a core function of the Kohana library, the text::auto_p() function. The function describes itself as "nl2br() on steroids". Essentially, it converts single line breaks to <br />, while text separated by double line breaks is wrapped in <p> tags. The limitation I have found with it is that it will put <br />s in a <pre> element. This ...

Sed Out of a Pair of Words

One can use a command such as the following to replace the word "before" with the word "after" where "before" occurs between the pair of words "begin" and "end": sed '/begin/,/end/ {s/before/after/g}' I am looking for a way to replace "before" with "after" only where it does not occur inside a pair of "begin" and "end". ...
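
The inverse selection can be sketched in Python as a small line-based state machine. This is only an illustration under the assumption that "inside a pair" means the same line ranges that sed's `/begin/,/end/` address would select; the function name `substitute_outside` and its parameters are hypothetical.

```python
import re

def substitute_outside(lines, start=r"begin", end=r"end",
                       old="before", new="after"):
    """Replace `old` with `new` only on lines outside start..end ranges."""
    result = []
    inside = False
    for line in lines:
        if not inside and re.search(start, line):
            # The range opens on this line; like sed, the end pattern
            # is only checked from the following line onward.
            inside = True
            result.append(line)
        elif inside:
            result.append(line)
            if re.search(end, line):
                inside = False
        else:
            result.append(line.replace(old, new))
    return result

sample = ["before 1", "begin", "before 2", "end", "before 3"]
print(substitute_outside(sample))
```

In sed itself, negating the address with `!` is the usual way to express "every line not in this range".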

How to match these strings with Regex?

Basically I have music filenames such as: <source> <target> "Travis - Sing" "Travis - Sing 2001.mp3" "Travis - Sing" "Travis - Sing Edit.mp3" "Travis - Sing" "Travis - Sing New Edit.mp3" "Mission Impossible I" "Mission Impossible I - Main Theme.mp3" "Mission Impossible I" "Mission Impossible II - Main Theme.mp3" "Mesrine - De...

Rewrite URL-string with String.replace in Actionscript 3

Hello, I'm getting a string that looks like this from a database: ~\Uploads\Tree.jpg And I would like to change it in Actionscript3 to Uploads/Tree.jpg Any idea how I can do this in neat way? ...

Regex routers in Zend Framework: how to merge these routers?

Hi there, I need to know how I can merge these routers into one. I want to have just one router instead of these. I appreciate any answer. :) $route = new Zend_Controller_Router_Route_Regex( '([a-z]{2})/(\w+)/(\w+)/(\w+)', array('controller'=>'index', 'action' => 'index', 'module'=>'default', 'lang'=>$lang )...

PHP preg_replace problem

I've run into a hard problem to deal with. I am replacing a-tags and img-tags to fit my suggestions like this. So far so good. $search = array('|(<a\s*[^>]*href=[\'"]?)|', '|(<img\s*[^>]*src=[\'"]?)|'); $replace = array('\1proxy2.php?url=', '\1'.$url.'/'); $new_content = preg_replace($search, $replace, $content); Now my problem is tha...

Tools for data mining hand-written html

I need to convert a large website from static html written entirely by humans into proper relational data. First comes a large number of tables (not necessarily the same for every page), then code like this: <a name=pidgin><font size=4 color=maroon>Pidgin</font><br></a> <font size=2 color=teal>Author:</font><br> <font size=2>Sean ...

Regex Performance Optimization Tips and Tricks

After reading a pretty good article on regex optimization in Java, I was wondering what other good tips there are for creating fast and efficient regular expressions. ...

Python and web-tags regex

Hello, I have the content of a web page and need to get some data from it. It looks like: < div class="deg">DATA< /div> As I understand it, I have to use a regex, but I can't come up with one. I tried the code below but got no results. Please correct me: regexHandler = re.compile('(<div class="deg">(?P<div class="deg">.*?)</div>)') result = ...
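
One problem visible in the snippet is the group name: in Python's `re` syntax, the name in `(?P<name>...)` must be a valid identifier, so `(?P<div class="deg">...)` cannot compile. A minimal sketch follows, assuming the markup is exactly `<div class="deg">DATA</div>` without the stray spaces shown in the excerpt; the group name `data` and the sample string are made up for illustration.

```python
import re

# Sample input; assumed shape of the page fragment.
html = '<div class="deg">DATA</div>'

# Named group "data" captures the text between the tags. DOTALL lets
# the content span several lines; .*? stops at the first closing tag.
pattern = re.compile(r'<div class="deg">(?P<data>.*?)</div>', re.DOTALL)

match = pattern.search(html)
data = match.group('data') if match else None
print(data)
```

For anything beyond a fixed, simple fragment like this, an HTML parser is generally a more robust choice than a regex.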

PHP preg_replace() on multiple items

This is what I have so far: <?php $text = preg_replace('/((\*) (.*?)\n)+/', 'awesome_code_goes_here', $text); ?> I am successfully matching plain-text lists in the format of: * list item 1 * list item 2 I'd like to replace it with: <ul> <li>list item 1</li> <li>list item 2</li> </ul> I can't get my head around wrapping <ul> ...
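
The wrapping step is easier with a callback than with a single replacement string: match the whole run of list lines once, then build the `<li>` items inside the callback. The sketch below shows the idea in Python rather than PHP (where `preg_replace_callback` plays the same role); the function names `lists_to_html` and `wrap` are hypothetical, and it assumes every list line ends with a newline.

```python
import re

def lists_to_html(text):
    """Wrap each run of '* item' lines in <ul>, one <li> per line."""
    def wrap(match):
        # Pull the item text out of every line in the matched run.
        items = re.findall(r'^\* (.*)$', match.group(0), flags=re.MULTILINE)
        body = ''.join(f'<li>{item}</li>\n' for item in items)
        return f'<ul>\n{body}</ul>\n'

    # One or more consecutive lines that start with "* ".
    return re.sub(r'(?:^\* .*\n)+', wrap, text, flags=re.MULTILINE)

print(lists_to_html("* list item 1\n* list item 2\n"))
```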

How to escape a string for use in Boost Regex

I'm just getting my head around regular expressions, and I'm using the Boost Regex library. I have a need to use a regex that includes a specific URL, and it chokes because obviously there are characters in the URL that are reserved for regex and need to be escaped. Is there any function or method in the Boost library to escape a strin...

Regex browser version match

I have a string: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729) I want to know what version of Firefox is in the string (3.5.2). My current regex is: Firefox\/[0-9]\.[0-9]\.[0-9] and it returns Firefox/3.5.2 I only want it to return 3.5.2 from the Firefox version, not...
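
A capturing group is the usual way to return only part of a match. The sketch below uses Python's `re` module for illustration; the question does not say which regex engine is in use, so the exact API for reading the group will differ elsewhere.

```python
import re

ua = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) '
      'Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)')

# Group 1 holds only the dotted version number after "Firefox/".
# \d+ also accepts multi-digit components, unlike [0-9].
match = re.search(r'Firefox/(\d+(?:\.\d+)*)', ua)
version = match.group(1) if match else None
print(version)
```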

What is a good regular expression tester for OS X?

I'm looking for a GUI-based RegExp tester in the vein of rubular.com, or this JavaScript expression tester here, for OS X, to help me when writing regular expressions. It would be really handy for it to work in more than one language (i.e. Python, JavaScript, or Ruby). Other than using MacVim's own find as you type tool, or a commandlin...

Retrieve Html attributes using Regex

I'm in need of a quick way to put a bunch of html attributes in a Dictionary. Like so <body topmargin=10 leftmargin=0 class="something"> should amount to attr["topmargin"]="10" attr["leftmargin"]="0" attr["class"]="something" This is to be done server-side and the tag contents are already available. I just need to weed out the tags w...
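
As an alternative to a regex, a tolerant HTML parser can do the attribute splitting, including unquoted values such as `topmargin=10`. The sketch below uses Python's standard `html.parser` purely as an illustration (the question's `Dictionary` suggests a different platform); the class name `AttrCollector` is hypothetical.

```python
from html.parser import HTMLParser

class AttrCollector(HTMLParser):
    """Collect the attributes of every start tag into one dict."""

    def __init__(self):
        super().__init__()
        self.attr = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; quotes are already
        # stripped, and a valueless attribute has the value None.
        self.attr.update(attrs)

parser = AttrCollector()
parser.feed('<body topmargin=10 leftmargin=0 class="something">')
attr = parser.attr
print(attr)
```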

preg_replace_callback - do twice

I'm trying to get this script working, but it doesn't work. How do I run preg_replace_callback twice, with two different functions? Thanks! function prepend_proxy($matches) { $url = (substr($_GET['url'], 0, 7) == 'http://') ? $_GET['url'] : "http://{$_GET['url']}"; $prepend = $matches[2] ? $matches[2] : $url; $prepe...

regular expression for password

Can someone please give me a regular expression for a password with the following rules? The password should be at least 7 characters long. It should contain a minimum of 3 digits and one alphabetic character. The password can contain digits, letters, and special characters any number of times, as long as there are at least 3 digits. ...
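
Rules of this kind are commonly expressed with one lookahead per requirement. The following is a minimal sketch in Python, assuming "alphabetic" means ASCII letters and that any character other than a newline is allowed; the name `PASSWORD_RE` and the helper `is_valid` are hypothetical.

```python
import re

PASSWORD_RE = re.compile(
    r'(?=(?:\D*\d){3})'   # at least three digits, anywhere
    r'(?=.*[A-Za-z])'     # at least one ASCII letter
    r'.{7,}'              # at least seven characters in total
)

def is_valid(password):
    # fullmatch requires the whole string to satisfy the pattern.
    return PASSWORD_RE.fullmatch(password) is not None

for candidate in ("abc1234", "ab12cde", "1234567", "a1b2c3!"):
    print(candidate, is_valid(candidate))
```

Keeping each rule in its own lookahead makes it easy to add or change a requirement without touching the others.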

Regular Expression to get pascal functions

I have a Pascal code file and need to parse it (using C#) and display all the public functions. My file looks something like this (not actual code): public function Test(str: string):bool; function Test1(str: string):bool; function Test2(str,str1,str2,str3 str4: string):bool; function Test3(str: string):bool; pu...

sed: Prepend every line with the hold space.

I'm trying to convert a large number of files from a plain text layout to CSV. The first few lines of one of the files look like this: SLICE AT X= -0.25 ELEM NO XI-COORD INWARD-NORMAL 1 0 0.000 0.000 0.000 0.000 0.000 0.000 2 0 0.000 0.000 0.000 0.000 0.000 0.000 3 ...

Emacs-style Regex in Info-reader?

I am a Vim-user lost in the Emacs-style Regex of Info-reader. I want to match: $ info find ?How-in-Info-reader? :%s#\(\\;.*\\+\)\|\(\\+.*\\;\)#WORKS!#g INFO: "C-X n" to go through the matches I am looking for the Emacs-counterpart for the Vim-command marked with "?How-in-Info-reader?". How can you find the matche...