regex

Regular expression doesn't work as expected...

How can this regular expression also match strings that have an underscore (_) as their last character? It should only match strings made up of alphabetical characters, in mixed lower- and uppercase. However, the regular expression also returns 'action_': $regEx = '/^([a-zA-Z])[a-zA-Z]*[\S]$|^([a-zA-Z])*[\S]$|^[a-zA-Z]*[\S]$/'; ...
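A likely cause is that `[\S]` matches any non-whitespace character, and an underscore is one. The sketch below transcribes the pattern from the question into Python (the PHP delimiters are dropped) and compares it with a letters-only pattern. The names `original`, `letters_only`, and `matches` are made up for illustration, and the letters-only pattern assumes the intent is "one or more ASCII letters and nothing else".

```python
import re

# Pattern from the question, without the PHP '/' delimiters.
original = re.compile(r'^([a-zA-Z])[a-zA-Z]*[\S]$|^([a-zA-Z])*[\S]$|^[a-zA-Z]*[\S]$')

# Assumed intent: one or more ASCII letters, nothing else.
letters_only = re.compile(r'^[a-zA-Z]+$')


def matches(pattern, text):
    """Return True if the pattern matches somewhere in text."""
    return pattern.search(text) is not None


print(matches(original, 'action_'))      # the trailing [\S] accepts '_'
print(matches(letters_only, 'action_'))  # '_' is not in [a-zA-Z]
print(matches(letters_only, 'Action'))
```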

Scala - replaceAllIn

First off, I'm new to Scala. I'm trying to make a template parser in Scala (similar to Smarty (PHP)). It needs to search through the document, replacing anything inside "{{ }}" tags, with anything provided in the HashMap. I'm currently stuck here: import scala.collection.mutable.HashMap import scala.io.Source class Template(filename:...

BBCode, preg_replace, and named capturing groups

What I thought was going to be an easy implementation of two lines of code and a function, turned out to be made of fail. On my webpage, I want to be able to type [text]1[/text], and what it will do is pull the title of that ID. function textFormat($text) { $raw = array( '\'\[text\](?P<id>.*?)\[/text\]\'is' ); ...

Is the lazy version of the 'optional' quantifier ('??') ever useful in a regular expression?

I cannot think of a situation where I'd want to use '??' in a regular expression, but maybe I'm not thinking hard enough. ...
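One case where `??` changes the result is when what follows it could also consume the optional character, so the choice affects how the text is split between groups. A minimal Python sketch (the input string and variable names are only illustrative):

```python
import re

# Greedy optional: the first group takes one "a" when it can.
greedy = re.match(r'(a?)(a*)', 'aaa').groups()

# Lazy optional: the first group tries to match nothing first,
# leaving every "a" for the second group.
lazy = re.match(r'(a??)(a*)', 'aaa').groups()

print(greedy)
print(lazy)
```

Both patterns match the same overall text; only the contents of the capture groups differ.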

Finding malformed XHTML with Javascript

Does anyone have a good way of finding if a string contains malformed XHTML using Javascript? Since my page allows 'user' generated XHTML returns (the users can be trusted) and injects it into the DOM, I want a way to check if there are unclosed or overly closed tags, and encode them as < and > so that it will simply display the e...


Regular Expression to find src from IMG tag.

Hi, I have a web page. From that I want to find all the IMG tags and get the SRC of those IMG tags. What would be the regular expression to do this? Some explanation: I am scraping a web page. All the data is displayed correctly except the images. To solve this, I now have an idea: to find the SRC and replace it, e.g. /images/header...

What does the regular expression /^\s*$/ do?

What does this expression in Perl programming do? $variable =~ /^\s*$/; ...
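Read piece by piece, `^` anchors at the start, `\s*` allows zero or more whitespace characters, and `$` anchors at the end, so the pattern matches strings that are empty or contain only whitespace. The same pattern can be tried in Python as a rough sketch (the helper name `is_blank` is made up for illustration):

```python
import re

BLANK = re.compile(r'^\s*$')


def is_blank(text):
    """Return True if text is empty or consists only of whitespace."""
    return BLANK.search(text) is not None


print(is_blank(''))
print(is_blank('  \t '))
print(is_blank(' a '))
```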

PHP/Javascript RegExp - Non-capturing group

Hi all, I have three variations of a string: 1. view=(edit:29,30) 2. view=(edit:29,30;) 3. view=(edit:29,30;x:100;y:200) I need a RegExp that: capture up to and including ",30" capture "x:100;y:200" - whenever there's a semicolon after the first match; WILL NOT include leftmost semicolon in any of the groups; entire string on the r...

regex to turn URLs into links without messing with existing links in the text

Hi there, I am trying to convert URLs in a piece of text into hyperlinks using regular expressions. I have managed to achieve this, but the problem is when there are already existing links in the text, so bla bla blah www.google.com bla blah <a href="www.google.com">www.google.com</a> should result in bla bla blah <a href="http://www...

Javascript regex pattern

Hi, I have a combo box and an input box. If I enter any letter in the input box, then all the words or sentences that match that letter should be displayed in the combo box, assuming they are contained in a given list of words or sentences. ex1:) input box: a combo box : America Australia joy is in Austr...

Why don't my variables interpolate correctly into my Perl substitution pattern?

I'm writing a complex script that takes the XML backup of a Blogger blog and converts it to InDesign Tagged Text to be laid out in a book. I'm using a whole bunch of regular expressions to clean out the HTML tags of each blog post and convert them to InDesign tags. For example: <p>A really long paragraph.</p> -> <ParaStyle:Main text>A r...

Javascript replace() with case-change

Is there an easy way to change the case of a matched string with javascript? Example String : <li>something</li> Regex : /<([\w]+)[^>]*>.*?<\/\1>/ And what I'd like to do is replace the match $1 to all capital letters (inside the replace if possible). I'm not entirely sure when $1 is a valid match and not a string -- '$1'.toUpperCas...
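The usual approach is to pass a function as the replacement instead of a string, so the case change runs on the actual matched text rather than on the literal '$1'. JavaScript's replace() accepts a function in the same way; the sketch below shows the idea in Python with `re.sub`. The pattern is adapted from the question with two extra capture groups so the whole element can be rebuilt, and the names `TAG`, `upper_tag`, and `uppercase_tags` are hypothetical.

```python
import re

# Adapted from the question's pattern; groups 2 and 3 are added here
# so the attributes and the inner text can be put back unchanged.
TAG = re.compile(r'<(\w+)([^>]*)>(.*?)</\1>')


def upper_tag(match):
    """Rebuild the element with its tag name in upper case."""
    name = match.group(1).upper()
    return f'<{name}{match.group(2)}>{match.group(3)}</{name}>'


def uppercase_tags(html):
    return TAG.sub(upper_tag, html)


print(uppercase_tags('<li>something</li>'))
```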

How can I search and replace in XML with Python?

I am in the middle of making a script for translating XML documents. It's actually pretty cool: the idea is (and it is working) to take an XML file (or a folder of XML files), open it, parse the XML, get whatever is in between some tags, translate it using the Google Translate API, and replace the content of the xml files...

Regex Extract html Body

How would I use Regex to extract the body from an HTML doc, taking into account that the html and body tags might be in uppercase, lowercase or might not exist? ...

regex to replace "foo-some white space-bar" with "fubar"

I'm a newcomer to regular expressions, and am having a hard time with what appears to be a simple case. I need to replace "foo bar" with "fubar", where there is any amount and variety of white space between foo and bar. For what it's worth, I'm using php's eregi_replace() to accomplish this. Thanks in advance for the help. ...
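The key piece is `\s+`, which matches one or more whitespace characters of any kind (spaces, tabs, newlines). A minimal sketch in Python rather than PHP, assuming at least one whitespace character is required and that matching should be case-insensitive as with eregi_replace(); the function name `to_fubar` is made up:

```python
import re


def to_fubar(text):
    """Replace 'foo', any run of whitespace, then 'bar' with 'fubar'."""
    return re.sub(r'foo\s+bar', 'fubar', text, flags=re.IGNORECASE)


print(to_fubar('foo bar'))
print(to_fubar('say foo \t\n  bar now'))
```

Use `\s*` instead of `\s+` if "foobar" with no whitespace at all should also be replaced.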

How to: PHP dynamic url checking

I am making an admin panel for a small project. I want to use dynamic URLs to edit specific data entries. For instance: file.php?edit&n=53 I want this URL to edit entry 53. I use a switch statement to check for the edit page, but how do I check whether or not the URL has the &n=x extension in the same switch statement? Ex: switch $...

How to read this regular expression pattern?

[^\x20-\x7E] I saw this pattern used for a regular expression in which the goal was to remove non-ascii characters from a string. What does it mean? ...
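The brackets define a character class, the leading `^` negates it, and `\x20-\x7E` is the range from the space character (hex 20) to the tilde (hex 7E), which covers the printable ASCII characters. So the class matches any single character outside that range. Note that this also includes ASCII control characters such as tab and newline, not only non-ASCII characters. A small Python sketch (the function name `strip_non_printable_ascii` is hypothetical):

```python
import re

NON_PRINTABLE_ASCII = re.compile(r'[^\x20-\x7E]')


def strip_non_printable_ascii(text):
    """Remove every character outside the range space (0x20) to tilde (0x7E)."""
    return NON_PRINTABLE_ASCII.sub('', text)


# The accented letter and the tab are both removed.
print(strip_non_printable_ascii('caf\u00e9\tok'))
```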

preg_match_all JS equivalent?

Is there an equivalent of PHP's preg_match_all in Javascript? If not, what would be the best way to get all matches of a regular expression into an array? I'm willing to use any JS library to make it easier. ...

Awk/etc.: Extract Matches from File

I have an HTML file and would like to extract the text between <li> and </li> tags. There are of course a million ways to do this, but I figured it would be useful to get more into the habit of doing this in simple shell commands: awk '/<li[^>]+><a[^>]+>([^>]+)<\/a>/m' cities.html The problem is, this prints everything whereas I simpl...

find & replace part of text

I am trying to do a search and replace using GREP/Regex. Here is what I am searching for: <div align="center" class="orange-arial-11"><b>.+<br> I want to remove the <b>, <br> tags, and place <h3> tags around what .+ finds. But I can't get what .+ finds to stay when it does the replace. For example, I want to find this <div align="ce...