regex

How to get regex to match multiple script tags?

I'm trying to return the contents of any <script> tags in a body of text. I'm currently using the following expression, but it only captures the contents of the first tag and ignores any others after that. Here's a sample of the HTML: <script type="text/javascript"> alert('1'); </script> <div>Test</div> <script type="text/javascript"...
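A minimal Python sketch of the usual fix for "only the first tag is captured": make the body lazy, let `.` cross newlines, and collect every match rather than just the first. The sample text is modelled on the excerpt above; the second script body and the name `script_contents` are assumptions for illustration, and a real HTML parser is generally the safer tool.

```python
import re

# Hypothetical sample modelled on the question; the second script body is assumed.
html = """<script type="text/javascript"> alert('1'); </script>
<div>Test</div>
<script type="text/javascript"> alert('2'); </script>"""

# Lazy body (.*?) so each match stops at the nearest closing tag;
# DOTALL lets "." span line breaks inside a script body.
SCRIPT_BODY = re.compile(r"<script\b[^>]*>(.*?)</script\s*>", re.IGNORECASE | re.DOTALL)

def script_contents(text):
    """Return the stripped body of every <script> element found in text."""
    return [body.strip() for body in SCRIPT_BODY.findall(text)]

print(script_contents(html))
```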

Parsing CSV input with a regex in Java

I know, now I have two problems. But I'm having fun! I started with this advice not to try and split, but instead to match on what is an acceptable field, and expanded from there to this expression. final Pattern pattern = Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?=,|$)"); The expression looks like this without the annoying esc...

Ruby linkify for urls in strings

There have been a few posts about linkifying text using a regex. The most popular is this post. However my spec is a little more tricky: describe TextFormatter do def l(input) TextFormatter.gsub_links!(input){|link| "!!#{link}!!"} end it "should detect simple links" do l("http://www.cnn.com").should == "!!http://www....

Need help making this regex for my whitelist

Hi, I am using a rich HTML editor and I want to make a whitelist of what should be allowed in. I heard that you should use a whitelist instead of a blacklist, since it is easier than trying to make a blacklist. I have even seen some examples where people could hide the script tag in a CSS style part. So this is a sampl...

Will this remove all possible script tags?

Hi, I am trying to make a regex that will just look for and remove script tags (it's the only tag I want removed, since I think it is the only one that can cause damage). Anyway, I know there are many ways to write a script tag that is still valid. Will this catch them? <\s*script\s*>.*?<\s*\/script\s*> Edit: or would it be better to t...

Regex for matching a character, but not when it's enclosed in quotes

Hi, I need to match a colon (':') in a string, but not when it's enclosed in quotes (either a " or a ' character). So the following should each have 2 matches: something:'firstValue':'secondValue' something:"firstValue":'secondValue' but this should have only 1 match: something:'no:match' ...
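One common way to approach this, sketched in Python under the assumption that quotes are balanced and never escaped: match whole quoted runs first so they are consumed, and capture a colon only when it appears outside them. The names `TOKEN` and `unquoted_colon_positions` are made up for this sketch.

```python
import re

# Alternation order matters: quoted runs are tried first and swallowed whole,
# so a colon inside them can never reach the capturing branch.
TOKEN = re.compile(r"""
    "[^"]*"      # a double-quoted run, consumed without capturing
  | '[^']*'      # a single-quoted run, consumed without capturing
  | (:)          # a bare colon, captured in group 1
""", re.VERBOSE)

def unquoted_colon_positions(text):
    """Return the index of every colon that is not inside quotes."""
    return [m.start(1) for m in TOKEN.finditer(text) if m.group(1) is not None]

print(unquoted_colon_positions("something:'firstValue':'secondValue'"))
print(unquoted_colon_positions("something:'no:match'"))
```

The sketch reports positions rather than replacing anything; the same pattern could drive a substitution by returning quoted runs unchanged from a replacement callback.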

Help with a Python regular expression

I am wondering whether the problem below can be solved with one regular expression, or whether I should write a standard loop and evaluate line by line. When I run the included code I get ['Ethernet0/22', 'Ethernet0/24'], but the result should be only ['Ethernet0/23', 'Ethernet0/25']. Any advice on this? import re txt='''# interface Ethernet0/22 stp disabl...

Can't filter filenames in Talend Open Studio using regular expressions

Hi to all! This question is about Talend Open Studio code. I use the tSendMail component as a child job that needs to run when the parent job (tFtpPut) fails. However, in tFtpPut, file names are filtered by filename masks (for example, it will upload a file named Eedoh if I put Ee* as a mask), but in tSendMail that's not the case. I unders...

How can I extract substrings from a string in Perl?

Hi, consider the following strings: 1) Scheme ID: abc-456-hu5t10 (High priority) * 2) Scheme ID: frt-78f-hj542w (Balanced) 3) Scheme ID: 23f-f974-nm54w (super formula run) * and so on in the above format; the parts in bold change across the strings. ==> Imagine I have many strings of the format shown above. I want to pick 3 su...

In JavaScript, how can I replace text in an HTML page without affecting the tags?

I'm trying to figure out how to do a replace with JavaScript. I'm looking at the entire body of the page and would like to replace the keyword matches NOT within an HTML tag. Here is an example: <body> <span id="keyword">blah</span> <div> blah blah keyword blah<br /> whatever keyword whatever </div> </body> <script type...

Regular expression to match all characters on a U.S. keyboard

I'm looking for a regex pattern to match all characters that are found on a U.S. keyboard. Right now, I match only letters, numbers, and white space, so it looks like ^[a-zA-Z0-9\\s]+$ But now I need it to match any character found on a keyboard. I also want it to match if the string is empty. ...
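A possible reading of "characters on a U.S. keyboard", sketched in Python: the printable ASCII range from space (0x20) to tilde (0x7E), plus tab, carriage return, and newline. That definition is an assumption, as is the helper name `is_us_keyboard_text`. Using `*` instead of `+` is what allows the empty string.

```python
import re

# Assumption: "U.S. keyboard characters" means printable ASCII (0x20-0x7E)
# plus tab, carriage return, and newline. "*" admits the empty string.
US_KEYBOARD = re.compile(r"[\x20-\x7E\t\r\n]*")

def is_us_keyboard_text(text):
    """True if every character of text falls in the assumed keyboard set."""
    return US_KEYBOARD.fullmatch(text) is not None

print(is_us_keyboard_text("Hello, World! #42"))
print(is_us_keyboard_text("café"))
```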

Regex.Replace doesn't seem to work with back-reference

I made an application designed to prepare files for translation using lists of regexes. It runs each regex on the file using Regex.Replace. There is also an inspector module which allows the user to see the matches for each regex on the list. It works well, except that when a regex contains a back-reference, Regex.Replace does not replace...

C# Regex: returning a collection of results

string: "<something><or><other>" regex pattern: "<(\w+)><(\w+)><(\w+)>" How do I make a regex call that returns to me a collection of results containing everything between the parentheses? For example, I would want a result set of {"something", "or", "other"}. For bonus points, what is this called? Captures? Capturing groups? Some...
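The parenthesized parts of a pattern are usually called capturing groups. As an illustration of the idea only, here is a Python sketch rather than C#; the helper name `captured_groups` is hypothetical. In .NET, the matched groups are exposed on the match object through its `Groups` property.

```python
import re

def captured_groups(pattern, text):
    """Return the text captured by each group, or [] when there is no match."""
    m = re.fullmatch(pattern, text)
    return list(m.groups()) if m else []

# Pattern and input taken from the question above.
print(captured_groups(r"<(\w+)><(\w+)><(\w+)>", "<something><or><other>"))
```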

Get content between two strings in PHP

What is the best way to obtain the content between two strings? E.g. ob_start(); include('externalfile.html'); ## see below $out = ob_get_contents(); ob_end_clean(); preg_match('/{FINDME}(.|\n*)+{\/FINDME}/',$out,$matches); $match = $matches[0]; echo $match; ## I have used .|\n* as it needs to check for new lines. Is this correct? #...
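The general technique, sketched in Python rather than PHP: escape both markers, capture lazily between them, and use a "dot matches newline" flag instead of an alternation such as `(.|\n*)+`. The function name `between` and the sample text are assumptions for illustration. PCRE's `s` modifier plays the same role as `re.DOTALL` here.

```python
import re

def between(text, start, end):
    """Return the text between the first start marker and the next end marker."""
    # re.escape keeps characters such as "{" and "/" literal;
    # (.*?) is lazy, and DOTALL lets it cross line breaks.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    m = re.search(pattern, text, re.DOTALL)
    return m.group(1) if m else None

# Hypothetical input standing in for the included file's output.
sample = "header {FINDME}line one\nline two{/FINDME} footer"
print(between(sample, "{FINDME}", "{/FINDME}"))
```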

Regular expression for String.Format-like utility

Hi, I'm writing a class called StringTemplate, which formats objects the way String.Format does, but with names instead of indexes for placeholders. Here's an example: string s = StringTemplate.Format("Hello {Name}. Today is {Date:D}, and it is {Date:T}.", new { Name = "World", Date = DateTime.No...

Regex encapsulation in PHP

What are the differences (pros/cons) between using '/' and '#' as the regex encapsulation? E.g. '/' = preg_match('/MYSEARCH}(.+)ENDMYSEARCH/s',$out,$matches); '#' = preg_match('#MYSEARCH}(.+)ENDMYSEARCH#s',$out,$matches); Thanks! ...

Remove colon using VI

Hi, I am trying to do a find and replace in VI to remove a timestamp. I usually do this in VI using the s command, but how do I tell VI that I need to remove colons when the colon is part of the structure of the VI command itself? Ex: " xxxxx xxxxx 24:00:00 CDT" I tried s:24:00:00 CDT::g s:"24:00:00 CDT"::g s:/:::g Any assistance is appre...

Regex doesn't work with multi line

$regpattern4 = "!<media:description type='plain'> (.*) <\/media:description>!"; I am parsing an XML document. The above Regex works if there are no line breaks in the description, but how do I make it work even if there are line breaks? ...

What is the best method for testing URLs against a blacklist in PHP?

I have a script that is scraping URLs from various sources, resulting in a rather large list. Currently I've just got a collection of if statements that I'm using to filter out sites I don't want. This obviously isn't maintainable, so I'm trying to find a fast and powerful solution for filtering against a blacklist of URL masks. The be...

Clojure Parse String

I have the following string: layout: default title: Envy Labs What I am trying to do is create a map from it: layout->default title->"envy labs" Is this possible using sequence functions, or do I have to loop through each line? I'm trying to get a regex to work and failing, using: (apply hash-map (re-split #": " meta-info))...
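A rough sketch of the line-by-line approach, in Python rather than Clojure, assuming each `key: value` pair sits on its own line (the excerpt's line breaks are not preserved, so this layout is a guess). The name `parse_meta` is hypothetical, and values are kept as written rather than lowercased.

```python
def parse_meta(text):
    """Build a dict from lines of the form 'key: value'."""
    result = {}
    for line in text.splitlines():
        # partition splits on the first ": " only, so values may contain colons.
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that have no "key: value" separator
            result[key.strip()] = value.strip()
    return result

# Assumed layout: one pair per line.
meta_info = "layout: default\ntitle: Envy Labs"
print(parse_meta(meta_info))
```

Splitting each line separately avoids the problem of splitting the whole string on ": ", which leaves a value and the next key glued together in one piece.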