regex

Regex to split on punctuation excluding URLs

I'm trying to split a string on its punctuation, but the string may contain URLs (which conveniently have all the typical punctuation marks). I have a basic working knowledge of RegEx, but not enough to help me out here. This is what I was using when I discovered the problem: $text[$i] = preg_split('/[\.\?!\-]+/', $post->text); (this ...

Regular Expression (Java) anomaly - explanation sought

Using Java (1.6) I want to split an input string that has components of a header, then a number of tokens. Tokens conform to this format: a ! char, a space char, then a 2 char token name (from constrained list e.g. C0 or 04) and then 5 digits. I have built a pattern for this, but it fails for one token (CE) unless I remove the require...

How to split a string by commas positioned outside of parentheses?

Hi, I have a string in this format: "Wilbur Smith (Billy, son of John), Eddie Murphy (John), Elvis Presley, Jane Doe (Jane Doe)", so basically it's a list of actors' names (each optionally followed by their role in parentheses). The role itself can contain a comma (the actor's name cannot, I strongly hope). My goal is to split this string into ...
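
One possible approach to the question above, sketched in Python: split on a comma only when it is not followed by a closing parenthesis before the next opening one. This assumes the parentheses are balanced and never nested; the names `OUTSIDE_COMMA` and `split_outside_parens` are made up for the illustration.

```python
import re

# A comma counts as a separator only if no ")" follows it before the next "(".
# Assumption: parentheses are balanced and never nested.
OUTSIDE_COMMA = re.compile(r",\s*(?![^()]*\))")


def split_outside_parens(text):
    """Split text on commas that sit outside parentheses."""
    return [part.strip() for part in OUTSIDE_COMMA.split(text)]


actors = (
    "Wilbur Smith (Billy, son of John), Eddie Murphy (John), "
    "Elvis Presley, Jane Doe (Jane Doe)"
)
for entry in split_outside_parens(actors):
    print(entry)
```

With nested parentheses a single lookahead like this is no longer enough, and a small character-by-character scanner that tracks depth would be the safer choice.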

PHP Regex Validation

I need to use PHP to validate usernames, and I only want them to use alphanumeric characters, as well as dash (-), underscore (_) and period (.). Is this right? preg_match("/[A-Za-z0-9_-.]/",$text); ...
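
A hedged sketch of the same check, written in Python rather than PHP. Two things stand out in the pattern above: inside a character class `_-.` reads as a range from underscore to period, which is out of order, and without anchors the pattern only requires one allowed character somewhere in the input. Placing the hyphen last and matching the whole string addresses both. The names `USERNAME` and `is_valid_username` are illustrative only.

```python
import re

# The hyphen goes last in the class so it is a literal, not a range operator.
USERNAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_username(text):
    """Return True if text consists only of letters, digits, '_', '.' or '-'."""
    # fullmatch requires the entire string to match, not just a substring.
    return USERNAME.fullmatch(text) is not None


print(is_valid_username("john_doe-99.x"))
print(is_valid_username("john doe"))
```

In PHP the equivalent idea would be the same character class wrapped in `^` and `$` anchors.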

Regular expression to limit number of characters to 10

I am trying to write a regex that will only allow lowercase letters and up to 10 characters. What I have so far looks like this: pattern: /^[a-z]{0,10}+$/ This does not work/compile. I had a working one that would just allow lowercase letters, which was this: pattern: /^[a-z]+$/ but I need to limit the number of characters to 10. Than...
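
A minimal sketch in Python, assuming the intent is "zero to ten lowercase ASCII letters and nothing else". The trailing `+` after `{0,10}` is the likely culprit: depending on the regex flavor it is either rejected as a second quantifier or read as a possessive quantifier, and dropping it is enough here. The names below are illustrative.

```python
import re

# {0,10} already bounds the repetition; no further quantifier is needed.
LOWER_MAX_10 = re.compile(r"[a-z]{0,10}")


def is_lower_up_to_10(text):
    """Return True for strings of 0 to 10 lowercase ASCII letters."""
    return LOWER_MAX_10.fullmatch(text) is not None


for sample in ["", "hello", "abcdefghij", "abcdefghijk", "Hello"]:
    print(repr(sample), is_lower_up_to_10(sample))
```

If the empty string should be rejected, `{1,10}` would be the bound to use instead.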

Regular Expressions: Extract Dates From String

Hopefully this won't be too difficult, but I'm not too skilled in regular expressions. I have a string that contains two dates and would like to extract the two dates into an array or something using JavaScript. Here's my string: "I am available Thursday, October 28, 2009 through Saturday, November 7, 2009" I would like for the array t...

Regular expressions in stored procedures

Can a regular expression be used inside a stored procedure? If it can, how? Can you give some examples of how to do it? ...

How would you make this into a VIM macro?

So one of the common tasks that I do as a programmer is debugging a live system. And one of the ways that I debug a live system is to capture a verbose log from the console. Typically the log file has around 20 extra lines for every one line I am interested in. To minimize my macro script I went about creating a macro that will grab ONLY ...

Python regular expression

How to parse the string " {'result':(Boolean, MessageString)} " using Python regular expressions to get Boolean and the MessageString separated into variables? ...

RegEx on 'href' tags

I'm trying to run a regular expression on some href tags using JavaScript. Unfortunately reverse positive lookup doesn't work in JavaScript :-( I want to prefix all the relative tags with their full path. I am using this regex-replace pattern ([hs]r[ce]f?)[\s]?=[\s\"\']?(?!http|\/)(.*?)[\s\"\'] <Sorry, as I'm unable to post my HTML sa...

a replace question

I have this string: aa**b**qqidjwljd**p**fjem. I need to replace b with p and p with b: aa**p**qqidjwljd**b**fjem. The way I do this looks like this: myvar.replace("b","1").replace("p","b").replace("1","p"). This is really ugly; is there a better way? Edit: why ugly? Because I have to decide/find an arbitrary set of characte...
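
One way to avoid the placeholder character, sketched in Python under the assumption that the swap is between single characters: a translation table maps every character in a single pass, so `b` and `p` are exchanged without an intermediate value. `SWAP_B_P` and `swap_b_and_p` are names invented for the example.

```python
# Each character is looked up once, so "b" -> "p" and "p" -> "b" cannot collide.
SWAP_B_P = str.maketrans("bp", "pb")


def swap_b_and_p(text):
    """Exchange every 'b' with 'p' and every 'p' with 'b' in one pass."""
    return text.translate(SWAP_B_P)


print(swap_b_and_p("aa**b**qqidjwljd**p**fjem"))  # aa**p**qqidjwljd**b**fjem
```

For multi-character tokens the same single-pass idea can be expressed with a regex alternation and a replacement callback that looks each match up in a dictionary.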

Regex PHP, pattern matching

Hi, I would like to have a regex in PHP which matches a word in a string, but not if the word is already a link. The problem is that I replace words with links, for example: "text" => < a href = "mylink">text< /a>. But sometimes the word is replaced twice, and I want to avoid this. My pattern now is /text/i. E.g. This is my...

Why doesn't this JavaScript function remove square brackets one by one?

var asdf = "a[3] > b[5] > c[1]" function removebracket(){ var newstring = asdf.replace(/\/[^\/]*$/, '') alert(newstring); } <a href="#" onClick="javascript:removebracket();"> remove square brackets one by one </a> ...

Easy Q: UnicodeEncodeError: 'ascii' codec can't encode character

Hi, I'm trying to pass big strings of random html through regular expressions and my Python 2.6 script is choking on this: UnicodeEncodeError: 'ascii' codec can't encode character I traced it back to a trademark superscript on the end of this word: Protection™ -- and I expect to encounter others like it in the future. Is there a modu...

Looking for a regex that match all words, except the ones [inside brackets]

I'm trying to write a regular expression that matches all words inside a specific string, but skips words inside brackets. I currently have one regex that matches all words: /[a-z0-9]+(-[a-z0-9]+)*/i I also have a regex that matches all words inside brackets: /\[(.*)\]/i I basically want to match everything that the first regex matc...

^[A-Za-Z ][A-Za-z0-9 ]* Regular Expression?

This expression describes "first letter should be alphabetic and the remaining letters may be alphanumeric". But I want it to allow special characters as well; for example, when I enter "C#" it raises an error. I want to allow special characters, and the first letter should still be alphabetic. Help me, thank you. ...

How can I parse text in Python?

Sample Text: SUBJECT = 'NETHERLANDS MUSIC EPA' CONTENT = 'Michael Buble performs in Amsterdam Canadian singer Michael Buble performs during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK ' Expected result: " NETHERLANDS MUSIC EPA | 36 before Michael Buble p...

Easy Python Q: UnicodeEncodeError: 'ascii' codec can't encode character ???

Hi, I'm trying to pass big strings of random html through regular expressions and my Python 2.6 script is choking on this: UnicodeEncodeError: 'ascii' codec can't encode character I traced it back to a trademark superscript on the end of this word: Protection™ -- I do not need to capture the non-ascii stuff, but it is a nuisance and I...

Find all folders excluding some paths

Hi, in Visual Studio, when searching for files, how can I find all files that do not contain a certain string in their directory path or file name? For example: I want to find all files that have the word MainRegion, but I do not want files such as: c:\myfiles\file1Fixture.cs c:\myfiles\somedirectory\a.b.tests\filename.xaml So I wan...

Is there a grep or shell script that uses a regex to change filenames?

How can I recursively change xxx-xxx_[a-zA-Z]+_\d+_(\d+)\.jpg into $1.jpg? ...
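
A rough sketch of one way to do this, using Python instead of grep or a shell script. It assumes `xxx-xxx` is a literal prefix exactly as written in the question (it may well stand for something else), and that the whole filename must match the pattern. `new_name` and `rename_tree` are hypothetical helper names; `rename_tree` defaults to a dry run that only prints the planned renames.

```python
import re
from pathlib import Path

# Assumption: "xxx-xxx" is taken literally from the question.
PATTERN = re.compile(r"xxx-xxx_[a-zA-Z]+_\d+_(\d+)\.jpg")


def new_name(filename):
    """Return '<captured digits>.jpg' if filename matches, else None."""
    match = PATTERN.fullmatch(filename)
    return f"{match.group(1)}.jpg" if match else None


def rename_tree(root, dry_run=True):
    """Walk root recursively and rename every matching .jpg file."""
    for path in sorted(Path(root).rglob("*.jpg")):
        target_name = new_name(path.name)
        if target_name is None:
            continue
        target = path.with_name(target_name)
        if target.exists():
            # Skip rather than overwrite when two files map to the same name.
            print(f"skipping {path}: {target} already exists")
            continue
        print(f"{path} -> {target}")
        if not dry_run:
            path.rename(target)


print(new_name("xxx-xxx_abc_12_345.jpg"))  # 345.jpg
```

Calling `rename_tree(".", dry_run=False)` would perform the renames; running it with the default first shows what would change.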