questions about regex | ansaurus

regex

Converting Date and Time To Unix Timestamp

I'm displaying the date and time like this: 24-Nov-2009 17:57:35. I'd like to convert it to a Unix timestamp so I can manipulate it easily. I'd need to use regex to match each part of the string, then work out the Unix timestamp from that. I'm awful with regex, but I came up with this. Please suggest improvements ^.^ /((\d){2}+)-((Ja...
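A regex may not be needed for this format at all. Here is a minimal sketch in Python (the question does not name a language, so Python is an assumption), treating the displayed time as UTC, which is also an assumption. `to_unix_timestamp` is a hypothetical helper name.

```python
import calendar
from datetime import datetime


def to_unix_timestamp(text):
    """Parse a string like '24-Nov-2009 17:57:35' into a Unix timestamp."""
    # %b matches abbreviated month names ("Nov") in the current locale;
    # this sketch assumes an English or C locale.
    parsed = datetime.strptime(text, "%d-%b-%Y %H:%M:%S")
    # Assumption: the displayed time is UTC. timegm() reads the struct_time
    # as UTC, unlike time.mktime(), which would use the local time zone.
    return calendar.timegm(parsed.timetuple())


print(to_unix_timestamp("24-Nov-2009 17:57:35"))  # 1259085455
```

If the displayed time is actually local time, the conversion step would need to change accordingly.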

Regular Expression, Back reference or alternate construct...

I am trying to write a regex in .NET to capture each whole function from a list of functions that look something like this: public string Test1() { string result = null; foreach(var item in Entity.EntityProperties) { result +=string.Format("inner string with bracket{0}", "test"); } return result; } public string Test5() { return str...

lazy-evaluation

Explode string by one or more spaces or tabs

How can I explode a string by one or more spaces or tabs? Example: A B C D I want to make this an array. ...
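The idea can be sketched as a split on a character class holding only the space and the tab, repeated one or more times. The sketch below is in Python rather than PHP (an assumption made for illustration), and `split_on_blanks` is a hypothetical name.

```python
import re


def split_on_blanks(text):
    """Split text on runs of one or more spaces or tabs."""
    # Trim leading and trailing blanks first so the result has no
    # empty strings at either end.
    stripped = text.strip(" \t")
    if not stripped:
        return []
    # [ \t]+ matches one or more spaces or tabs, in any mix.
    return re.split(r"[ \t]+", stripped)


print(split_on_blanks("A \t B\t\tC   D"))  # ['A', 'B', 'C', 'D']
```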

Simple regex question: How to find the first parentheses and get the content inside it?

I'm new to regex with PHP. How can I find the contents of the first parentheses in a string that is a couple of paragraphs long? I'm assuming I have to use the preg_match function ...
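One way to approach this, sketched in Python for illustration (the question is about PHP, and `first_parenthesized` is a hypothetical name): match a literal opening parenthesis, capture everything up to the first closing parenthesis, and stop at the first match. Nested parentheses are not handled by this sketch.

```python
import re


def first_parenthesized(text):
    """Return the text inside the first (...) pair, or None if there is none."""
    # \( and \) are literal parentheses; [^)]* captures everything up to
    # the first closing parenthesis, including newlines.
    match = re.search(r"\(([^)]*)\)", text)
    return match.group(1) if match else None


sample = "First paragraph (the part we want).\n\nSecond paragraph (ignored)."
print(first_parenthesized(sample))  # the part we want
```

The pattern itself uses only basic syntax, so it should carry over to preg_match, where the captured text would be in the first group of the matches array.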

Python HTML scraping

Hey, It's not really scraping, I'm just trying to find the URLs in a web page where the class has a specific value. For example: <a class="myClass" href="/url/7df028f508c4685ddf65987a0bd6f22e"> I want to get the href value. Any ideas on how to do this? Maybe regex? Could you post some example code? I'm guessing html scraping libs, su...
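A regex can be avoided here. Below is a minimal sketch using only the standard library's `html.parser`; the names `ClassHrefCollector` and `hrefs_with_class` are hypothetical, and the sketch assumes the links of interest are `<a>` tags whose `class` attribute contains the wanted value as one of its space-separated names.

```python
from html.parser import HTMLParser


class ClassHrefCollector(HTMLParser):
    """Collect href values of <a> tags that carry a given class name."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # A class attribute may hold several space-separated names,
        # and a valueless attribute is reported as None.
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        if self.wanted_class in classes and href:
            self.hrefs.append(href)


def hrefs_with_class(html, wanted_class):
    collector = ClassHrefCollector(wanted_class)
    collector.feed(html)
    collector.close()
    return collector.hrefs


page = '<a class="myClass" href="/url/7df028f508c4685ddf65987a0bd6f22e">link</a>'
print(hrefs_with_class(page, "myClass"))
```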

screen-scraping

html-content-extraction

content URLs regexp

I receive a block of code from the db which occasionally contains URLs, e.g., http://site.tld/lorem.ipsum/whatever Now I want to turn this into a nice clickable link for the user, with a helper method. Such as: <a href="http://site.tld/lorem.ipsum/whatever">http://site.tld/lorem.ipsum/whatever</a> Of course, anyone can do this, [^\s...

Regex: How to retrieve all lines containing strA but not strB in Visual Studio

How can I retrieve all lines of a document containing "strA", but not "strB", in the Visual Studio search box? ...
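In regex flavors that support lookahead, the usual idea is a negative lookahead at the start of the line that rejects "strB", followed by a plain match for "strA". The sketch below shows this in Python; whether the Visual Studio search box accepts the same syntax depends on the version and is not claimed here. `lines_with_a_without_b` is a hypothetical name.

```python
import re

# ^(?!.*strB) fails the line if "strB" appears anywhere in it;
# .*strA.*$ then requires "strA" somewhere on the same line.
# With re.MULTILINE, ^ and $ anchor at every line boundary, and "."
# does not cross newlines, so each match is exactly one line.
PATTERN = re.compile(r"^(?!.*strB).*strA.*$", re.MULTILINE)


def lines_with_a_without_b(document):
    return PATTERN.findall(document)


text = "has strA\nstrA and strB\nnothing\nstrB then strA\nxx strA yy"
print(lines_with_a_without_b(text))  # ['has strA', 'xx strA yy']
```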

Regex question, how to update this C# RegexMatches routine to update/replace the items found?

Hi, can I ask for a pointer on C# and regex? I've got the routine below, which works OK and finds links within CSS. If I wanted to rewrite the links as I find them, and end up with a copy of the initial CSS text with the rewritten links in place, how would I do this? var resultL...

Regular Expression to match a string only when certain characters don't exist

So, here's my question: I have a crawler that goes and downloads web pages and strips those of URLs (for future crawling). My crawler operates from a whitelist of URLs which are specified in regular expressions, so they're along the lines of: (http://www.example.com/subdirectory/)(.*?) ...which would allow URLs that followed the patt...

Best practices for regex performance VS sheer iteration

I was wondering if there are any general guidelines for when to use regex vs. "string".contains("anotherString") and/or other String API calls? While the decision above in favor of .contains() is trivial (why bother with regex if you can do this in a single call), real life brings more complex choices to make. For example, is it better to do t...

How can I do a "does not contain" operation in regex?

This is my string: <br/><span style=\'background:yellow\'>Some data</span>,<span style=\'background:yellow\'>More data</span><br/>(more data)<br/>'; I want to produce this output: Some data,More data Right now, I do this in PHP to filter out the data: $rePlaats = "#<br/>([^<]*)<br/>[^<]*<br/>';#"; $aPlaats = array(); preg_match...

Regex Problem (newbie)

Hi all, I'm writing a little app for spam checking and I'm having problems with a regex. Let's say I have this spam URL: http://hosting.tyumen.ru/tip.html So I want to check that its URL has 2 full stops (subdomain + ending), a slash, a word, a full stop and "html". Here's what I've got so far: <a href="(http://.*?\..*?..*?/.*?.htm...

JavaScript regex: match the first occurrence

Hi, I have this regex (which doesn't do what I want): /^.*\/(eu|es)(?:\/)?([^#]*).*/ which is actually the JS version of: /^.*/(eu|es)(?:/)?([^#]*).*/ Well, it works, of course; it just doesn't do what I want. :) Given these URLs: http://localhost/es -> [1] = es, [2] = '' http://localhost/eu/bla/bla#wop -> [1] = eu, [2] = 'bla/bla' http://loc...

Regex word-breaker in unicode

How do I convert the regular expression \w+ to give me whole words in Unicode, not just ASCII? I use .NET. ...

Problem with regular expression!

Hi, I have used the following pattern for the regular expression for the phone number: pattern="[0-9 -+]+$"; The phone number may contain numbers, hyphen (-), space and plus (+). It works when I use numbers only. When numbers and letters are used, it does not work. What can be the problem? Please let me know. Thanks in advanc...
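Two things in the quoted pattern stand out. Inside a character class, ` -+` reads as a range from space to plus, so the hyphen itself is not in the class; and with no `^` anchor, the pattern only has to match the tail of the input. The sketch below illustrates both points in Python (the question does not say which engine is in use, so this is an assumption, and `is_phone_like` is a hypothetical name).

```python
import re

# The pattern from the question: " -+" inside the class is a RANGE from
# space (0x20) to plus (0x2B), and nothing anchors the start of the string.
ORIGINAL = re.compile(r"[0-9 -+]+$")

# Putting the hyphen last in the class makes it a literal hyphen.
FIXED = re.compile(r"[0-9 +-]+")


def is_phone_like(text):
    # fullmatch() requires the whole string to match, which has the
    # same effect as anchoring the pattern at both ends.
    return FIXED.fullmatch(text) is not None


print(ORIGINAL.search("abc123") is not None)  # True: only the tail "123" has to match
print(is_phone_like("abc123"))                # False
print(is_phone_like("+1 555-0100"))           # True
```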

Removing non-alphaNumerics in MySQL

Hi! Do you know any easy way to remove (or replace) all non-alphanumeric characters from a varchar variable in MySQL? Something like String's replaceAll("[^a-zA-Z0-9]", "I") in Java ('I' is my special character, but "" would also be good) ...

Regular expression to match a pattern either at the beginning of the line or after a space character

I've been trying to dry up the following regexp that matches hashtags in a string with no success: /^#(\w+)|\s#(\w+)/i This won't work: /^|\s#(\w+)/i And no, I don't want to comma the alternation at the beginning: /(^|\s)#(\w+)/i I'm doing this in Ruby - though that should not be relevant I suppose. To give some examples of mat...
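If the objection to /(^|\s)#(\w+)/i is the extra capture group, a non-capturing group `(?:^|\s)` keeps the alternation without adding a capture. The sketch below shows the idea in Python for illustration (the question is about Ruby); `hashtags` is a hypothetical name.

```python
import re

# (?:^|\s) groups the alternation without capturing it, so the only
# capture group is the tag name itself.
HASHTAG = re.compile(r"(?:^|\s)#(\w+)")


def hashtags(text):
    # With exactly one capture group, findall() returns that group
    # for every match.
    return HASHTAG.findall(text)


print(hashtags("#foo bar #baz a#b"))  # ['foo', 'baz']
```

The `a#b` case is skipped because the `#` is preceded by neither the start of the string nor whitespace.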

Get a block of text in a list of blocks using Regular Expressions

Edit2: only regex match solutions, please. Thank you! Edit: I'm looking for a regex solution, if one exists. I have other blocks with the same data that are not XML, and I can't use Perl; I added the Perl tag as I'm more familiar with regexes in Perl. Thanks in advance! I have a list like this: <Param name="Application #" value="1"> <Param ...

C# regex: get all URIs, but not from a specific domain

Hi, I have a C# regex which gives me all URIs in a document. It's this: <a[^>]*\shref=[\""\'][^>]*" This one works, but I want to exclude all URIs (matches) which have the word 'doubleclick.net' in them, because I want to leave those URIs untouched and add some code to the others. I've tried this: ((?!doubleclick.net...

Using regex to extract the username from an email address

My string of text looks like this: [email protected] (John Doe) I need to get just the part before the @ and nothing else. The text is coming from a simple XML object, if that matters. The code I have looks like this: $authorpre = $key->{"author"}; $re1='((?:[a-z][a-z]+))'; if ($c=preg_match_all ("/".$re1."/is", $authorpre, $matc...
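The idea can be sketched as "capture everything up to the first @". The example below is in Python for illustration (the question uses PHP). The address in the question is obscured, so the sample input here is made up, and `local_part` is a hypothetical name.

```python
import re


def local_part(author):
    """Return the part of the address before the @, or None if there is none."""
    # [^@\s]+ captures a run of characters that are neither "@" nor
    # whitespace; the "@" right after it marks where the username ends.
    match = re.match(r"\s*([^@\s]+)@", author)
    return match.group(1) if match else None


# Hypothetical sample in the same shape as the question's string.
print(local_part("jdoe@example.com (John Doe)"))  # jdoe
```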
