regex

regex to match a maximum of 4 spaces

Hi, I have a regular expression to match a person's name. So far I have ^([a-zA-Z\'\s]+)$ but I'd like to add a check to allow for a maximum of 4 spaces. How do I amend it to do this? Edit: what I meant was 4 spaces anywhere in the string ...
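
One possible approach, sketched in Python rather than any particular validation framework: keep the original character class and put a negative lookahead in front of it that rejects any string containing five or more spaces. The names `NAME_RE` and `is_valid_name` are made up for illustration, and the sketch assumes that only the literal space character counts (the original `\s` would also admit tabs and newlines).

```python
import re

# Assumption: "space" means the literal space character, so the class uses ' '
# instead of the original \s. The lookahead fails as soon as five
# "anything, then a space" groups can be found, i.e. five or more spaces.
NAME_RE = re.compile(r"^(?!(?:[^ ]* ){5})[a-zA-Z' ]+$")


def is_valid_name(name: str) -> bool:
    """Return True if name has only letters, apostrophes and at most 4 spaces."""
    return NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_name("a b c d e")` (four spaces) is accepted, while `is_valid_name("a b c d e f")` (five spaces) is rejected.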

How to use regex for utf8 in ruby

In RoR, how do I validate a Chinese or Japanese word in a posting form with UTF-8 encoding? With GBK encoding, [\u4e00-\u9fa5]+ is used to validate Chinese words. In PHP, /^[\x{4e00}-\x{9fa5}]+$/u is used for UTF-8 pages. ...

How do I get a particular word from a string in PHP?

Say you have a string, but you don't know what it contains, and you want to replace all occurrences of a particular word or part of a word with a formatted version of the same word. For example, I have a string that contains "lorem ipsum" and I want to replace the entire word that contains "lo" with "lorem can" so that the end result woul...

Regex - is there something I've done wrong?

This is JavaScript, but a virtually identical regex is failing in PHP too, so I don't think it's language-specific: var r = new RegExp( "^(:19|20)?[0-9][0-9]" // optional 19/20 start followed by 2 numbers + "-" // a hyphen + "(:0?[1-9]|1[0-2])" // optional 0 followed by 1-9, ...
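
A likely culprit, assuming the groups were meant to be non-capturing: `(:19|20)` is an ordinary capturing group whose first alternative starts with a literal colon, whereas the non-capturing form is `(?:19|20)`. The Python sketch below compares the two on the first fragment only; the rest of the expression is cut off in the excerpt, so it is not reproduced here. The helper name `matched_prefix` is hypothetical.

```python
import re

# As written in the question: group matches ":19" or "20".
literal_colon = re.compile(r"^(:19|20)?[0-9][0-9]")
# Assumed intent: optional "19" or "20", without capturing.
non_capturing = re.compile(r"^(?:19|20)?[0-9][0-9]")


def matched_prefix(pattern: re.Pattern, text: str):
    """Return the text matched at the start of the string, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None
```

With the original form, `"1999-01"` only yields `"19"` because the optional group is skipped, so a following hyphen can never line up. The non-capturing form yields `"1999"`.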

How to parse a search term into the various question types?

I am writing an internal application where we let users run a few different types of queries. The application allows users to search a database by any of the following keys: employeeId, name (first or last), companyId, status (workingFullTime, sickLeave, maternityLeave, etc.). The brute force way is to simply make one webform for e...

Profanity Filtering / Profanity Dictionaries / Scunthorpe Problem / Profanity Generation

Here's a clbuttic question. I collect code that attempts to do profanity filtering. I personally like profanity, and whenever possible try to talk everyone out of using profanity filters. The filters always run into the very embarrassing Scunthorpe Problem which tends to make things worse. Of course there are sites that legitimately need...

Hashtable/dictionary/map lookup with regular expressions

I'm trying to figure out if there's a reasonably efficient way to perform a lookup in a dictionary (or a hash, or a map, or whatever your favorite language calls it) where the keys are regular expressions and strings are looked up against the set of keys. For example (in Python syntax): >>> regex_dict = { re.compile(r'foo.') : 12, re.c...
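
As a baseline for comparison, here is a minimal sketch of the straightforward linear scan: try each pattern in insertion order and return the first value whose pattern matches the whole string. It makes no claim to being efficient. The class name `RegexDict`, the choice of `fullmatch`, and the second pattern and its value are assumptions for illustration only.

```python
import re
from typing import Any, Optional


class RegexDict:
    """Maps regex patterns to values; lookup scans the patterns in order."""

    def __init__(self, mapping: dict) -> None:
        # Compile once up front so each lookup only pays for matching.
        self._items = [(re.compile(p), v) for p, v in mapping.items()]

    def lookup(self, text: str) -> Optional[Any]:
        for pattern, value in self._items:
            if pattern.fullmatch(text):  # assumption: whole-string match
                return value
        return None


# 'foo.' -> 12 comes from the question; the second entry is hypothetical.
regex_dict = RegexDict({r"foo.": 12, r"bar\d+": 99})
```

Each lookup costs one match attempt per key, so this only scales to a modest number of patterns.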

Get a method's contents from a cs file

I need to get the contents of every method in a .cs file into a string. What I am looking for is this: given a .cs file as input, a dictionary is returned with the method name as the key and the method body as the value. I have tried Regex and reflection with no success. Can anyone help? Thanks ...

Matching domains with regex for lighttpd mod_evhost (www.domain.com / domain.com / sub.domain.com)

Hi, I'm playing about with lighttpd on a small virtual private server. I have two domains pointing to the server. I am using the latest version of lighttpd and mod_evhost on Ubuntu 8.10. I'm trying to set up a rule such that if anyone requests domain.com or www.domain.com they get served from /webroot/domain.com/www/ Similarly, if anyone ...

Can I use dashes in Named Captures with .NET's System.Text.RegularExpressions?

Is it possible to do something like (?'A-B'\s*) ? ...

What's the quickest way to extract a 5-digit number from a string in C#?

What's the quickest way to extract a 5-digit number from a string in C#? I've got string.Join(null, System.Text.RegularExpressions.Regex.Split(expression, "[^\\d]")); Any others? ...
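
For comparison, a hedged Python sketch of a direct search (not the C# from the question): look for exactly five digits that are not part of a longer digit run. Note that this differs from the split-and-join snippet above, which concatenates every digit in the string. The function name `first_five_digit_number` is made up for illustration, and no speed claim is intended.

```python
import re
from typing import Optional

# Exactly five digits, with no digit immediately before or after.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)")


def first_five_digit_number(text: str) -> Optional[str]:
    """Return the first standalone 5-digit run in text, or None."""
    m = FIVE_DIGITS.search(text)
    return m.group(0) if m else None
```

The lookarounds are what stop a 7-digit run such as `1234567` from producing a false 5-digit hit.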

Why does this RegEx work the way I want it to?

I have a RegEx that is working for me but I don't know WHY it is working for me. I'll explain. RegEx: \s*<in.*="(<?.*?>)"\s*/>\s* Text it finds (it finds the white-space before and after the input tag): <td class="style9"> <input name="guarantor4" id="guarantor4" size="50" type="text" tabindex="10" value="<?php echo $data[guar...

Why are regular expressions such a complicated, cryptic mess?

Often when I see regular expressions, I only see a total mess of characters. Why does it have to be this way? I guess what I really want to know is: are there alternatives to regular expressions that basically do the same thing but are implemented in a human readable language? [UPDATE] Thanks for all the great responses and inspiratio...

Regex Question - One or more spaces outside of a quote enclosed block of text

I want to replace any occurrence of more than one space with a single space, but take no action in text between quotes. Is there any way of doing this with a Java regex? If so, can you please attempt it or give me a hint? ...
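
As a hint, one common technique is to match quoted blocks and runs of spaces in a single alternation, then decide in the replacement which one was hit. The sketch below is in Python rather than Java, and it assumes double quotes that are balanced and contain no escaped quotes. The function name `collapse_spaces` is hypothetical.

```python
import re

# Group 1 captures a whole "quoted block"; the other branch matches 2+ spaces.
QUOTED_OR_SPACES = re.compile(r'("[^"]*")| {2,}')


def collapse_spaces(text: str) -> str:
    """Collapse runs of spaces to one space, leaving quoted blocks untouched."""
    # If group 1 matched, put the quoted block back unchanged;
    # otherwise the match was a run of spaces, so emit a single space.
    return QUOTED_OR_SPACES.sub(lambda m: m.group(1) or " ", text)
```

The same alternation should carry over to other regex engines that let the replacement inspect which group matched.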

Why do people defend the regex syntax?

There is a similar question going around, but it just got the same old answers that people always give about regex syntax. That's not the point here, so please try not to knee-jerk the same old answers. Try to be a little more original and personal about it this time. Regex syntax is very VERY compact, almost too ...

How to find two adjacent repeating digits and replace them with a single digit in Java?

I need to find two adjacent repeating digits in a string and replace them with a single one. How do I do this in Java? Some examples: 123345 should be 12345; 77433211 should be 74321 ...
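
A backreference handles this. The sketch below is in Python rather than Java, though the pattern itself is not Python-specific: capture one digit, require the same digit immediately after it, and replace the pair with the captured digit. The function name `dedupe_pairs` is made up for illustration. It treats exactly two adjacent digits as a pair, as the question describes, so collapsing longer runs would need `(\d)\1+` instead.

```python
import re

# (\d) captures a digit; \1 requires the same digit to follow immediately.
PAIR = re.compile(r"(\d)\1")


def dedupe_pairs(text: str) -> str:
    """Replace each pair of identical adjacent digits with a single digit."""
    return PAIR.sub(r"\1", text)
```

Applied to the two examples from the question, this gives `12345` and `74321`.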

Randomizing RegEx In PHP

Basically I want to use RegEx to grab stuff in between paragraphs in a document. I think the expression would be: <p>.+?</p> Say it grabs 10 items using this RegEx, I then want PHP to randomly choose one of those and then save it to a variable. Any ideas? ...
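
The two steps (collect all matches, then pick one at random) can be sketched as follows. This is Python rather than PHP, and the function name `random_paragraph` is hypothetical. The DOTALL flag is an assumption so that paragraphs spanning several lines are still matched.

```python
import random
import re
from typing import Optional

# Non-greedy match from each <p> to the nearest </p>, across newlines.
PARAGRAPH = re.compile(r"<p>.+?</p>", re.DOTALL)


def random_paragraph(html: str) -> Optional[str]:
    """Return one randomly chosen <p>...</p> block, or None if there are none."""
    matches = PARAGRAPH.findall(html)
    return random.choice(matches) if matches else None
```

The returned string still includes the surrounding `<p>` tags; a capturing group around `.+?` would return only the inner text.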

Regex Grammar

Is there a BNF grammar for regular expressions? ...

rel-tag bookmarklet for last path component of a URL

Many web sites support folksonomy tags. You may have heard of rel-tag, where it says that "The last path component of the URL is the text of the tag". I am looking for a bookmarklet or greasemonkey script (javascript) to get the "last path component" for the URL currently being viewed in the browser, add that tag into another URL, and ...

How do I process a string such as this using regular expressions?

How can I create a regex for a string such as this: <SERVER> <SERVERKEY> <COMMAND> <FOLDERPATH> <RETENTION> <TRANSFERMODE> <OUTPUTPATH> <LOGTO> <OPTIONAL-MAXSIZE> <OPTIONAL-OFFSET> Most of these fields are just simple words, but some of them can be paths, such as FOLDERPATH and OUTPUTPATH. These paths can also be paths with a filename an...