The list of valid XML characters is well known; as defined by the spec, it is:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
My question is whether or not it's possible to make a PCRE regular expression for this (or its inverse) without actually hard-coding the codepoints, by using Unicode general categories. An...
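As a baseline for comparison, the hard-coded form of the inverse class can be sketched as below. This is only a sketch: it uses Python's `re` module rather than PCRE, it does not use Unicode general categories, and the helper name `strip_invalid_xml_chars` is made up for illustration.

```python
import re

# Negated character class built directly from the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove every character that falls outside the XML Char ranges (hypothetical helper)."""
    return INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # NUL and vertical tab are outside the allowed ranges; tab is allowed.
    print(repr(strip_invalid_xml_chars("a\x00b\x0bc\td")))
```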
I'm trying to find URLs in some text, using JavaScript code. The problem is that the regular expression I'm using relies on \w to match letters and digits inside the URL, but \w doesn't match non-English characters (in my case, Hebrew letters).
So what can I use instead of \w to match all letters in all languages?
...
How can I split by word boundary in a regex engine that doesn't support it?
Python's re can match on \b but doesn't seem to support splitting on it. I seem to recall dealing with other regex engines that had the same limitation.
example input:
"hello, foo"
expected output:
['hello', ', ', 'foo']
actual Python output:
>>> re.comp...
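One possible workaround, sketched here on the assumption that "word" means a run of `\w` characters, is to match alternating runs of word and non-word characters instead of splitting on `\b`. The function name `split_on_word_boundaries` is made up for this example.

```python
import re


def split_on_word_boundaries(text):
    """Return alternating runs of word and non-word characters (hypothetical helper)."""
    # \w+ grabs a run of word characters, \W+ a run of everything else,
    # so the pieces meet exactly where \b would have matched.
    return re.findall(r"\w+|\W+", text)


if __name__ == "__main__":
    print(split_on_word_boundaries("hello, foo"))  # ['hello', ', ', 'foo']
```

Because `findall` never returns empty strings here, there are no empty leading or trailing pieces to filter out.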
I am tired of always trying to guess whether I should escape special characters like '()[]{}|' etc. when using the many implementations of regexps.
The rules differ between, for example, Python, sed, grep, awk, Perl, rename, Apache, find and so on.
Is there any rule set which tells when I should, and when I should not, escape special characters? Do...
So, I have been working on this domain name regular expression. So far, it seems to pick up domain names with SLDs and TLDs (with the optional ccTLD), but there is duplication of the TLD listing. Can this be refactored any further?
params[:domain_name].downcase.strip.match(/^[a-z0-9\-]{2,63}
\.((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdef...
Hello, I'm trying to make a regular expression to match whitespace, and so far I'm doing this:
Powered[\s]*[bB]y.*MyBB
I know it should work because I've tried it with RegexBuddy and it says it does, but when I try to run it in Eclipse it marks an error saying it's not a valid escape sequence and it automatically adds 2 '\' renderi...
I'm trying to match the parts of a version number (Major.Minor.Build.Revision) with C# regular expressions. However, I'm pretty new to writing Regex and even using Expresso is proving to be a little difficult. Right now, I have this:
(?<Major>\d*)\.(?<Minor>\d*)\.(?<Build>\d*)\.(?<Revision>\d*)
This works, but requires that every part...
Hi,
I'm working on a project converting some legacy Perl CGI forms to PHP. A lot of this requires finding and replacing information. In one such case, I have lines like this in the Perl script:
<INPUT type="radio" name="trade" value="1" $checked{trade}->{1}>
which needs to read:
<INPUT type="radio" name="trade" value="1" ...
I am working on code generation and ran into a snag with generics. Here is a "simplified" version of what is causing me issues.
Dictionary<string, DateTime> dictionary = new Dictionary<string, DateTime>();
string text = dictionary.GetType().FullName;
MessageBox.Show(text);
With the above code snippet the value for "text" is as follows...
How can I make a pattern match so long as it's not inside of an HTML tag?
Here's my attempt below. Anyone have a better/different approach?
import re
inputstr = 'mary had a <b class="foo"> little loomb</b>'
rx = re.compile('[aob]')
repl = 'x'
outputstr = ''
i = 0
for astr in re.compile(r'(<[^>]*>)').split(inputstr):
    i = 1 - i
...
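A different approach, sketched under the assumption that tags never contain a literal `>` inside attribute values: split on tags with a capturing group, then apply the replacement only to the pieces at even indices, which are the text between tags. The name `replace_outside_tags` is made up for this example.

```python
import re

# The capturing group makes split() keep the tags in the result list.
TAG_SPLITTER = re.compile(r"(<[^>]*>)")


def replace_outside_tags(text, pattern, repl):
    """Apply pattern.sub(repl, ...) only to text outside HTML tags (hypothetical helper)."""
    parts = TAG_SPLITTER.split(text)
    # With one capturing group, even indices hold plain text and odd indices hold tags.
    for i in range(0, len(parts), 2):
        parts[i] = pattern.sub(repl, parts[i])
    return "".join(parts)


if __name__ == "__main__":
    inputstr = 'mary had a <b class="foo"> little loomb</b>'
    print(replace_outside_tags(inputstr, re.compile("[aob]"), "x"))
    # mxry hxd x <b class="foo"> little lxxmx</b>
```

This is not a real HTML parser, so comments, scripts, and unusual attribute values would need a proper parser instead.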
I'm trying to capture the id of an element that will be randomly generated. I can successfully capture the value of my element id like this...
| storeAttribute | //div[1]@id | variableName |
Now my variable will be something like...
divElement-12345
I want to remove 'divElement-' so that the variable I am left with is '12345' so th...
Hi,
Advance New Year Wishes to All.
I have an error log file whose contents follow the pattern parameter, result, stderr (stderr can span multiple lines).
$cat error_log
<parameter>:test_tot_count
<result>:1
<stderr>:Expected "test_tot_count=2" and the actual value is 3
test_tot_count = 3
<parameter>:test_one_count
<result>:0
<stder...
I want to store part of an id, and throw out the rest. For example, I have an html element with an id of 'element-12345'. I want to throw out 'element-' and keep '12345'. How can I accomplish this?
I can capture and echo the value, like this:
| storeAttribute | //pathToMyElement@id | myId |
| echo | ${!-myId-!} | |
When I run the test...
In my ASP.NET application, I want to use regular expressions to change URLs into hyper links in user posts, for example:
http://www.somesite.com/default.aspx
to
<a href="http://www.somesite.com/default.aspx">http://www.somesite.com/default.aspx</a>
This is fairly easy using Regex.Replace(), but the problem I'm having is th...
I have heard of regular expressions and only seen use cases for a few things so I don't think of using them very often. In the past I have done a couple of things and it has taken me hours to do. Later I talk to someone and they say "here is how to do it using a regular expression".
So what are things for which you have used Regular E...
I need to identify what character set my input belongs to.
The goal is to distinguish between Arabic and English words in a mixed input (the input is Unicode and is extracted from XML text nodes).
I have noticed the class Character.UnicodeBlock: is it related to my problem? How can I get it to work?
Edit:
The Character.UnicodeBlock ...
Something like ".//div[@id='foo\d+']" to capture div tags with id='foo123'.
I'm using .NET, if that matters.
...
I know it is possible to match the word and then reverse the match using a tool's options (e.g. with grep -v). However, I want to know whether it is possible, using regular expressions alone, to match lines which do not contain a specific word, say hede?
Input:
Hoho
Hihi
Haha
hede
# grep "Regex for do not contain hede" Input
Output:
Hoho
Hihi
Haha
...
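One way this could look, assuming an engine that supports negative lookahead: the sketch below uses Python's `re` module, not grep, and the function name `lines_without` is made up for illustration.

```python
import re


def lines_without(word, lines):
    """Keep only the lines that do not contain `word` (hypothetical helper)."""
    # (?!.*word) fails at the start of the line if the word occurs anywhere after it;
    # otherwise .*$ matches the whole line.
    pattern = re.compile(r"^(?!.*" + re.escape(word) + r").*$")
    return [line for line in lines if pattern.match(line)]


if __name__ == "__main__":
    print(lines_without("hede", ["Hoho", "Hihi", "Haha", "hede"]))
    # ['Hoho', 'Hihi', 'Haha']
```

Engines without lookahead support, such as basic POSIX regular expressions, cannot use this pattern as written.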
I am using a simple regex to replace break tags with newlines:
br_regex = /<br>/;
input_content = input_content.replace(br_regex, "\n");
This only replaces the first instance of a break tag, but I need to replace all of them. preg_match_all() would do the trick in PHP, but I'd like to know the JavaScript equivalent. Thanks!
...
I am working on this Yahoo Pipes regex and I found a bug that I'm unable to wrap my mind around.
I have a URL from which I extract digits, concatenate them, build an img HTML tag and embed it. The issue is that the URL is presented in a non-padded way, but the linked image has the zeroes. Therefore, when there is a day or a month with single di...