regex

Ruby String::gsub! pausing unexpectedly

Hello Everyone! I am working on a VERY simple script to clean up a few hundred thousand small XML files. My current method is to iterate through the directory and (for each file) read the file, use String::gsub! to make all my changes (not sure if this is best) and then I write the new contents to the file. My code looks something like ...

Getting text that is on a different line, with ex in Vim

Let's say I have the following text in Vim: file1.txt file2.txt file3.txt renamed1.txt renamed2.txt renamed3.txt I want a transformation as follows: file1.txt renamed1.txt file2.txt renamed2.txt file3.txt renamed3.txt What I have in mind is something like the following: :1,3 s/$/ <the text that is 4 lines below this line> I'm...

All characters that may be bullet points (e.g. "*") or "dash" points

This question is a simple point (pardon the pun): What are all the characters that may, when starting a paragraph, be reasonably interpreted as indicating (in the Anglo-Saxon demographic) that the paragraph was meant to be a bullet point or a "dash" point? Here are the ones I would expect, so far: Bullets Asterisk: "*", HTML entity ...

Is it possible to pad integers with zeros using regular expressions?

I have a series of numbers of different lengths (varying from 1 to 6 digits) within some text. I want to equalize the lengths of all these numbers by padding shorter numbers with zeros. E.g. The following 4 lines - A1:11 A2:112 A3:223333 A4:1333 A5:19333 A6:4 Should become padded integers A1:000011 A2:000112 A3:223333 A4:001333 A5:019...
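One possible approach, sketched here in Python rather than in whatever tool the asker is using: a plain regex substitution cannot compute how many zeros to add, but a substitution with a callback can. The function name `pad_numbers` and the choice to pad only digits that follow a colon are assumptions made for this illustration.

```python
import re


def pad_numbers(text: str, width: int = 6) -> str:
    """Left-pad every digit run that follows a colon to `width` digits.

    Assumption: only the value after ':' should be padded, so the digit
    in a label such as "A1" is left untouched.
    """
    return re.sub(r"(?<=:)\d+", lambda m: m.group().zfill(width), text)


if __name__ == "__main__":
    print(pad_numbers("A1:11 A2:112 A3:223333 A4:1333"))
```

The lookbehind `(?<=:)` is what keeps the labels intact; without it, the `1` in `A1` would be padded as well.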

Grouping in Regex

I'm trying to do a match in regex. It must match a string of characters with one of the following formats: Start with a C or H, w/ 6 characters following. (Total 7 characters long) Start with KK and with 8 characters following. (Total 10 characters long) The field is limited to 10 typed characters. I have the following: (((C|H).{6})|(KK....
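The two formats described above can be sketched as a single alternation. This is a Python illustration, not the asker's (truncated) pattern; `CODE_PATTERN` and `is_valid_code` are names invented for the example, and a full-string match is assumed to be what the field validation needs.

```python
import re

# Either: C or H followed by exactly 6 characters (7 total),
# or:     KK followed by exactly 8 characters (10 total).
CODE_PATTERN = re.compile(r"[CH].{6}|KK.{8}")


def is_valid_code(value: str) -> bool:
    """Return True when the whole string matches one of the two formats."""
    return CODE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("C123456", "H12345", "KK12345678", "KK1234567"):
        print(candidate, is_valid_code(candidate))
```

Using `fullmatch` (or anchoring with `^` and `$` in other engines) avoids accepting a longer string that merely starts with a valid code.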

Boost C++ regex - how to get multiple matches

Suppose I have a simple regex pattern like "ab." and a string that has multiple matches, like "abc abd". If I do the following... boost::match_flag_type flags = boost::match_default; boost::cmatch mcMatch; boost::regex_search("abc abd", mcMatch, "ab.", flags) Then mcMatch contains just the first "abc" result. How can I get all ...
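The general idea, iterating over successive matches instead of stopping at the first, can be illustrated with Python's `re` module. This is an analogue for comparison only and says nothing about the Boost API; the helper name `all_matches` is an assumption.

```python
import re


def all_matches(pattern: str, text: str) -> list[tuple[int, str]]:
    """Collect (start offset, matched text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    print(re.findall(r"ab.", "abc abd"))
    print(all_matches(r"ab.", "abc abd"))
```

A single search call returns one match; collecting them all means resuming the search after the end of each previous match, which is what the iterator does internally.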

Parse a custom file in C#

Should I be using regular expressions to do this? Is it possible to structure the results as something queryable (IEnumerable, etc.)? I have a file, and I cannot change how it is generated. I wish to create a parser class to extract all the data. Ideally, I would like to then use this class to open the file and have it return a queryable array type structur...

Perl: Multiple global "or"-separated regex conditions in while block leads to an infinite loop?

Hi all, I'm learning Perl and noticed a rather peculiar quirk -- attempting to match one of multiple regex conditions in a while loop results in that loop going on for infinity: #!/usr/bin/perl my $hivar = "this or that"; while ($hivar =~ m/this/ig || $hivar =~ m/that/ig) { print "$&\n"; } The output of this program is: th...

Can a formfield be selected w/mechanize based on the type of the field (eg. TextControl, TextareaControl)?

I'm trying to parse an HTML form using mechanize. The form itself has an arbitrary number of hidden fields and the field names and IDs are randomly generated so I have no obvious way to directly select them. Clearly using a name or ID is out, and due to the random number of hidden fields I cannot select them based on the sequence number...

Iterate over specific files in a directory

I need to get all images that begin with "t_" using glob. What pattern should I use to do this? //get any image files that begin with "t_" -- (t_image.jpg) not (image.jpg) $images = glob("" . $dir . "*.jpg"); foreach($images as $image) { echo $image; } ...
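The filter described above comes down to putting the prefix into the wildcard pattern, i.e. `t_*.jpg` instead of `*.jpg`. A minimal Python sketch of the same idea follows; the function name `thumbnails` is made up for the example, and the same shell-style wildcard should carry over to other glob implementations.

```python
import glob
import os


def thumbnails(directory: str) -> list[str]:
    """Return the .jpg files in `directory` whose names start with "t_"."""
    # glob.escape protects any wildcard characters in the directory name itself.
    pattern = os.path.join(glob.escape(directory), "t_*.jpg")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in thumbnails("."):
        print(path)
```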

deny access to URL- nginx regex

What regex would I use to deny every URL containing "?": Ex. domain.com/? and domain.com/?p=1224 location (need regex){ deny all; } ...

Using SED to append a string to an existing line of text?

Hi, I'm trying to use the Linux sed command to append a path element to the RewriteBase in a .htaccess file. I have tried it with these arguments: #current RewriteBase is "RewriteBase /mypath" sed -ie 's/RewriteBase\(.*\)/RewriteBase \1\/add/g' .htaccess with this result: addriteBase /mypath So sed overwrites the beginning of ...

Regex: match all but two dots

Hi, I'm trying to validate a system path. This path is valid if it begins with certain directories and does not contain two consecutive dots. #valid path /home/user/somedir/somedir_2 #evil path /home/user/../somedir_2/ I know how to check for two dots in a path: \.\. or [\.]{2} But I really want to do something like this: /home/user/<match ev...
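One way to express "starts with this prefix and contains no two consecutive dots anywhere after it" is a negative lookahead. The sketch below is in Python and assumes the required prefix is `/home/user/`, taken from the two example paths; `is_safe_path` is a hypothetical helper name.

```python
import re

# After the fixed prefix, the lookahead rejects any remainder containing "..".
SAFE_PATH = re.compile(r"/home/user/(?!.*\.\.).*")


def is_safe_path(path: str) -> bool:
    """True when `path` starts with /home/user/ and has no ".." after it."""
    return SAFE_PATH.fullmatch(path) is not None


if __name__ == "__main__":
    print(is_safe_path("/home/user/somedir/somedir_2"))
    print(is_safe_path("/home/user/../somedir_2/"))
```

Note that this also rejects harmless names such as `a..b`. A pattern check alone is usually not considered enough for security-sensitive code, where resolving the path and comparing it against the allowed root is the more common approach.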

Javascript regular expression that ignores a substring

Background: I found similar S.O. posts on this topic, but I failed to make it work for my scenario. Apologies in advance if this is a dupe. My Intent: Take every English word in a string, and convert it to an HTML hyperlink. This logic needs to ignore only the following markup: <br/>, <b>, </b> Here's what I have so far. It converts...
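A common trick for "replace X but skip Y" is to match the things to skip first in an alternation and hand them back unchanged. The sketch below shows the idea in Python rather than JavaScript; the link target `https://example.com/define/` and the function name `linkify` are placeholders invented for the example, since the excerpt does not say where the links should point.

```python
import re

# Group 1: markup to leave alone. Group 2: a run of ASCII letters (a "word").
TOKEN = re.compile(r"(<br/>|</?b>)|([A-Za-z]+)")


def linkify(text: str, base_url: str = "https://example.com/define/") -> str:
    """Wrap each word in a hyperlink, leaving <br/>, <b> and </b> untouched."""

    def replace(match: re.Match) -> str:
        if match.group(1):  # protected markup: return it unchanged
            return match.group(1)
        word = match.group(2)
        return f'<a href="{base_url}{word}">{word}</a>'

    return TOKEN.sub(replace, text)


if __name__ == "__main__":
    print(linkify("<b>hello</b><br/>world"))
```

Because the markup alternative is tried first at each position, the `b` and `br` inside the tags are never seen as words.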

How can I normalize/canonize a regular expression pattern?

Hi, I have a complex regular expression I've built with code. I want to normalize it to the simplest (canonical) form that will be an equivalent regular expression but without the extra brackets and so on. I want it to be normalized so I can understand if it's correct and find bugs in it. Here is an example for a regular expression I ...

XSLT 2.0 regex question (opening and closing elements on different matches)

I've simplified the problem somewhat, but I hope I've still captured the essence of my problem. Let's say I have the following simple XML file: <main> outside1 ===BEGIN=== inside1 ====END==== outside2 =BEGIN= inside2 ==END== outside3 </main> Then I can use the following XSLT 2.0: <?xml version="1.0" encodin...

How do I write this regular expression in RoR?

Hi, I have a huge string in the following format. Example: p=" --0016367d537a47795e0489ecb3c7\nContent-Type: text/plain; charset=ISO-8859-1\n\nok this is tested here\n and again going to test it \n\n\nOn Sat, Jun 26, 2010 at 4:20 PM, kumar \n <[email protected]> wrote:\n\n>" From the above huge string I need only the following cont...

How do I escape an apostrophe in my XPath text query with Perl and Selenium?

I have an XPath query which needs to match some text in a span attribute, as follows: my $perl_query = qq(span[text\(\)='It's a problem']); $sel->click_ok($perl_query); Where the text has no apostrophe there is no problem. I've tried the following instead of 'It's a problem': 'It\'s a problem' 'It&apos\;s a problem' 'It\${apos}s a ...
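XPath 1.0 string literals have no escape character, so the usual workaround is to pick the other quote style, or to fall back to `concat()` when the text contains both kinds of quote. The sketch below shows that construction in Python; `xpath_literal` is a hypothetical helper, and whether the resulting locator is accepted by the Perl Selenium client used above is not verified here.

```python
def xpath_literal(text: str) -> str:
    """Return `text` as an XPath 1.0 string literal."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: split on apostrophes and glue the pieces
    # back together with a double-quoted apostrophe inside concat().
    pieces = [f"'{part}'" for part in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


if __name__ == "__main__":
    print(f"span[text()={xpath_literal(chr(39).join(['It', 's a problem']))}]")
```

For the text in the question this yields a double-quoted literal, so the apostrophe needs no escaping at the XPath level; any remaining quoting is then a matter of the host language's string syntax.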

What is a regular expression that finds a line like this: <rect **** />

Hi, I want a regular expression that could be used to find the following lines: <rect width='10px' height ='20px'/> <rect width='20px' height ='22px'/> <circle radius='20px' height ='22px'/> and replace them with these lines: <rect width='10px' height ='20px'></rect> <rect width='20px' height ='22px'></rect> <circle radius='20px' h...
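The replacement described above can be sketched with one capture group for the tag name and one for the attributes. This is a Python illustration under the assumption that attribute values never contain `<` or `>`; for arbitrary XML a real parser would be the safer tool. `expand_self_closing` is a name invented for the example.

```python
import re

# Group 1: tag name. Group 2: everything between the name and the closing "/>".
SELF_CLOSING = re.compile(r"<(\w+)([^<>]*?)\s*/>")


def expand_self_closing(markup: str) -> str:
    """Rewrite <tag attrs/> as <tag attrs></tag>."""
    return SELF_CLOSING.sub(r"<\1\2></\1>", markup)


if __name__ == "__main__":
    print(expand_self_closing("<rect width='10px' height ='20px'/>"))
```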

ASP.NET MVC2 Strange Validation Problem with Regex (DataAnnotations)

I've got my validation wired up through my Service layer, and my Birthdate property looks like this. <DisplayName("birthdate")> _ <RegularExpression("^\d{2}\/\d{2}\/\d{4}$", ErrorMessage:="Birthdate is not in the correct format.")> _ <DisplayFormat(ApplyFormatInEditMode:=True, ConvertEmptyStringToNull:=True, DataFormatString...