I have a lot of HTML files which have unwanted line feeds. These break things like inline JavaScript and formatting within the pages. I want to come up with a way to strip out all line feeds from the pages that do not appear directly after an HTML tag, e.g. </div>. Does anyone know of a regex and/or program that may be able to achieve this...
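One possible approach, sketched in Python: a negative lookbehind keeps any line feed whose preceding character is ">" and removes the rest. The function name and sample string are made up for illustration, and the sketch assumes plain "\n" line endings (not "\r\n") and that a ">" right before a line feed always closes a tag.

```python
import re


def strip_stray_newlines(html: str) -> str:
    # Remove "\n" only when the character before it is not ">",
    # i.e. when the line feed does not directly follow a tag.
    return re.sub(r"(?<!>)\n", "", html)


# Hypothetical sample: two stray line feeds, two that follow a tag.
sample = "<div>some\ntext</div>\n<script>var a =\n1;</script>\n"
print(strip_stray_newlines(sample))
```

Text that happens to end a line with a literal ">" (for example "a > b") would keep its line feed under this rule, so treat it as a starting point rather than a full HTML-aware cleaner.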
What is the proper way of inserting a pipe into a Java Pattern expression?
I actually want to use a pipe as a delimiter and not the or operator.
For example:
"hello|world".split("|"); --> {"hello", "world"}
...
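For illustration, here is the same idea in Python rather than Java: the pipe has to be escaped so the regex engine treats it as a literal character instead of alternation. In Java the equivalent would be the string literal "\\|" or Pattern.quote("|").

```python
import re

# Escape the pipe by hand so it is a literal delimiter, not alternation.
parts = re.split(r"\|", "hello|world")

# Or let the library build the escaped pattern for an arbitrary delimiter.
same_parts = re.split(re.escape("|"), "hello|world")

print(parts)       # ['hello', 'world']
print(same_parts)  # ['hello', 'world']
```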
I am looking for a way to remove 'stray' carriage returns occurring at the beginning or end of a file, i.e.:
\r\n <-- remove this guy
some stuff to say \r\n
some more stuff to say \r\n
\r\n <-- remove this guy
How would you match \r\n followed by 'nothing' or preceded by 'nothing'?
...
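A minimal Python sketch of one reading of this: "preceded by nothing" is the start-of-input anchor and "followed by nothing" is the end-of-input anchor. Python spells these \A and \Z; other engines may name the absolute end anchor differently. The sample text is reconstructed from the example above, and the function name is hypothetical.

```python
import re

# One CRLF at the very start, or one CRLF at the very end of the input.
STRAY_CRLF = re.compile(r"\A\r\n|\r\n\Z")


def strip_stray_crlf(text: str) -> str:
    return STRAY_CRLF.sub("", text)


text = "\r\nsome stuff to say \r\nsome more stuff to say \r\n\r\n"
cleaned = strip_stray_crlf(text)
print(repr(cleaned))
```

This removes at most one CRLF from each end. To strip a whole run of blank lines, the two alternatives could be changed to (?:\r\n)+, at the cost of also removing the line ending of the last content line.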
Hello,
I have some data similar to this:
aaa1|aaa2|ZZZ|aaa3|aaa4|aaa5|ZZZ|aaa6|aaa7
I want to match all "aaa[0-9]" BETWEEN "ZZZ" (not the ones outside).
So I have some PHP code:
$string = "aaa1aaa2zzzzaaa3aaa4aaa5zzzzaaa6aaa7";
preg_match_all("/zzzz.*(aaa[0-9]).*zzzz/", $string, $matches, PREG_SET_ORDER);
print_r($m...
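A hedged sketch of a two-step approach, in Python rather than PHP: first capture the text between the two "ZZZ" markers, then collect every "aaa[0-9]" inside that capture. A single pattern with one capture group only keeps the last repetition, which is why the steps are split here. The sample is the pipe-delimited data shown above, and the function name is made up.

```python
import re


def items_between_markers(data: str, marker: str = "ZZZ") -> list:
    # Step 1: lazily capture everything between the first two markers.
    between = re.search(re.escape(marker) + r"(.*?)" + re.escape(marker), data)
    if between is None:
        return []
    # Step 2: find every item inside the captured middle section only.
    return re.findall(r"aaa[0-9]", between.group(1))


data = "aaa1|aaa2|ZZZ|aaa3|aaa4|aaa5|ZZZ|aaa6|aaa7"
print(items_between_markers(data))  # ['aaa3', 'aaa4', 'aaa5']
```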
How do I replace invalid characters in a UTF-8 string with whitespace characters (using a regex in PHP5)?
...
I'm working on some homework for my compiler class and I have the following problem:
Write a regular expression for all strings of a's and b's that contain an odd number of a's or an odd number of b's (or both).
After a lot of whiteboard work I came up with the following solution:
(aa|bb)* (ab|ba|a|b) ((aa|bb)* (ab|ba) (aa|bb)* (ab|ba...
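Whatever expression you end up with, a brute-force check against a simple counting rule is a cheap way to test it. The Python sketch below does this for a different candidate (not the expression above, which is cut off): "odd number of a's" written as b*a(b*ab*a)*b*, alternated with the mirror image for b's. All names here are hypothetical.

```python
import re
from itertools import product

# Candidate: odd number of a's, or (mirror image) odd number of b's.
ODD_A = r"b*a(?:b*ab*a)*b*"
ODD_B = r"a*b(?:a*ba*b)*a*"
CANDIDATE = re.compile(f"(?:{ODD_A})|(?:{ODD_B})")


def expected(s: str) -> bool:
    # Ground truth by counting, independent of any regex.
    return s.count("a") % 2 == 1 or s.count("b") % 2 == 1


def mismatches(max_len: int) -> list:
    # Every string over {a, b} up to max_len where regex and count disagree.
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if (CANDIDATE.fullmatch(s) is not None) != expected(s):
                bad.append(s)
    return bad


print(mismatches(8))
```

An empty list means the candidate agrees with the counting rule on every string up to that length. Swapping another expression into CANDIDATE tests it the same way.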
So ereg won't be present in PHP6. I don't really care, because I'm using PCRE functions. But for multibyte strings, I'm using mb_ereg_* functions. The question is: will they be present in PHP6 in the mbstring extension, or will I have to switch to some kind of multibyte PCRE functions?
...
Trying to create a simple text-translator in PHP.
It should match something like:
Bla bla {translator id="TEST" language="de"/}
The language can be optional
Blabla <translator id="TEST"/>
Here is the code:
$result = preg_replace_callback(
'#{translator(\s+(?\'attribute\'\w+)="(?\'value\'\w+)")+/}#i',
array($this, 'translateTe...
Hi everyone,
I am trying to incorporate a regular expression I have used in the past in a different manner into some validation checking through JavaScript.
The following is my script:
var regOrderNo = new RegExp("\d{6}");
var order_no = $("input[name='txtordernumber']").val();
alert(regOrderNo.test(order_no));
Why woul...
I am interested in parsing regexes (not to be confused with using regexes for parsing). Is there a BNF for Java 1.6 regexes (or other languages?)
[NOTE: There is a similar older question which did not lead to an answer for Java.]
EDIT To explain why I need to do this. We are implementing a shallow parser for Natural language processing...
Hello again,
I have a similar issue to my recent post, but with a zip code validator I am trying to convert over to a JavaScript validation process. My script looks like so:
var regPostalCode = new RegExp("\\d{5}(-\d{4})?");
var postal_code = $("input[name='txtzipcode']").val();
if (regPostalCode.test(postal_code) == false)...
I want to validate a string only if it contains '0-9' chars with length between 7 and 9.
What I have is [0-9]{7,9} but this matches a string of ten chars too, which I don't want.
Thanks.
...
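The pattern matches a ten-character string because an unanchored search is satisfied by any 7 to 9 digit run inside it. A small Python sketch of the anchored version follows. re.fullmatch requires the whole string to match; in engines without it, ^[0-9]{7,9}$ is the usual equivalent. The function name is made up.

```python
import re


def is_valid_number(s: str) -> bool:
    # fullmatch only succeeds if the entire string is 7 to 9 digits.
    return re.fullmatch(r"[0-9]{7,9}", s) is not None


print(is_valid_number("1234567"))     # True
print(is_valid_number("123456789"))   # True
print(is_valid_number("1234567890"))  # False: ten digits
print(is_valid_number("123456"))      # False: six digits
```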
On my OS X 10.5.8 machine, using the regcomp and regexec C functions to match the extended regex "(()|abc)xyz", I find a match for the string "abcxyz" but only from offset 3 to offset 6. My expectation was that the entire string would be matched and that I would see a submatch for the initial "abc" part of the string.
When I try the sa...
So here is the string that I'm scraping a page to read (using file_get_contents):
<th>Kills (K)</th><td><strong>4,751</strong></td><td><strong>0</strong></td>
How can I navigate to the above section of the page contents, and then isolate the 4,751 inside the above HTML and load it into $kills?
Difficulty: the number will change and ha...
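A rough sketch in Python rather than PHP: anchor the pattern on the fixed "Kills (K)" header cell and capture the digits-and-commas run in the first cell after it. It assumes the markup always looks like the snippet above. A real HTML parser would be more robust if the page layout varies. The function name is hypothetical.

```python
import re

# Literal header cell, then the first <strong> value that follows it.
KILLS = re.compile(r"<th>Kills \(K\)</th>\s*<td>\s*<strong>([\d,]+)</strong>")


def extract_kills(page: str):
    match = KILLS.search(page)
    if match is None:
        return None
    # Drop thousands separators so the value can be used as a number.
    return int(match.group(1).replace(",", ""))


page = "<th>Kills (K)</th><td><strong>4,751</strong></td><td><strong>0</strong></td>"
print(extract_kills(page))  # 4751
```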
I'm pretty sure regular expressions are the way to go, but my head hurts whenever I try to work out the specific regular expression.
What regular expression do I need to find if a Java String (contains the text "ERROR" or the text "WARNING") AND (contains the text "parsing"), where all matches are case-insensitive?
EDIT: I've presented...
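One way to express "(ERROR or WARNING) AND parsing" in a single pattern is a lookahead for one condition followed by a normal match for the other. The sketch below is Python, not Java. The lookahead idea should carry over to java.util.regex, but that is not tested here. Names are made up.

```python
import re

# Lookahead asserts "parsing" occurs somewhere; the rest of the pattern
# then requires "error" or "warning" somewhere. Both case-insensitive.
PATTERN = re.compile(r"(?=.*parsing).*(?:error|warning)", re.IGNORECASE | re.DOTALL)


def is_parsing_problem(line: str) -> bool:
    return PATTERN.match(line) is not None


print(is_parsing_problem("Error while parsing file"))  # True
print(is_parsing_problem("PARSING finished, 1 Warning"))  # True
print(is_parsing_problem("WARNING: disk full"))  # False
print(is_parsing_problem("parsing ok"))  # False
```

Two separate case-insensitive substring checks combined with a boolean AND would do the same job and may be easier to read.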
I need a regular expression to validate a text box where I can enter an integer / float value in ASP.NET.
...
I am trying to match <a> tags within my content and replace them with the link text followed by the URL in square brackets for a print version. The following example works if there is only the "href". If the <a> contains another attribute, it matches too much and doesn't return the desired result. How can I match the URL and the link ...
In Django, I'm trying to write a URLconf and view that can take a theoretically unlimited number of "tags". The reason for this is to retrieve objects that have been tagged with different combinations of tags.
For example, URLs like this are desirable:
/topics/tag1/tag2/tag3
The above URL would retrieve "topics" that have been tagge...
What is a regular expression I can use in Vim to find conflicts in CVS and possibly other version control systems?
...
I've hit a wall. Does anybody know a good text editor that has search and replace like Notepad++ but can also do multi-line regex search and replace? Basically, I am trying to find something that can match a regex like:
search oldlog\(.*\n\s+([\r\n.]*)\);replace newlog\(\1\)
Any ideas?
...