regex

Regular expression to get an attribute from HTML tag

I am looking for a regular expression that can get me the src attribute (case insensitive) from the following HTML snippets in Java. <html><img src="kk.gif" alt="text"/></html> <html><img src='kk.gif' alt="text"/></html> <html><img src = "kk.gif" alt="text"/></html> ...
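
A minimal sketch of one possible pattern for the three snippets above, written in Python rather than Java purely for illustration. The names `IMG_SRC` and `find_img_srcs` are hypothetical, and the pattern assumes the value is always quoted; a real HTML parser is usually more robust than a regex for arbitrary markup.

```python
import re

# Hypothetical names; one possible pattern, not a general HTML parser.
# (["']) captures the opening quote and \1 requires the same closing quote.
# \s* on both sides of "=" tolerates the `src = "..."` spacing variant.
IMG_SRC = re.compile(
    r"""<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def find_img_srcs(html):
    """Return the src values of all <img> tags found in html."""
    return [m.group(2) for m in IMG_SRC.finditer(html)]
```

The same pattern text should carry over to `java.util.regex` with the `CASE_INSENSITIVE` flag, though that is an assumption that has not been tested here.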

Perl - Printing the next line

Hi, I am a noob Perl user trying to get my work done ASAP so I can go home on time today :) Basically I need to print the line that follows each blank line in a text file. The following is what I have so far. It can locate blank lines perfectly fine. Now I just have to print the next line. open (FOUT, '>>result.txt'); die "File is not a...

Optimal Regular Expression: match sets of lines starting with ...

Alright, this one's interesting. I have a solution, but I don't like it. The goal is to be able to find a set of lines that start with 3 periods - not an individual line, mind you, but a collection of all the lines in a row that match. For example, here are some matches (each match is separated by a blank line): ... ...hello ... ...hel...

How to correctly parse a mixed latin/ideographic full text query with regex?

I'm trying to sanitize/format some input using regex for a mixed Latin/ideographic (Chinese/Japanese/Korean) full text search. I found an old example of someone's attempt at sanitizing a Latin/Asian language string on a forum which I cannot find again (full credit to the original author of this code). I am having trouble fully underst...

Is there a token to capture line breaks in multiline regex?

I've run into this problem several times before when trying to do some HTML scraping with PHP and the preg* functions. Most of the time I have to capture structures like this: <!-- comment --> <tag1>lorem ipsum</tag> <p>just more text with several html tags in it, sometimes CDATA encapsulated…</p> <!-- /comment --> In particular I wa...

Regular expression to match if all given words are in a string

Say I have a query like this: "one two three". If I replace the spaces with | (pipe character) I can match a string if it contains one or more of those words. This is like a logical OR. Is there something similar that does a logical AND? It should match regardless of word ordering as long as all the words are present in the string. Un...
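
One common way to express the AND described above is a chain of lookaheads, one per word, anchored at the start of the string. The sketch below is in Python for illustration; `all_words_pattern` is a hypothetical helper name, and treating each word as a whole word (via `\b`) is an assumption the question does not state.

```python
import re


def all_words_pattern(query):
    """Build a regex that matches only if every word in query occurs.

    Each (?=.*\\bword\\b) lookahead asserts that the word appears
    somewhere after the start, without consuming input, so the
    order of the words in the target string does not matter.
    """
    words = query.split()
    lookaheads = "".join(r"(?=.*\b%s\b)" % re.escape(w) for w in words)
    return re.compile("^" + lookaheads, re.IGNORECASE | re.DOTALL)


pattern = all_words_pattern("one two three")
```

Here `pattern.search(text)` returns a match object when all three words are present and `None` otherwise. Dropping the `\b` markers would turn this into a substring test instead of a whole-word test.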

RegEx for all letters (including Chinese, Greek, etc.)

I need a regex that also matches Chinese, Greek, Russian, ... letters. What I basically want to do is remove punctuation and numbers. Until now I removed punctuation and numbers "manually" but that does not seem to be very consistent. Another thing I have tried is /[\p{L}]/ but that is not supported by Mozilla (I use this in a Fi...

How to replace http:// or www with <a href.. in PHP

I've created this regex (www|http://)[^ ]+ that matches every http://... or www.... but I don't know how to write a preg_replace call that works. I've tried preg_replace('/((www|http://)[^ ]+)/', '<a href="\1">\1</a>', $str); but it doesn't work; the result is an empty string. ...

Matching Unicode control characters except for three with Regular Expressions

Hi, I would need to get a Regular Expression, which matches all Unicode control characters except for carriage return (0x0d), line feed (0x0a) and tabulator (0x09). Currently, my Regular Expression looks like this: /\p{C}/u I just need to define these three exceptions now. ...

grep vs Perl. Should I mod this to work, or should I just replace it with Perl code?

I am writing something that will allow users to search through a log of theirs. Currently I have this, where $in{'SEARCH'} is the string they are searching. open(COMMAND, "grep \"$in{'SEARCH'}\" /home/$palace/palace/logs/$logfile | tail -n $NumLines |"); $f = <COMMAND>; if ($f) { print $Title; print "<div id=log>\n"; ...

lighttpd url rewrite to subdomain

How does the lighttpd rewrite work for the following? http://example.com/file_46634643.jpg to http://sub.domain.com/46634643.jpg If it's possible... ...

Extract substring between two tokens. Second token could be missing

Good day, I need to extract a portion of a string which can look like this: "some_text MarkerA some_text_to_extract MarkerB some_text" "some_text MarkerA some_text_to_extract" I need to extract some_text_to_extract in both cases. MarkerA, MarkerB - predefined text strings. I tried these regexps, but with no luck: ".*\sMarkerA(.*)Marke...
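
A minimal sketch of one way to handle the optional second marker: make the capture lazy and let it stop at either MarkerB or the end of the string. It is written in Python for illustration; `extract_between` is a hypothetical helper, and trimming the whitespace around the captured text is an assumption.

```python
import re


def extract_between(text, start="MarkerA", end="MarkerB"):
    """Return the text after `start`, up to `end` or the end of the string.

    (.*?) is lazy, so it stops at the first position where either the
    end marker or the end of the input ($) follows. Returns None when
    the start marker is absent.
    """
    pattern = re.compile(
        re.escape(start) + r"\s*(.*?)\s*(?:" + re.escape(end) + r"|$)",
        re.DOTALL,
    )
    m = pattern.search(text)
    return m.group(1) if m else None
```

For both sample strings in the question this should yield `some_text_to_extract`. A greedy `(.*)` would instead run to the last MarkerB or to the end of the input.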

Help in PHP Regex

Using PHP regex, I need to restrict user input to Arabic, English, digits and the following characters (_, - and space). The user can input strings such as the following: 10-abc 10-من 10-abcمن _abcمن-10 and so on. Please advise. ...

Building a regex engine -- online resources?

I'm interested in building a regex engine, as a side-project, just for learning purposes. I know the theory behind evaluation of regular expressions, and have a sufficient understanding of finite state machines etc. What I'm interested in is how a regex engine is implemented in software. So I was wondering if there was any sort of tuto...

RegExp to strip HTML comments

Looking for a regexp sequence of matches and replaces (preferably php but doesn't matter) to change this (the start and end is just random text that needs to be preserved) IN: fkdshfks khh fdsfsk <!--g1--><div class='codetop'>CODE: AutoIt</div><div class='geshimain'><!--eg1--><div class="autoit" style="font-family:monospace;"><span cla...

PHP and RegEx: Split a string by commas that are not inside brackets (and also nested brackets)

Two days ago I started working on a code parser and I'm stuck. How can I split a string by commas that are not inside brackets, let me show you what I mean: I have this string to parse: one, two, three, (four, (five, six), (ten)), seven I would like to get this result: array( "one"; "two"; "three"; "(four, (five, six), (ten...
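
Because the brackets can nest to any depth, a plain regex is awkward here. The sketch below swaps in a small depth-counting scanner instead of a regex, and is written in Python rather than PHP for illustration; `split_top_level` is a hypothetical name, and the input is assumed to have balanced brackets.

```python
def split_top_level(text, sep=",", open_ch="(", close_ch=")"):
    """Split text on sep, ignoring separators nested inside brackets.

    depth counts how many brackets are currently open; a separator
    only ends the current part when depth is zero. Each part is
    stripped of surrounding whitespace.
    """
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts
```

Applied to the string from the question, this keeps `(four, (five, six), (ten))` together as a single element. The same loop translates directly to PHP; engines with recursive patterns offer a regex route as well, which is not shown here.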

Is it possible to treat macro's arguments as regular expressions?

Suppose I have a C++ macro CATCH to replace the catch statement and that macro receives as a parameter a variable-declaration regular expression, like <type_name> [*] <var_name> or something like that. Is there a way to recognize those "fields" and use them in the macro definition? For instance: #define CATCH(var_declaration) <var_type> <...

Regular expressions in C: examples?

Hello. I'm after some simple examples and best practices of how to use regular expressions in ANSI C. man regex.h does not provide that much help. Thank you. ...

Why does the JavaScript RegExp /^\w+$/ match undefined?

Why does the RegExp /^\w+$/ match undefined? Example code: alert(/^\w+$/.test(undefined)); This will display true in Firefox 3 (only browser I tested it on). ...

What is a regular expression?

I know this question seems stupid, but it isn't. I mean what is it exactly. I have a fair understanding of the parsing problem. I know BNF/EBNF, I've written grammar to parse simple context-free languages in one of my college courses. I just never met regular expressions before! The only thing that I remember about it is that context-fre...