questions about regex | ansaurus

regex

How can I optimize this or is there a better way to do it? (HTML Syntax Highlighter)

Hello everyone, I have made an HTML syntax highlighter in C# and it works great, but there's one problem. First off, it runs pretty fast because it highlights line by line, but when I paste more than one line of code or open a file I have to highlight the whole file, which can take up to a minute for a file with only 150 lines of co...

syntax-highlighting

Basic Profanity Filter in Objective C for iPhone

How have you like-minded individuals tackled the basic challenge of filtering profanity? Obviously one can't possibly tackle every scenario, but it would be nice to have one at the most basic level as a first line of defense. In Obj-C I've got NSString *tokens = [text componentsSeparatedByString:@" "]; And then I loop through each to...

What is the efficient way to find some pattern in a big text?

I want to extract email addresses from a large text file. What is the best way to do it? My idea is to find '@' in the text and use a regex to find the email address in a substring starting (for example) 256 chars before this position, with a length of 512. P.S.: Put simply, I want to know the best and most efficient way to find some pattern (...
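One possible approach, sketched here in Python rather than the asker's language: scan the input line by line with a single precompiled pattern instead of searching for '@' and slicing windows by hand. The pattern below is a deliberately simplified approximation of an email address, and the names `EMAIL_RE` and `extract_emails` are made up for this illustration.

```python
import re

# Simplified email pattern (an assumption for this sketch, not a full
# RFC 5322 matcher): local part, '@', domain, dot, alphabetic TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(lines):
    """Collect every match from an iterable of text lines."""
    found = []
    for line in lines:
        found.extend(EMAIL_RE.findall(line))
    return found

sample = [
    "contact alice@example.com or bob.smith@mail.example.org.",
    "no address on this line",
]
print(extract_emails(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, so a large file is never loaded into memory all at once.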

split with javascript

Hi, below is something I am trying to do with JavaScript. If I have a string like str = "how are you? hope you are doing good" ; now I want to split it with ? (or . or !) but I don't want to lose the "?". Instead I want to break the string just after the question mark in such a way that the question mark is with the first segment that we hav...
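A minimal sketch of one way to keep the punctuation with the preceding segment, written in Python for illustration: match the segments themselves rather than splitting on the delimiters. The function name `split_keep_punct` is hypothetical, and since the question is cut off, the exact output the asker wants is an assumption.

```python
import re

def split_keep_punct(text):
    """Return segments that each end with their own '.', '?' or '!'."""
    # A segment is a run of non-terminators followed by any terminators.
    parts = re.findall(r"[^.?!]+[.?!]*", text)
    return [p.strip() for p in parts if p.strip()]

print(split_keep_punct("how are you? hope you are doing good"))
# ['how are you?', 'hope you are doing good']
```

The same idea carries over to JavaScript through `String.prototype.match` with a global regex.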

php's preg_match returning different number of matches for same pattern

I'm trying out preg_match with a roman numeral to integer converter. The problem is, for certain inputs, preg_match seems to be giving too few matches. The code: function romanNumeralToInt($romanNumeral) { preg_match ( '/^(M?M?M?)' .'((CM)|(CD)|((D?)(C?C?C?)))' .'((XC)|(XL)|((L?)(X?X?X?)))' .'((IX)|(IV)...

How can I remove certain characters from inside angle-brackets, leaving the characters outside alone?

Edit: To be clear, please understand that I am not using Regex to parse the html, that's crazy talk! I'm simply wanting to clean up a messy string of html so it will parse. Edit #2: I should also point out that the control character I'm using is a special Unicode character - it's not something that would ever be used in a proper tag unde...

string-manipulation

Regex Problem (C#)

HTML: <TD style="DISPLAY: none">999999999</TD> <TD class=CLS1 >Name</TD> <TD class=BLACA>271229</TD> <TD>220</TD> <TD>343,23</TD> <TD>23,0</TD> <TD>222,00</TD> <TD>33222,8</TD> <TD class=blacl>0</TD> <TD class=black>0</TD> <TD>3433</TD> <TD>40</TD> I need the values inside the td elements. How can I do it in C#? I want a string array: 999999999 Name 271229 2...
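As a rough illustration in Python (not the C# the asker wants), a capture group can pull the text between each opening and closing td tag. This assumes flat cells with no nested markup, as in the sample; for arbitrary HTML a real parser is the sturdier choice. The name `td_values` is hypothetical.

```python
import re

def td_values(html):
    """Return the text content of each <td>...</td> cell, case-insensitively."""
    # [^>]* skips attributes; [^<]* captures text up to the closing tag.
    return re.findall(r"<td[^>]*>([^<]*)</td>", html, flags=re.IGNORECASE)

html = ('<TD style="DISPLAY: none">999999999</TD> '
        '<TD class=CLS1 >Name</TD> <TD class=BLACA>271229</TD> <TD>220</TD>')
print(td_values(html))
# ['999999999', 'Name', '271229', '220']
```

The pattern uses only a capture group and a case-insensitive flag, so it does not rely on lookbehind or inline modifiers.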

preg_match_all problems

I use preg_match_all and need to grab all a href="" tags in my code, but I don't really understand how it works. I have this regular expression: ( /(<([\w]+)[^>]>)(.?)(<\/\2>)/ ). It matches all HTML tags, but I need only the a href tags. I hope I can get help :) ...

Getting link using specific text

Hi, from an HTML body I have to extract the link which has the text "Customer". For example: <a href="google.com?x=1&xy=2" _target="blank" title="cus">Customer </a> I was thinking of using regex. What regex should I use? ...

javascript split string on space or on quotes to array

var str = 'single words "fixed string of words"'; var astr = str.split(" "); // need fix I want the array to be like: single, words, fixed string of words. ...
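One way to get that array, sketched in Python for illustration: instead of splitting on spaces, match either a double-quoted run or a run of non-space characters. The function name `tokenize` is an assumption of this sketch, and it does not handle escaped quotes inside a quoted run.

```python
import re

def tokenize(text):
    """Split on whitespace, but keep double-quoted phrases as one token."""
    # Group 1 captures the inside of a quoted phrase, group 2 a bare word.
    pairs = re.findall(r'"([^"]*)"|(\S+)', text)
    return [quoted if quoted else word for quoted, word in pairs]

print(tokenize('single words "fixed string of words"'))
# ['single', 'words', 'fixed string of words']
```

The same alternation works with a global regex in JavaScript, collecting whichever group matched on each iteration.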

preg_match to find the current directory in a URL

I'm trying to detect the current section of a site that a user is viewing by checking for the final directory in the URL. I'm using PHP and a regex to do it, and I think I'm close, but unfortunately not quite there yet. Here's what I currently have: <?php $url = $_SERVER['REQUEST_URI_PATH'] = preg_replace('/\\?.*/', '', $_SERVER['REQ...

PHP Filter from string

Hi, I have a string in PHP for example $string = "Blabla [word]"; I would like to filter the word between the '[' brackets. The result should be like this $substring = "word"; thanks ...
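A small sketch of the idea in Python (the question itself is about PHP, where `preg_match` with the same pattern would play the same role): capture whatever sits between the first pair of square brackets. The function name `bracket_word` is hypothetical.

```python
import re

def bracket_word(text):
    """Return the text inside the first [...] pair, or None if absent."""
    match = re.search(r"\[([^\]]*)\]", text)
    return match.group(1) if match else None

print(bracket_word("Blabla [word]"))
# word
```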

A regular expression question

Hello, I am in dire need of a regular expression where my alphabet is made up of 0s and 1s. I need a language that accepts all words as long as they have three 0s. E.g.: 000 10001 0001 1000 10000101 ...
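Judging by the examples (10000101 holds five 0s in total but does contain 000), the intended language appears to be "words over {0, 1} that contain three consecutive 0s". Under that assumption, a minimal Python sketch is shown below; the name `has_three_zeros` is made up for the illustration.

```python
import re

# Any binary prefix, then 000, then any binary suffix.
THREE_ZEROS = re.compile(r"[01]*000[01]*")

def has_three_zeros(word):
    """True if word is a binary string containing the substring 000."""
    return THREE_ZEROS.fullmatch(word) is not None

for w in ["000", "10001", "0001", "1000", "10000101", "1001001"]:
    print(w, has_three_zeros(w))
```

If the intent were instead "exactly three 0s anywhere", a pattern such as `1*01*01*01*` would be the one to try.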

Regular expression to remove the original message from a reply mail in Java?

In my forum, users can reply through email. I am handling the mails from their replies. When they reply, the original message gets appended. I want to get only the reply message, not the original message. I have to write regular expressions for Gmail and Hotmail. I have written a regex for Gmail as follows: \n.*wrote:(?s).*--End of Post-- ...

Regular expression (PCRE) for URL matching

The input: we get some plain text as an input string and we have to highlight all URLs there with <a href={url}>{url}</a>. For some time I've used a regex taken from http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/, which I modified several times, but it's built for another issue - to check whether the whole input st...

Match any word which contains a particular string of characters

I apologise in advance for the poor title of this post. I'm trying to match any word which contains a certain string of characters i.e. if I wanted to match any words which contained the string 'press' then I would want the following returned from my search, press expression depression pressure So far I have this /press\w+/ which m...
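A likely reason `/press\w+/` falls short is that it only allows characters after "press" and requires at least one of them. A sketch in Python of a pattern that allows word characters on both sides (the question is truncated, so the asker's exact symptom is an assumption, and `words_containing` is a hypothetical name):

```python
import re

def words_containing(fragment, text):
    """Return every whole word in text that contains fragment."""
    # \w* on both sides lets the fragment sit anywhere inside the word.
    pattern = r"\b\w*" + re.escape(fragment) + r"\w*\b"
    return re.findall(pattern, text)

print(words_containing("press", "press expression depression pressure other"))
# ['press', 'expression', 'depression', 'pressure']
```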

How do I extract HTML content using Regex in PHP

I know, I know... regex is not the best way to extract HTML text. But I need to extract article text from a lot of pages, I can store regexes in the database for each website. I'm not sure how XML parsers would work with multiple websites. You'd need a separate function for each website. In any case, I don't know much about regexes, so ...

html-content-extraction

JavaScript Regex Problem

Hi all; C# regex pattern: Regex rg = new Regex("(?i)(?<=>)[^<]+(?=</TD>)"); JavaScript regex pattern: var pattern = (?i)(?<=>)[^<]+(?=</TD>); var result = str.match(pattern); The C# pattern works, but the JavaScript pattern does not. Please help? ...

Java split is eating my characters.

Hi, I have a string like this String str = "la$le\\$li$lo". I want to split it to get the following output "la","le\\$li","lo". The \$ is a $ escaped so it should be left in the output. But when I do str.split("[^\\\\]\\$") I get "l","le\\$l","lo". From what I get my regex is matching a$ and i$ and removing them. Any idea of how to g...
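The character class `[^\\]` consumes the character before each `$`, which is why it disappears from the output. One common fix is a zero-width negative lookbehind, sketched here in Python; Java's regex engine supports the same `(?<!...)` construct. The function name `split_unescaped_dollar` is hypothetical.

```python
import re

def split_unescaped_dollar(text):
    """Split on '$' only when it is not preceded by a backslash."""
    # (?<!\\) asserts "no backslash just before" without consuming anything.
    return re.split(r"(?<!\\)\$", text)

s = "la$le\\$li$lo"  # actual characters: la$le\$li$lo
print(split_unescaped_dollar(s))
```

This sketch does not treat a doubled backslash before `$` as an escaped backslash; that case would need a more careful pattern.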

string-manipulation

How do I find multiple matches with one regular expression?

I've got the following string: response: id="1" message="whatever" attribute="none" world="hello" The order of the attributes is random. There might be any number of other attributes. Is there a way to get the id, message and world attribute in one regular expression instead of applying the following three one after another? / messa...
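One order-independent option, sketched in Python for illustration: capture every name="value" pair in a single pass and then look up the wanted names, rather than writing one regex per attribute. The function name `parse_attributes` is an assumption of this sketch, and it assumes values never contain a double quote.

```python
import re

def parse_attributes(text):
    """Map each name="value" pair in text to a dictionary entry."""
    return dict(re.findall(r'(\w+)="([^"]*)"', text))

line = 'response: id="1" message="whatever" attribute="none" world="hello"'
attrs = parse_attributes(line)
print(attrs["id"], attrs["message"], attrs["world"])
# 1 whatever hello
```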
