regex

Multi-line regex support in Vim

I notice the standard regex syntax for matching across multiple lines is to use /s, like so: This is\nsome text /This.*text/s This works in Perl for instance but doesn't seem to be supported in Vim. Instead, I have to be much more specific: /This[^\r\n]*[\r\n]*text/ I can't find any reason for why this should be, so I'm thinking I ...

Why do regular expressions in Java and Perl act differently?

My understanding is that Java's implementation of regular expressions is based on Perl's. However, in the following example, if I execute the same regex with the same string, Java and Perl return different results. Here's the Java example: public class RegexTest { public static void main( String args[] ) { String sentence ...

Regular Expression engine that supports raw UTF-8?

Hi, I need a regular expression engine that supports raw UTF-8 - meaning, the UTF-8 string is stored in char * as two chars (or one, or less) - for example, Ab is the array {0x41,0x62}. Does anyone know of a regex engine that can receive that format? I can convert to wchar_t first if needed. ...

How can I split a string which contains only delimiters?

Hi all, I am using the following code: String sample = "::"; String[] splitTime = sample.split(":"); // extra detail omitted System.out.println("Value 1 :"+splitTime[0]); System.out.println("Value 2 :"+splitTime[1]); System.out.println("Value 3 :"+splitTime[2]); I am getting an ArrayIndexOutOfBoundsException. How does String.split()...

Can I Determine the Set of First Chars Matched by Regex Pattern?

I would like to be able to compute the set of all characters which may be matched as the first character in a string by a given instance of java.util.regex.Pattern. More formally, given the DFA equivalent to a certain regular expression, I want the set of all outgoing transitions from the start state. An example: Pattern p = Pattern.c...

Boost Regex - Where are the matching strings stored?

Hi, I'm writing a web spider and want to use the Boost regex library instead of crafting some complicated parsing functions. I took a look at this example: #include <string> #include <map> #include <boost/regex.hpp> // purpose: // takes the contents of a file in the form of a string // and searches for all the C++ class definitions, ...

How to detect exact length in regex

I have two regular expressions that validate the values entered. One allows an alphanumeric value of any length: @"^\s*(?<ALPHA>[A-Z0-9]+)\s*" And the other only allows numerical values: @"^\s*(?<NUM>[0-9]{10})" How can I keep a numerical string of length 11 from being caught by the NUM regex? ...
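One way to think about the length problem above is a negative lookahead that rejects an eleventh digit. The sketch below is in Python rather than C#, so the named group is written (?P<NUM>...) instead of (?<NUM>...); the helper name match_num is hypothetical and only serves the illustration.

```python
import re

# Exactly ten digits after optional leading whitespace; the lookahead
# (?![0-9]) refuses the match if another digit follows immediately.
NUM = re.compile(r"^\s*(?P<NUM>[0-9]{10})(?![0-9])")


def match_num(value):
    """Return the ten-digit group, or None if the input does not qualify (hypothetical helper)."""
    m = NUM.match(value)
    return m.group("NUM") if m else None


print(match_num("  1234567890"))   # ten digits: accepted
print(match_num("12345678901"))    # eleven digits: rejected
```

If trailing text should not be allowed at all, anchoring the pattern with \s*$ instead of the lookahead would be an alternative.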

Regex Expression Help

I'm trying to determine the regular expression to verify that a string has at least one alphabetic character. Any help would be appreciated. Thanks. ...
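A minimal sketch of such a check, written in Python because the question names no language, and assuming "alphabetic" means the ASCII letters A-Z and a-z. The function name has_letter is hypothetical.

```python
import re


def has_letter(text):
    """Return True if text contains at least one ASCII letter (hypothetical helper)."""
    # re.search scans the whole string, so no anchors or quantifiers are needed.
    return re.search(r"[A-Za-z]", text) is not None


print(has_letter("1234a"))
print(has_letter("1234"))
```

A single character class is enough here because the search succeeds as soon as any one letter is found anywhere in the string.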

Using C# regular expressions to remove HTML tags

How do I use a C# regular expression to replace/remove all HTML tags, including the angle brackets? Can someone please help me with the code? ...

Finding substrings in Python

Hi, can you please help me get the substrings between two characters at each occurrence? For example, to get all the substrings between "Q" and "E" at every occurrence in the given example sequence: ex: QUWESEADFQDFSAEDFS and to find the substring with minimum length. ...
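A possible approach, sketched under the assumption that the substrings should not overlap and that each one runs from a "Q" to the next "E" after it. The function name substrings_between is hypothetical.

```python
import re


def substrings_between(text, start="Q", end="E"):
    """Return the text found between each start/end pair, scanning left to right (hypothetical helper)."""
    # The non-greedy (.*?) stops at the first end character after each start.
    pattern = re.compile(re.escape(start) + r"(.*?)" + re.escape(end))
    return pattern.findall(text)


sequence = "QUWESEADFQDFSAEDFS"
found = substrings_between(sequence)
shortest = min(found, key=len) if found else None

print(found)     # ['UW', 'DFSA']
print(shortest)  # UW
```

Guarding min with the emptiness check avoids a ValueError when the sequence contains no match.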

regular expression grabbing X amount of values out in linux bash

Hi guys, I wonder if you could help me. I'm trying to compile a bash script that will display some values from a section of HTML code, and I am stuck on the regular expression part. I have the following piece of code: <li><div friendid="107647498" class="friendHelperBox"><div><a href="http://www.myspace.com/rockyrobsyn" class="msProf...

Ruby regex to remove newlines from specific HTML tag?

Hi everyone, sorry, I'm really bad at regexes. I finally hacked something to work in Ruby. I'd appreciate it if someone could show me the proper way to do this: I basically wanted to remove all \n when it appears within ul tags. while body =~ /<ul>.*(\n+).*<\/ul>/m body =~ /<ul>(.+)<\/ul>/m body.gsub!( /<ul>(.+)<\/ul>/m, ...

Parse router configuration output.

My router supports telnet sessions for configuration. I want to make an application in C# that parses the console output into something useful. Edit: the lines are separated with "\n\r" and there are no \t characters used, everything is spaced out. Example: bridge configuration for "bridge" : OBC : dest : Internal ...

How can I match /foo but not /foo/ in Perl's Catalyst?

I want to match /foo, but not /foo/ (where foo can be any string) or / I tried a lot of things along these lines: sub match :Path :Regex('^[a-z]+$') sub match :Regex('^[a-z]+$') sub match :Path :Args(1) But I cannot achieve what I need. I don't believe the problem is with my regex, but because I want to handle a path without an argu...

AS3 RegExp to match words with boundary type characters in them

I want to match a list of words, which is easy enough when those words are truly words. For example, /\b (pop|push) \b/gsx when run against the string pop gave the door a push but it popped back will match the words pop and push but not popped. I need similar functionality for words that contain characters that would normally q...

MySQL query to handle encoded characters using Regular Expressions

My problem involves searching a MySQL table for a list of matching city names given an initial search string, with the purpose of handling special characters (such as ö) that are encoded with an HTML entity (&ouml;). Example: There is a table called 'cities'. The column for the city name is called 'name'. There are two cities Hamberg (id...

Explanation and solution for JavaCC's warning "Regular expression choice : FOO can never be matched as : BAR"?

I am teaching myself to use JavaCC in a hobby project, and have a simple grammar to write a parser for. Part of the parser includes the following: TOKEN : { < DIGIT : (["0"-"9"]) > } TOKEN : { < INTEGER : (<DIGIT>)+ > } TOKEN : { < INTEGER_PAIR : (<INTEGER>){2} > } TOKEN : { < FLOAT : (<NEGATE>)? <INTEGER> | (<NEGATE>)? <INTEGER> "." <...

Find shortest substring

I have written code to find substrings in a string. It prints all substrings, but I want only substrings whose length ranges from 2 to 6, and I want to print the one of minimum length. Please help me. Program: import re p=re.compile('S(.+?)N') s='ASDFANSAAAAAFGNDASMPRKYN' s1=p.findall(s) print s1 output: ['DFA', 'AAAAAFG', 'MPRKY'] D...
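One way to get there, sketched in Python 3 and assuming "ranges from length 2 to 6" refers to the length of the captured text between S and N. The pattern and the input string come from the question; the names in_range and shortest are introduced here for illustration.

```python
import re

pattern = re.compile(r"S(.+?)N")
s = "ASDFANSAAAAAFGNDASMPRKYN"

matches = pattern.findall(s)  # ['DFA', 'AAAAAFG', 'MPRKY'], as in the question

# Keep only captures whose length is between 2 and 6 inclusive.
in_range = [m for m in matches if 2 <= len(m) <= 6]

# min() with key=len returns the first of the shortest candidates.
shortest = min(in_range, key=len) if in_range else None

print(in_range)  # ['DFA', 'MPRKY']
print(shortest)  # DFA
```

Filtering after findall keeps the original pattern untouched, so the set of candidate matches stays exactly what the question's program already prints.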

Matching text in quotes (newbie)

Hi, I'm getting totally lost in shell programming, mainly because every site I use offers a different tool for pattern matching. So my question is: what tool should I use to do simple pattern matching in a piped stream? Context: I have a named.conf file, and I need all zone names in a simple file for further processing. So I do ~$ cat named.loca...

Need help writing regular expression (html parsing)

Hi, I'm trying to write a regular expression for my HTML parser. I want to match an HTML tag with a given attribute (e.g. <div> with class="tab news selected") that contains one or more <a href> tags. The regexp should match the entire tag (from <div> to </div>). I always seem to get "memory exhausted" errors - my program probably takes ev...