There is an input file with this content:
XX00002200000
XX00003300000

regexp:

(.{6}22.{5}\W)(.{6}33.{5})

Tried in The Regex Coach (an app for testing regexps); the strings match OK.

Java:

        pattern = Pattern.compile(patternString);
        inputStream = resource.getInputStream();

        scanner = new Scanner(inputStream, charsetName);
        scanner.useDelimiter("\r\n");

patternString is the regexp mentioned above, added as a bean property from .xml.

It fails from Java.

A: 

Simple solution: ".{6}22.{5}\\s+.{6}33.{5}". Note that \s+ matches one or more consecutive whitespace characters, so it covers \r\n as well as a lone \n.

Here's an example:

 public static void main(String[] argv) {
  // Two records separated by \r\n, with extra text around them.
  String input = "yXX00002200000\r\nXX00003300000\nshort";
  String regex = ".{6}22.{5}\\s+.{6}33.{5}";
  Pattern pattern = Pattern.compile(regex);
  Matcher m = pattern.matcher(input);

  // Print every match found in the input.
  while (m.find()) {
   System.out.println(m.group());
  }
 }

With output:

XX00002200000
XX00003300000
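
A note on the doubled backslash, since it is easy to trip over: "\\s" is only how a Java string literal spells the two characters \s. A value that does not pass through the Java compiler (a bean property in .xml, or a tool like The Regex Coach) presumably needs the single-backslash form. The sketch below is only an illustration; the class and method names are made up for it, and the second regex simulates what the engine would see if a doubled backslash arrived from outside Java source.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: does the given regex find the two-line record?
    static boolean matchesTwoLines(String regex) {
        String input = "XX00002200000\r\nXX00003300000";
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Java source form: the compiler turns "\\s" into the two characters \s.
        String fromJavaSource = ".{6}22.{5}\\s+.{6}33.{5}";
        // Simulates a value that still contains two backslashes at runtime,
        // e.g. one copied with "\\s" into a place that does no Java escaping.
        String doubledAtRuntime = ".{6}22.{5}\\\\s+.{6}33.{5}";

        System.out.println(fromJavaSource);                    // what the regex engine sees
        System.out.println(matchesTwoLines(fromJavaSource));   // whitespace class: matches
        System.out.println(matchesTwoLines(doubledAtRuntime)); // literal backslash + s: no match
    }
}
```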

To play around with Java Regex you can use: Regular Expression Editor (free online editor)

Edit: I think you are changing the input while reading the data. Try:

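public static String readFile(String filename) throws FileNotFoundException {
    Scanner sc = new Scanner(new File(filename));

    StringBuilder sb = new StringBuilder();
    // nextLine() strips the line terminator, so put one back;
    // otherwise \s+ has nothing to match between the records.
    while (sc.hasNextLine())
        sb.append(sc.nextLine()).append('\n');
    sc.close();

    return sb.toString();
}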

Or

static String readFile(String path) throws IOException {
    // try-with-resources closes the stream and the channel even on failure.
    try (FileInputStream stream = new FileInputStream(new File(path));
         FileChannel channel = stream.getChannel()) {
        // Map the whole file read-only and decode it with the default charset.
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0,
                channel.size());
        return Charset.defaultCharset().decode(buffer).toString();
    }
}

With imports like:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Margus
Hi Margus. It's interesting: I tried the regexp you proposed, .{6}22.{5}\\s+.{6}33.{5}, in "The Regex Coach" app. It works OK (with one backslash removed). From Java it doesn't work, very strange.
sergionni
I mean from my .xml
sergionni
Should I convert the InputStream to a FileInputStream somehow in order to call the getChannel() method?
sergionni
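
For what it's worth, no FileInputStream or getChannel() should be needed just to get the text: any InputStream can be drained into a String. A minimal sketch follows, assuming the content fits in memory; readAll and the class name are made-up names for illustration, and charsetName stands for whatever charset the question already passes to Scanner.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamText {
    // Hypothetical helper: reads the stream to the end and decodes it,
    // keeping every line terminator exactly as it was in the input.
    static String readAll(InputStream in, String charsetName) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toString(charsetName);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for resource.getInputStream().
        byte[] data = "XX00002200000\r\nXX00003300000".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(data), "UTF-8");
        System.out.println(text.length()); // 13 + 2 + 13 characters
    }
}
```
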
A: 

Try this change in delimiter:

 scanner.useDelimiter("\\s+");

Also, why don't you use a more general regex like this:

 ".{6}[0-9]{2}.{5}"

The regex you mentioned above is for two lines. Since you set the delimiter to a newline, you should use a regex suitable for a single line.
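
To illustrate the point, here is a small sketch (class and method names are made up for the demo). With "\r\n" as the delimiter, each token is a single line, so a two-line regex can never match a token. It also shows a second thing worth checking: a single \W consumes only one character, and . does not match line terminators by default, so the original regex cannot bridge a two-character \r\n even on the full text, while \W+ can.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class TokenDemo {
    // Splits the text the same way the Scanner in the question does.
    static List<String> tokens(String text, String delimiter) {
        List<String> result = new ArrayList<>();
        Scanner scanner = new Scanner(text);
        scanner.useDelimiter(delimiter);
        while (scanner.hasNext()) {
            result.add(scanner.next());
        }
        scanner.close();
        return result;
    }

    // True if the regex is found inside at least one "\r\n"-delimited token.
    static boolean anyTokenMatches(String regex, String text) {
        Pattern p = Pattern.compile(regex);
        for (String token : tokens(text, "\r\n")) {
            if (p.matcher(token).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "XX00002200000\r\nXX00003300000";
        String original = "(.{6}22.{5}\\W)(.{6}33.{5})";
        String relaxed = "(.{6}22.{5}\\W+)(.{6}33.{5})";

        System.out.println(tokens(text, "\r\n").size());                // one token per line
        System.out.println(anyTokenMatches(original, text));            // two-line regex vs one-line tokens
        System.out.println(anyTokenMatches(".{6}[0-9]{2}.{5}", text));  // single-line regex
        System.out.println(Pattern.compile(original).matcher(text).find()); // \W vs \r\n on full text
        System.out.println(Pattern.compile(relaxed).matcher(text).find());  // \W+ vs \r\n on full text
    }
}
```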

Emil
Thank you for the answer. This regexp is needed for extracting a definite string buffer from a message queue; the buffer starts with the string containing 22 and ends with the string containing 33. And actually, between these strings there will be strings of similar structure, also delimited with CR or LF.
sergionni
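
If the goal is a whole block from the 22 line to the 33 line with other records in between, something along these lines might work. It is only a sketch under assumptions not stated in the thread: every record is exactly 13 characters with the type code in columns 7 and 8, and the whole buffer is already in one String. The class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BlockDemo {
    // Start record (type 22), then as few 13-character records as possible,
    // then the end record (type 33). \s+ absorbs CR, LF or CRLF between them.
    static final Pattern BLOCK =
            Pattern.compile(".{6}22.{5}(?:\\s+.{13})*?\\s+.{6}33.{5}");

    // Returns the first complete block, or null if there is none.
    static String firstBlock(String text) {
        Matcher m = BLOCK.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String queue = "XX00001100000\r\n"
                + "XX00002200000\r\n"
                + "XX00004400000\n"
                + "XX00003300000\r\n"
                + "XX00005500000";
        System.out.println(firstBlock(queue));
    }
}
```
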
I didn't quite understand. Did my answer help you? If not, please explain the above problem in detail by editing your question.
Emil
\\s didn't help
sergionni
A: 

Pardon my ignorance, but I am still not sure what exactly you are trying to search for. In case you are trying to search for the string (with new lines)

XX00002200000
XX00003300000

then why are you reading it delimited by new lines?

To read the above string as it is, the following code works:

Pattern p = Pattern.compile(".{6}22.{5}\\W+.{6}33.{5}");

// try-with-resources closes the stream when done.
try (FileInputStream in = new FileInputStream("C:\\new.txt")) {
    byte[] f = new byte[100];
    // read() returns the number of bytes actually read, or -1 at end of file.
    int n = in.read(f);
    // Decode only the bytes that were read, not the unused tail of the buffer.
    String s = new String(f, 0, Math.max(n, 0));
    Matcher m = p.matcher(s);
    if (m.find())
        System.out.println(m.group());
} catch (IOException e) {
    e.printStackTrace();
}

NB: here new.txt file contains the string

XX00002200000
XX00003300000
Gaurav Saxena
How do I use Scanner with an InputStream? In the case of scanner = new Scanner(inputStream, charsetName), it doesn't support the read method.
sergionni
I am not sure why it is so necessary for you to use Scanner to read from a file, but if that is so, then it's best to use a delimiter which will not be found in the file, e.g. scanner.useDelimiter("\\?"); It will prompt the scanner to return the whole String from the file.
Gaurav Saxena
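
A variation on the same idea, in case it helps: instead of a character that merely should not occur in the file, the delimiter can be the regex anchor \A (beginning of input), which can only match at the very start, so the first token is the whole content. This is a sketch, not the asker's actual code; readWhole and the class name are made up, and the in-memory stream stands in for resource.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeInputDemo {
    // Hypothetical helper: returns the entire stream content as one token.
    static String readWhole(InputStream in, String charsetName) {
        Scanner scanner = new Scanner(in, charsetName);
        try {
            scanner.useDelimiter("\\A"); // \A matches only at the beginning of input
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            scanner.close();
        }
    }

    public static void main(String[] args) {
        byte[] data = "XX00002200000\r\nXX00003300000".getBytes(StandardCharsets.UTF_8);
        String text = readWhole(new ByteArrayInputStream(data), "UTF-8");

        // The line break is still in the text, so the two-line regex can match.
        Matcher m = Pattern.compile(".{6}22.{5}\\W+.{6}33.{5}").matcher(text);
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}
```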