Hi,

I'm a complete noob to regex and I need help with splitting a string. I am inputting the following data:

665  11% R     1    908K    388K  fg root     top
 61   1% S    42 152404K  29716K  fg system   system_server
 38   0% S     1    840K    340K  fg root     /system/bin/qemud
114   0% S    16 120160K  19156K  fg radio    com.android.phone

which is nothing but your regular top output. What I intend to do is select only entries like

665 11% R 1 fg root top

The code I use to do this is:

while ((inputLine = in.readLine()) != null) 
{
  String[] segs= inputLine.split("[ ]+");
  str[i] = segs[0]+" "+segs[1]+" "+segs[2]+" "+
           segs[3]+" "+segs[6]+" "+segs[7]+" "+segs[8];
  Log.v("TOP Output", str[i]);
  i++; j++;
}

But the problem is that I get the following in logcat:

java.lang.ArrayIndexOutOfBoundsException

Where am I going wrong, and what could I do differently to prevent this? Thanks for helping.

EDIT: After reading the comments I realize I have a couple of empty lines in my output. In that case, how am I supposed to ignore those lines? I know I am supposed to match a case, but I am not sure about the expression or syntax!

A: 

Forget about it... I'm too dump to count to 9. :-(

yas4891
I am sorry, I don't get your point!
Shouvik
@Shouvik Before I edited my post I claimed that there are only 8 elements in the array, but this was due to me not being able to count to 9 :(
yas4891
Oh lol, I assumed dump actually had some dump significance.. I guess I am too "dump" too... :P
Shouvik
Oh, no, that is to be attributed to the fact that I'm a non-native English speaker and too dumb to even write 'dumb' correctly :-(
yas4891
A: 

Be careful with how str is instantiated: it's an array, so how big is it? You should use a List or another growable collection, because you don't know how many lines your input has.
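
As a rough sketch of that idea, the loop from the question could collect lines into an ArrayList instead of a fixed-size array. The class and method names here (TopLineCollector, readLines) are made up for illustration, and wrapping the IOException is just one possible choice:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original question.
class TopLineCollector {
    // Reads every line into a list that grows as needed, so the number
    // of lines does not have to be known before reading starts.
    static List<String> readLines(BufferedReader in) {
        List<String> lines = new ArrayList<>();
        try {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                lines.add(inputLine);
            }
        } catch (IOException e) {
            // Rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "665  11% R     1    908K    388K  fg root     top\n"
                      + "\n"
                      + " 61   1% S    42 152404K  29716K  fg system   system_server\n";
        List<String> lines = readLines(new BufferedReader(new StringReader(sample)));
        System.out.println(lines.size() + " lines read");
    }
}
```

This only removes the risk of overflowing str; the blank line is still in the list and still needs to be handled before indexing into the split result.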

Csaryus
I suppose str is an array of String.
Csaryus
You are right, str is an array of strings.
Shouvik
+2  A: 

Use the following regexp, and check the array's length for every line! Also consider using a StringBuilder or StringBuffer instead of concatenating.

 String[] s = inputLine.split("[\\s\\t]+");
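
Here is a minimal sketch of how the length check and the StringBuilder might fit together. It assumes every valid row has at least 9 columns and uses the same indices as the question. The names TopLineSplitter and selectColumns are hypothetical. A blank line splits into a single empty element, which is why indexing segs[1] throws, so blank and short lines are rejected up front. The trim() call is my addition: rows with leading spaces otherwise produce an empty first element that shifts every index by one.

```java
// Hypothetical helper, not part of the original answer.
class TopLineSplitter {
    // Same column indices the question concatenates.
    private static final int[] WANTED = {0, 1, 2, 3, 6, 7, 8};

    // Returns the selected columns joined by single spaces, or null
    // when the line is blank or has fewer than 9 columns.
    static String selectColumns(String inputLine) {
        String trimmed = inputLine.trim(); // drop leading/trailing whitespace
        if (trimmed.isEmpty()) {
            return null;                   // blank line: nothing to select
        }
        String[] segs = trimmed.split("\\s+");
        if (segs.length < 9) {
            return null;                   // malformed line: skip it
        }
        StringBuilder sb = new StringBuilder();
        for (int idx : WANTED) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(segs[idx]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectColumns(" 61   1% S    42 152404K  29716K  fg system   system_server"));
        System.out.println(selectColumns("")); // prints "null"
    }
}
```

In the reading loop, a null result simply means "skip this line".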
naikus
If I have an empty line, how will this help me reject this line in my string entry?
Shouvik
FYI, you don't need to match `\t` explicitly; `\s` has that covered.
Alan Moore
@Alan Moore I thought the behaviour was that too. But when I tested it against the string "665 11% R 1 908K 388K fg root top", it returned 25 strings. Later I realized that I had forgotten to add the "+" sign :) Thanks!
naikus
+2  A: 

You don't need the character class (square brackets). Space is a regular character in regex, so:

String[] segs = inputLine.split(" +");

Other than that, assuming array indices exist without range checking is bad style, and an ArrayIndexOutOfBoundsException is just what you've asked for.

Better do it explicitly:

String re = "^\\s*(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s*$";
Pattern p = Pattern.compile(re, Pattern.MULTILINE);
Matcher m = p.matcher(yourInputString);

while (m.find())
{
   // do stuff with m.group(1) through m.group(9)
}

This way it is guaranteed that every line you match fulfills your expectations and every matcher group contains what you expect, too.
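
To make the "do stuff" part concrete, here is one possible sketch. It is a variation rather than the exact code above: it applies the same nine-group pattern to one line at a time with matches(), so that \s can never run across a line break, and it silently skips blank or malformed lines. The names TopLineMatcher and select are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class TopLineMatcher {
    // Nine whitespace-separated fields with optional surrounding whitespace.
    private static final Pattern ROW = Pattern.compile(
        "\\s*(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)"
        + "\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s*");

    // Returns one "col1 col2 col3 col4 col7 col8 col9" string per valid row.
    static List<String> select(String topOutput) {
        List<String> result = new ArrayList<>();
        for (String line : topOutput.split("\n")) {
            Matcher m = ROW.matcher(line);
            if (!m.matches()) {
                continue; // blank line or wrong number of fields
            }
            result.add(m.group(1) + " " + m.group(2) + " " + m.group(3) + " "
                     + m.group(4) + " " + m.group(7) + " " + m.group(8) + " "
                     + m.group(9));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "665  11% R     1    908K    388K  fg root     top\n"
                      + "\n"
                      + " 61   1% S    42 152404K  29716K  fg system   system_server\n";
        for (String row : select(sample)) {
            System.out.println(row);
        }
    }
}
```

Because matches() must consume the whole line, a row with more or fewer than nine fields is rejected instead of being half-parsed.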

Disclaimer: I'm not especially proud of that regex. It's quite an ugly one, actually, but it illustrates the point that explicit is more reliable and predictable than implicit. And it has the potential to be improved into a version that matches the desired parts even more accurately than a string split ever could.

Tomalak
I am sorry, I don't exactly understand what I am doing in your find function. I am pretty new to Java, and am just picking it up as I go along. Also, does this take care of blank lines? I would like to eliminate them from my result...
Shouvik
Also I got an invalid sequence error for String re in eclipse. What am I doing wrong!?
Shouvik
Whoops. These backslashes must be escaped. My bad, see corrected answer. Look at http://www.regular-expressions.info/java.html for more reading on Java Regexes.
Tomalak
Yeah thanks, I got that, some more errors were there, I will put the code in the question. But selected your answer! thanks, shouvik...
Shouvik
+2  A: 

How consistent is this output? Is there always a value in every column? If so, try this:

line = line.replaceFirst("(?:\\s+\\d+[KM]?){3}", "");

You don't have to worry about blank lines with this approach, because the regex doesn't match them.
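
A small sketch of this approach follows, with one deliberate change that should be read as my assumption rather than the original suggestion. As far as I can tell, a group repeated three times with an optional K/M suffix also swallows the fourth column (the bare number before the two sizes), which the question wants to keep. The variant below therefore requires the suffix and repeats the group twice. It also collapses the leftover runs of whitespace so the result looks like the desired output. TopColumnStripper and stripSizes are hypothetical names.

```java
// Hypothetical helper, not part of the original answer.
class TopColumnStripper {
    // Removes the first two consecutive size columns (digits followed by
    // K or M), then normalizes whitespace. Lines without such columns,
    // including blank lines, pass through with only whitespace cleanup.
    static String stripSizes(String line) {
        String stripped = line.replaceFirst("(?:\\s+\\d+[KM]){2}", "");
        return stripped.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(stripSizes("665  11% R     1    908K    388K  fg root     top"));
    }
}
```

Blank lines come back as empty strings, so they still have to be filtered out (for example with isEmpty()) if they should not appear in the result.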

Alan Moore