I have to parse a tab-delimited line. I parse it using the split function, and it works in most situations. The problem occurs when a field is missing: instead of getting null in that field, I get the next value. I store the parsed values in a string array.

String[] columnDetail = new String[11];
columnDetail = column.split("\t");

So what is the solution to this problem? If possible, I'd like to store the parsed strings in a string array so that I can easily maintain my code.

+1  A: 
    String str = "1\t2\t3\t4\t\t5";
    for(String s : str.split("\\t")){
        System.out.println(s);
    }

The code above gives me an empty string at index 4. You can change it to null yourself. This approach is simple, elegant, and efficient.

    String str = "1\t2\t3\t4\t\t\t\t\t\t5";
    for(String s : str.split("\\t")){
        System.out.println(s.equals("") ? null : s);
    }

After looking at your question, I gather you are under the impression that the split() method takes a plain string. That's not the case: it takes a regex string. In a regex, a tab is written \t, which in Java source becomes "\\t", yes, with the backslash doubled.
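One more thing worth checking, as a minimal sketch rather than a definitive fix: split keeps empty fields in the middle of a line but, by default, drops trailing empty strings. Passing a negative limit keeps them. The class and method names below (TabSplitDemo, splitKeepingEmpty) are made up for illustration.

```java
import java.util.Arrays;

class TabSplitDemo {
    // Hypothetical helper: splits a tab-delimited line and keeps every empty field.
    static String[] splitKeepingEmpty(String line) {
        // A negative limit tells split not to discard trailing empty strings.
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "\tb\t\td\t"; // first, third and last fields are empty

        // Default behaviour: the trailing empty field is dropped.
        System.out.println(Arrays.toString(line.split("\t")));        // [, b, , d]

        // With limit -1 all five fields survive.
        System.out.println(Arrays.toString(splitKeepingEmpty(line))); // [, b, , d, ]
    }
}
```

If you want null instead of an empty string, you can map the empty entries afterwards, as in the loop above.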

Adeel Ansari
@Vinegar: A tab is a tab, just ONE character. How many spaces it occupies is subjective. Could be 2, could be 4 etc.
o.k.w
Yes, modified the post.
Adeel Ansari
Now the OP will accept your answer and in two hours, he will ask why \n doesn't work.
Filip Ekberg
I don't think so. After reading my post, he would soon realize that the regex he used was the problem. He will use `"\\n"` in the next case, and he is good to go.
Adeel Ansari
I wouldn't be so surprised if he actually asked the same question again; he has before.
Filip Ekberg
I believe he just needs to understand the basics of regex. Isn't that true?
Adeel Ansari
Yep! I guess so!
Filip Ekberg
When the first field is null, all the values are shifted by one. It does not cause a problem when a middle field is missing. So why does the problem occur when the first field is missing?
lakhaman
Problem occurs... where exactly?
Adeel Ansari
BTW, are you not reading the answers?
Adeel Ansari
*Grin* told you ;)
Filip Ekberg
Actually, I was able to get what you mean, after a little while. :)
Adeel Ansari
+4  A: 

String.split uses regular expressions. Also, you don't need to allocate an extra array for your split.

The split method gives you an array. The problem is that you try to predefine how many tab-separated fields you have, but how would you really know that? Try using the Scanner or StringTokenizer and learn how splitting strings works.

Let me explain what \t means to the regex processor and why you need \\\\ to match a single \.

Okay, so when you use split, it actually takes a regex (regular expression), and in a regular expression \ is the escape character. So if you write \t, the regex processor does not read it as the two characters \ and t; it reads it as "the escaped t", which stands for a tab. Notice the difference? Using \ means to escape something, so a \ in a regex means something totally different from a literal backslash. Java string literals use \ as an escape too, so the compiler processes the literal before the regex processor ever sees it.

So this is why you need to use this Solution:

\\t

To tell the regex processor to look for \t. Okay, so why would you need two of 'em? Well, in the Java string literal the first \ escapes the second, which means the regex processor sees \t when it processes the text!
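The two layers of escaping can be checked with a small sketch. The class and method names (EscapeLayersDemo, splitOnTab) are invented for this illustration.

```java
import java.util.Arrays;

class EscapeLayersDemo {
    // Hypothetical helper: splits on the regex \t, written "\\t" in Java source.
    static String[] splitOnTab(String line) {
        return line.split("\\t");
    }

    public static void main(String[] args) {
        String regex = "\\t"; // in source: backslash, backslash, t

        // After the compiler handles the string escape, two characters remain: \ and t.
        System.out.println(regex.length()); // 2

        // The regex processor then reads \t as "match a tab character".
        System.out.println(Arrays.toString(splitOnTab("a\tb\tc"))); // [a, b, c]
    }
}
```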

Now let's say that you want to split on \

Well, then you might write \\, but see, that doesn't work! After Java processes the string, the regex processor is left with a single \, which tries to escape the next character. That is why you want the regex itself to be \\, and therefore you need to write \\\\.
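Here is a short sketch of splitting on a backslash, with Pattern.quote shown as an alternative that avoids counting backslashes. The class and method names (BackslashSplitDemo, splitOnBackslash, splitOnLiteral) and the sample path are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class BackslashSplitDemo {
    // "\\\\" in source is the two-character regex \\ which matches one literal backslash.
    static String[] splitOnBackslash(String text) {
        return text.split("\\\\");
    }

    // Pattern.quote makes the regex processor treat the delimiter as plain text.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\file.txt"; // runtime value: C:\temp\file.txt

        System.out.println(Arrays.toString(splitOnBackslash(path)));     // [C:, temp, file.txt]
        System.out.println(Arrays.toString(splitOnLiteral(path, "\\"))); // [C:, temp, file.txt]
    }
}
```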

I really hope the examples above help you understand why your solution doesn't work and how to conquer other ones!

Now, I've given you this answer before, maybe you should start looking at them now.

OTHER METHODS

StringTokenizer

You should look into the StringTokenizer; it's a very handy tool for this type of work.

Example

 StringTokenizer st = new StringTokenizer("this is a test");
 while (st.hasMoreTokens()) {
     System.out.println(st.nextToken());
 }

This will output

 this
 is
 a
 test

Use the second constructor of StringTokenizer to set the delimiter:

StringTokenizer(String str, String delim)
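As a rough sketch of that constructor with a tab delimiter: note that StringTokenizer treats consecutive delimiters as one, so empty fields are skipped and positions are lost. The class and method names (TokenizerDemo, tokens) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Hypothetical helper: collects all tab-separated tokens from a line.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // The third field is empty, and the tokenizer silently skips it.
        System.out.println(tokens("1\t2\t\t4")); // [1, 2, 4]
    }
}
```

If you need to know which field was empty, this behaviour is a drawback.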

Scanner

You could also use a Scanner, as one of the commenters said. It could look somewhat like this:

Example

 String input = "1 fish 2 fish red fish blue fish";

 Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");

 System.out.println(s.nextInt());
 System.out.println(s.nextInt());
 System.out.println(s.next());
 System.out.println(s.next());

 s.close();

The output would be

 1
 2
 red
 blue

Meaning that it will cut out the word "fish" and give you the rest, using "fish" as the delimiter.

examples taken from the Java API
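For the tab-delimited case, a Scanner sketch (my own, not from the Java API docs) might look like the following. With a single-tab delimiter, an empty field between two tabs comes back as an empty string. The class and method names (ScannerTabDemo, fields) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTabDemo {
    // Hypothetical helper: reads tab-separated fields from a single line.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        // The delimiter is a regex; a single tab matches one delimiter at a time.
        try (Scanner s = new Scanner(line).useDelimiter("\t")) {
            while (s.hasNext()) {
                out.add(s.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The empty third field is returned as an empty string.
        System.out.println(fields("1\t2\t\t4")); // [1, 2, , 4]
    }
}
```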

Filip Ekberg
@Filip: nice one!
o.k.w
Regular expressions shouldn't bite you when splitting at tab, though.
Joey
Probably not, but if the OP would just try to read the answers and understand them, he would already know the answer to this, because it is similar to what he posted yesterday. I would say that if he had used my method yesterday and today, he wouldn't have gotten this problem.
Filip Ekberg
I've added some more to clarify why it doesn't work to split by \t. hth.
Filip Ekberg
@Filip I have to parse an XML file which has a common header field and then multiple data fields, so if I use StringTokenizer I can't determine which field is null. Yesterday I raised the problem for a text file, while today it is for an XML file. That's why I must use the split function.
lakhaman
You are looking at the problem totally wrong, or you are asking the wrong type of question. I would suggest that instead of involving parsers and such to read the XML, you just start simple. Please provide us with an example, and if there is no way for you to use the information I provided (which I find doubtful), well, then there's not much I can do for you.
Filip Ekberg
Parsing XML with regular expressions is always wrong.
Geoffrey Chetwood