ansaurus

Question

How to remove whitespace while scanning text in Java

Answer 1

A:

Use java.util.Scanner.

EJP 2010-02-21 05:43:27

I'm getting the same number of whitespace tokens (roughly 14k for my test input) with Scanner as with String.split.

2010-02-21 06:10:31

You shouldn't be getting any 'whitespace tokens'. Whitespace isn't a token; it is the stuff in between tokens. java.util.Scanner lets you define what your tokens are and what your delimiters are, i.e. what your whitespace is. Don't waste its time and yours by making it return whitespace to you.

EJP 2010-02-21 22:12:32
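
The Scanner approach discussed in this answer can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the class name ScannerTokens and the helper tokenize are hypothetical, and it assumes the default Scanner delimiter (runs of whitespace) is what the question needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerTokens {
    // Hypothetical helper: collects every token Scanner yields from the text.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // With the default delimiter, Scanner treats runs of whitespace
        // (spaces, tabs, newlines) as separators, so next() never returns
        // whitespace or an empty string.
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Leading, trailing, and repeated whitespace are all skipped.
        System.out.println(tokenize("  the quick    brown   fox \n jumps "));
    }
}
```

If the input uses other separators, Scanner.useDelimiter accepts a regular expression describing what should count as the stuff between tokens.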

Answer 2

+1 A:

I'm not sure what you are talking about. For example,

String[] parts = "the quick    brown   fox".split("\\s+");

correctly tokenizes the string, with no leading or trailing whitespace on any token and no empty tokens. If the input string may have leading or trailing whitespace, calling String.trim() on it before splitting removes the possibility of empty tokens.

EDIT: I surmise from your other comment that you are reading the input a line at a time and then tokenizing the lines. You probably need to trim each line before tokenizing.

Stephen C 2010-02-21 06:12:58
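
The trim-then-split suggestion in this answer might look like the sketch below. The class name LineTokens and the helper tokenizeLine are hypothetical, and the guard for blank lines is an added assumption rather than something the answer states: it is there because splitting an empty string returns a single empty token.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LineTokens {
    // Hypothetical helper: tokenizes one line of input on runs of whitespace.
    static List<String> tokenizeLine(String line) {
        // Without trim(), a line with leading whitespace such as " a b"
        // splits into ["", "a", "b"], i.e. with an empty first token.
        String trimmed = line.trim();
        // Assumption: a blank line should yield no tokens at all.
        // "".split("\\s+") would otherwise return one empty string.
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokenizeLine("   the quick    brown   fox  "));
    }
}
```

Calling tokenizeLine on each line as it is read keeps empty tokens out of the result, whichever way the lines are obtained.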
