views:

187

answers:

5

Character.isLetter(c) returns true if the character is a letter. But is there a quick way to check whether a String contains only ASCII characters?

+3  A: 

Iterate through the string and use charAt() to get each char. Then treat it as an int and check whether its Unicode value (Unicode is a superset of ASCII) falls within the range you want to accept.

Break at the first character that fails the check.
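A minimal sketch of that loop, assuming "the range you accept" means 7-bit ASCII (values 0 to 127). The class name AsciiScan and the method name isAscii are made up for illustration:

```java
final class AsciiScan {

    // Returns true only if every char in s lies in the 7-bit ASCII range (0..127).
    // An empty string is treated as ASCII because it contains no offending char.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false; // stop at the first character outside the range
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAscii("Real"));      // true
        System.out.println(isAscii("R\u00e9al")); // false, U+00E9 is above 127
    }
}
```

To accept a different range (for example printable ASCII only), change the comparison inside the loop.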

Thorbjørn Ravn Andersen
+10  A: 

Using Guava you could just write:

boolean isAscii = CharMatcher.ascii().matchesAllOf(someString); // CharMatcher.ASCII in Guava versions before 19.0
ColinD
Ah, the wonders of abstraction layers :)
Thorbjørn Ravn Andersen
Nice one Colin.
org.life.java
-1 for suggesting 3rd party library for functionality available in the standard API as well (RealHowTo's answer)
jarnbjo
+1 Although it's good if you don't need another third-party library, Colin's answer is much shorter and much more readable. Suggesting third-party libraries is perfectly OK and should not be punished with a negative vote.
Jesper
I should also point out that CharMatchers are really incredibly powerful and can do waaaay more than this. In addition there are many more predefined CharMatchers besides ASCII, and great factory methods for creating custom ones.
ColinD
+10  A: 

You can do it with java.nio.charset.Charset.

import java.nio.charset.StandardCharsets;

public class StringUtils {

  public static boolean isPureAscii(String v) {
    // CharsetEncoder is not thread-safe, so create one per call
    // instead of sharing a static instance.
    // Use StandardCharsets.ISO_8859_1 to test for ISO Latin 1 instead.
    return StandardCharsets.US_ASCII.newEncoder().canEncode(v);
  }

  public static void main(String[] args) {

     String test = "Réal";
     System.out.println(test + " isPureAscii() : " + StringUtils.isPureAscii(test));
     test = "Real";
     System.out.println(test + " isPureAscii() : " + StringUtils.isPureAscii(test));

     /*
      * output :
      *   Réal isPureAscii() : false
      *   Real isPureAscii() : true
      */
  }
}

Detect non-ASCII character in a String

RealHowTo
+5  A: 

Here is another way not depending on a library but using a regex.

You can use this single line:

text.matches("\\A\\p{ASCII}*\\z")

Whole example program:

public class Main {
    public static void main(String[] args) {
        char nonAscii = 0x00FF;
        String asciiText = "Hello";
        String nonAsciiText = "Buy: " + nonAscii;
        System.out.println(asciiText.matches("\\A\\p{ASCII}*\\z"));
        System.out.println(nonAsciiText.matches("\\A\\p{ASCII}*\\z"));
    }
}
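
If the check runs often, a precompiled Pattern avoids recompiling the regex on every call, which String.matches() does each time. The following is only a sketch of that variation; the names AsciiRegex, ASCII_ONLY and isAscii are illustrative and not part of the answer above:

```java
import java.util.regex.Pattern;

final class AsciiRegex {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern ASCII_ONLY = Pattern.compile("\\p{ASCII}*");

    static boolean isAscii(String text) {
        // Matcher.matches() requires the whole input to match,
        // so the explicit \A and \z anchors can be left out here.
        return ASCII_ONLY.matcher(text).matches();
    }

    public static void main(String[] args) {
        char nonAscii = 0x00FF;
        System.out.println(isAscii("Hello"));           // true
        System.out.println(isAscii("Buy: " + nonAscii)); // false
    }
}
```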
Arne
+3  A: 

Iterate through the string and make sure all the characters have a value less than 128.

Java Strings are conceptually encoded as UTF-16. In UTF-16, each ASCII character is encoded as a single code unit with a value from 0 to 127, and the encoding of any non-ASCII character (which may consist of more than one Java char) is guaranteed not to contain a code unit in the range 0 to 127. Checking each char individually is therefore sufficient.
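
A short sketch of why the per-char check is safe even for characters outside the Basic Multilingual Plane. The class name AsciiCodeUnits is made up for illustration, and U+1F600 is just an arbitrary supplementary code point chosen as a test value:

```java
final class AsciiCodeUnits {

    // chars() streams the UTF-16 code units of the string as ints.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        // One code point above U+FFFF is stored as two chars (a surrogate pair).
        String supplementary = new String(Character.toChars(0x1F600));

        System.out.println(supplementary.length());                                // 2
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0)));    // true
        System.out.println(Character.isLowSurrogate(supplementary.charAt(1)));     // true

        // Both surrogates are far above 127, so neither can be mistaken for ASCII.
        System.out.println(isAscii(supplementary)); // false
        System.out.println(isAscii("Hello"));       // true
    }
}
```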

JeremyP