I want to mimic a piece of Python functionality in Java. In Python, if I want the unique characters in a string, I can just do:

text = "i am a string"
print set(text)  # output: set(['a', ' ', 'g', 'i', 'm', 'n', 's', 'r', 't'])

How can I do this in Java trivially or directly?

+5  A: 
import java.util.Arrays;
import java.util.HashSet;

String str = "i am a string";
System.out.println(new HashSet<String>(Arrays.asList(str.split(""))));

EDIT: For those who object that the two aren't exactly equivalent because str.split("") will include an empty string in the set, we can make it even more verbose:

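import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

String str = "i am a string";
Set<String> set = new HashSet<String>(Arrays.asList(str.split("")));
set.remove("");  // drop the empty string that split("") may produce
System.out.println(set);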

But of course it depends on what you need to accomplish.
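If what you need is a set of characters rather than one-letter strings, a plain loop avoids split("") entirely. This is only a minimal sketch: the class name UniqueChars and the helper uniqueChars are made up for illustration, and it works on UTF-16 char values, so characters outside the Basic Multilingual Plane would show up as two surrogate halves.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueChars {
    // Hypothetical helper, not part of the original answer.
    // Collects each distinct char of the input, keeping first-seen order.
    static Set<Character> uniqueChars(String text) {
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : text.toCharArray()) {
            seen.add(c); // autoboxed to Character; duplicates are ignored
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(uniqueChars("i am a string"));
    }
}
```

LinkedHashSet is used here only so the printed order follows the string; a HashSet would be closer to Python's unordered set.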

Yishai
Python wins for terseness.
Claudiu
This doesn't quite work because String.split("") includes an empty string in the result, e.g. "hello".split("") is '', 'h', 'e', 'l', 'l', 'o'.
mikej
@mikej Well, technically the output is '', 'h', 'e', 'l', 'o'. But I get your point. The current solution works for me, though, because I process the result of this operation further.
Chantz
@Chantz I was giving the output of the split before it goes into the Set, which is what removes the duplicates. Thanks.
mikej
Oh, yeah! My bad!
Chantz
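To make the exchange above concrete, here is a rough sketch that prints the raw array from split("") next to the de-duplicated set. The class and method names are invented for the example. One caveat: whether the leading empty string appears depends on the Java version (Java 8 and later no longer emit it for a zero-width match at the start), so the remove("") call is written to be harmless either way.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SplitUnique {
    // Hypothetical helper: unique one-letter strings of the input.
    static Set<String> uniqueLetters(String text) {
        // Copy into a HashSet so the result is mutable and de-duplicated.
        Set<String> set = new HashSet<>(Arrays.asList(text.split("")));
        set.remove(""); // no-op if split produced no empty string
        return set;
    }

    public static void main(String[] args) {
        // Raw split result: duplicates are still present at this point.
        System.out.println(Arrays.toString("hello".split("")));
        // After going through the set: one entry per distinct letter.
        System.out.println(uniqueLetters("hello"));
    }
}
```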
A: 

@Claudiu Unless you want to include the empty string, I guess? Terse is not a term that I want attached to my language... I look for code that is clear, explicit, and consistent, with minimal syntax and the ability to write close to fully factored code. Terse would be on the "other" list.

Bill K
For me, saying set("i am a string") is much clearer, more explicit, and more minimalist than the Java equivalent. The Java uses four separate classes, three methods, generics, etc. The Python code uses one function, which also happens to be a constructor.
Claudiu
My issue with it is that passing a string to a set is not obviously supposed to split the string into individual one-character strings. That is far from clear and certainly not explicit. If you wanted to make the Java more terse, you could drop the generics. I would agree, though, that the lack of an obvious way to split a string into an object representing each character is a little annoying, with String.split leaving an artifact you almost certainly didn't want.
Yishai
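On the point about splitting a string into one object per character: on Java 8 or later (an assumption, since the thread doesn't name a version), String.chars() gets fairly close to a one-liner. The sketch below is illustrative only; CharSetStream and toCharSet are invented names, and like any char-based approach it treats surrogate pairs as two separate values.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class CharSetStream {
    // Hypothetical helper, assuming Java 8+ streams are available.
    static Set<Character> toCharSet(String text) {
        return text.chars()                  // IntStream of UTF-16 code units
                   .mapToObj(c -> (char) c)  // narrow each int and box it to Character
                   .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Iteration order of the resulting set is not guaranteed.
        System.out.println(toCharSet("i am a string"));
    }
}
```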
set("i am string") tells me nothing. (I still don't have a clue what it might mean--setting something in the current Object, I guess? Where the hell did the set method even come from?) Perhaps it's just random stupid language syntax that means "Break a string down into characters and delete the ones that you don't like". I'd take 3 lines that actually told me what it was doing. Terse is not clear or explicit; it's simply minimalist. Not duplicating yourself--awesome. Organizing your code into understandable groups and isolated concepts--fantastic! Reading or typing less--meh.
Bill K