ansaurus

Question

Java: Collection RemoveAll Ignoring Case?

Answer 1

+1 A:

You can use a hashmap and use the capital set as keys that map to the mixed case set.

Keys of hashmaps are unique and you can get a set of them using HashMap.keyset();

to retrieve the original case, it's as simple as HashMap.get("UPPERCASENAME").

And according to the documentation:

Returns a set view of the keys contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. The set supports element removal, which removes the corresponding mapping from this map, via the Iterator.remove, Set.remove, removeAll, retainAll, and clear operations. It does not support the add or addAll operations.

So HashMap.keyset().removeAll will effect the hashmap :)

EDIT: use McDowell's solution. I overlooked the fact that you didn't actually need the letters to be upper case :P

Charles Ma 2009-08-06 21:09:42

Answer 2

A:

as far as i know, hashset's use the object's hashCode-method to distinct them from each other. you should therefore override this method in your object in order to distinct cases.

if you're really using string, you cannot override this method as you cannot extend the String-class.

therefore you need to create your own class containing a string as attribute which you fill with your content. you might want to have a getValue() and setValue(String) method in order to modify the string.

then you can add your own class to the hashmap.

this should solve your problem.

regards

Atmocreations 2009-08-06 21:10:29

Answer 3

+7 A:

In my original answer, I unthinkingly suggested using a Comparator, but this causes the TreeSet to voilate the equals contract and is a bug waiting to happen:

// Don't do this:
Set<String> setA = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
setA.add("hello");
setA.add("Hello");
System.out.println(setA);

Set<String> setB = new HashSet<String>();
setB.add("HELLO");
// Bad code; violates symmetry requirement
System.out.println(setB.equals(setA) == setA.equals(setB));

It is better to use a dedicated type:

public final class CaselessString {
  private final String string;
  private final String normalized;

  private CaselessString(String string, Locale locale) {
    this.string = string;
    normalized = string.toUpperCase(locale);
  }

  @Override public String toString() { return string; }

  @Override public int hashCode() { return normalized.hashCode(); }

  @Override public boolean equals(Object obj) {
    if (obj instanceof CaselessString) {
      return ((CaselessString) obj).normalized.equals(normalized);
    }
    return false;
  }

  public static CaselessString as(String s, Locale locale) {
    return new CaselessString(s, locale);
  }

  public static CaselessString as(String s) {
    return as(s, Locale.ENGLISH);
  }

  // TODO: probably best to implement CharSequence for convenience
}

This code is less likely to cause bugs:

Set<CaselessString> set1 = new HashSet<CaselessString>();
set1.add(CaselessString.as("Hello"));
set1.add(CaselessString.as("HELLO"));

Set<CaselessString> set2 = new HashSet<CaselessString>();
set2.add(CaselessString.as("hello"));

System.out.println("1: " + set1);
System.out.println("2: " + set2);
System.out.println("equals: " + set1.equals(set2));

This is, unfortunately, more verbose.

McDowell 2009-08-06 21:16:59

No need to roll your own Comparator. The String class provides one for you: http://java.sun.com/javase/6/docs/api/java/lang/String.html#CASE_INSENSITIVE_ORDER

banjollity 2009-08-06 22:04:33

@bankollity. Thanks! - that's been there since Java 1.2 and I've never noticed it. Code amended.

McDowell 2009-08-07 09:13:04

Wow that was extremely simple to implement although the documentation leads you to believe that the comparator is used solely for sorting. TreeSet(Comparator c) : Constructs a new, empty set, sorted according to the specified comparator. http://java.sun.com/j2se/1.4.2/docs/api/java/util/TreeSet.html#TreeSet%28java.util.Comparator%29 . I am glad it worked though, Thank you so much for your response!

2009-08-07 13:42:19

I should perhaps note that String.CASE_INSENSITIVE_ORDER should be OK for English-language identifiers, but isn't something you should rely on for non-English comparison.

McDowell 2009-08-26 14:12:41

I'm not sure what I was thinking when I wrote this answer - this violates the contract of `Set`. I'll have to come back with a better solution.

McDowell 2009-12-21 13:07:48

+1 for providing another answer which doesn't only do what needs to be done, but is also correct (as in obeys requirements).

extraneon 2009-12-22 10:25:03

Answer 4

A:

This would be an interesting one to solve using google-collections. You could have a constant Predicate like so:

private static final Function<String, String> TO_UPPER = new Function<String, String>() {
    public String apply(String input) {
       return input.toUpperCase();
}

and then what you're after could be done someting like this:

Collection<String> toRemove = Collections2.transform(list2, TO_UPPER);

Set<String> kept = Sets.filter(list1, new Predicate<String>() {
    public boolean apply(String input) {
        return !toRemove.contains(input.toUpperCase());
    }
}

That is:

Build an upper-case-only version of the 'to discard' list
Apply a filter to the original list, retaining only those items whose uppercased value is not in the upper-case-only list.

Note that the output of Collections2.transform isn't an efficient Set implementation, so if you're dealing with a lot of data and the cost of probing that list will hurt you, you can instead use

Set<String> toRemove = Sets.newHashSet(Collections2.transform(list2, TO_UPPER));

which will restore an efficient lookup, returning the filtering to O(n) instead of O(n^2).

Cowan 2009-12-22 10:20:26

ansaurus

tags:

views:

answers:

Java: Collection RemoveAll Ignoring Case?

related questions