I have two lists (not Java lists; think of them as two columns).

For example

**List 1**            **List 2**
  milan                 hafil
  dingo                 iga
  iga                   dingo
  elpha                 binga
  hafil                 mike
  meat                  dingo
  milan
  elpha
  meat
  iga                   
  neeta.peeta    

I'd like a method that returns how many elements are the same in both lists. For this example it should return 3. It should also give me the values common to both lists and the values that differ.

Should I use a HashMap? If so, which method would give me this result?

Please help.

P.S: It is not a school assignment :) So if you just guide me it will be enough

A: 

Assuming two maps, hash1 and hash2, whose keys are the values of each list:

List<String> sames = new ArrayList<String>(); // keys found in both maps
List<String> diffs = new ArrayList<String>(); // keys of hash1 missing from hash2

for( String key : hash1.keySet() )
{
   if( hash2.containsKey( key ) )
   {
      sames.add( key );
   }
   else
   {
      diffs.add( key );
   }
}

// sames.size() is the number of similar elements.
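The loop above leaves open how hash1 and hash2 are built. As a minimal sketch, assuming each map goes from a value to the number of times it occurs in its list, the whole thing could look like the class below. The names `KeyCompare`, `countOccurrences` and `sames` are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyCompare {
    // Hypothetical helper: maps each value to how often it occurs in the list.
    public static Map<String, Integer> countOccurrences(List<String> values) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String v : values) {
            Integer old = counts.get(v);
            counts.put(v, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Hypothetical helper: keys of hash1 that are also keys of hash2.
    public static List<String> sames(Map<String, Integer> hash1, Map<String, Integer> hash2) {
        List<String> result = new ArrayList<String>();
        for (String key : hash1.keySet()) {
            if (hash2.containsKey(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> hash1 = countOccurrences(Arrays.asList(
            "milan", "dingo", "iga", "elpha", "hafil", "meat",
            "milan", "elpha", "meat", "iga", "neeta.peeta"));
        Map<String, Integer> hash2 = countOccurrences(Arrays.asList(
            "hafil", "iga", "dingo", "binga", "mike", "dingo"));

        List<String> same = sames(hash1, hash2);
        System.out.println(same.size()); // 3 distinct values are shared
        System.out.println(same);        // dingo, iga, hafil in some order
    }
}
```

Because the maps keep the occurrence counts, the duplicates are not lost: `hash1.get("iga")` is still 2 for the question's first list.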
Stefan Kendall
He wants the list of identical keys, not how many keys are identical. I think.
Rosdi
Thanks, Stefan, for your help. Yes, Rosdi is correct, and so are you. I need both the total number of similar values and the similar values themselves.
+2  A: 

Are these really lists (ordered, with duplicates), or are they sets (unordered, no duplicates)?

Because if it's the latter, then you can use, say, a java.util.HashSet<E> and do this in expected linear time using the convenient retainAll.

    List<String> list1 = Arrays.asList(
        "milan", "milan", "iga", "dingo", "milan"
    );
    List<String> list2 = Arrays.asList(
        "hafil", "milan", "dingo", "meat"
    );

    // intersection as set
    Set<String> intersect = new HashSet<String>(list1);
    intersect.retainAll(list2);
    System.out.println(intersect.size()); // prints "2"
    System.out.println(intersect); // prints "[milan, dingo]" (HashSet order is not guaranteed)

    // both lists concatenated, keeping only the intersecting values (duplicates preserved)
    List<String> intersectList = new ArrayList<String>();
    intersectList.addAll(list1);
    intersectList.addAll(list2);
    intersectList.retainAll(intersect);
    System.out.println(intersectList);
    // prints "[milan, milan, dingo, milan, milan, dingo]"

    // original lists are structurally unmodified
    System.out.println(list1); // prints "[milan, milan, iga, dingo, milan]"
    System.out.println(list2); // prints "[hafil, milan, dingo, meat]"
polygenelubricants
Well, I really don't know which data structure it should be. It has duplicates. You can see the updated question now.
Will it remove the repeated values from the data set? I don't want to lose any values :(
@agazerboy: I've tried to address both questions. Feel free to ask for more clarifications.
polygenelubricants
Thanks, poly. I tried your program with duplicates. For example, I added "iga" twice to the first list, but it still returns 3 as the answer, while it should now be 4 because list 1 has 4 similar values. It should still work if I add one entry multiple times. What do you say? Any other data structure?
@agazerboy: try the latest edit.
polygenelubricants
Thanks for your help. +1 :)
+5  A: 

EDIT

Here are two versions. One uses ArrayList and the other uses HashSet.

Compare them and build your own version from them until you get what you need.

This should be enough to cover the:

P.S: It is not a school assignment :) So if you just guide me it will be enough

part of your question.

Continuing with the original answer:

You may use a java.util.Collection and/or java.util.ArrayList for that.

The retainAll method does the following:

Retains only the elements in this collection that are contained in the specified collection

See this sample:

import java.util.Collection;
import java.util.ArrayList;
import java.util.Arrays;

public class Repeated {
    public static void main( String  [] args ) {
        Collection<String> listOne = new ArrayList<String>(Arrays.asList("milan","dingo", "elpha", "hafil", "meat", "iga", "neeta.peeta"));
        Collection<String> listTwo = new ArrayList<String>(Arrays.asList("hafil", "iga", "binga", "mike", "dingo"));

        listOne.retainAll( listTwo );
        System.out.println( listOne );
    }
}

EDIT

For the second part (the different values) you may use the removeAll method:

Removes all of this collection's elements that are also contained in the specified collection.

This second version gives you both the similar and the different values, and it handles repeated values (by discarding them).

This time the Collection can be a Set instead of a List (the difference is that a Set doesn't allow repeated values).

import java.util.Collection;
import java.util.HashSet;
import java.util.Arrays;

class Repeated {
      public static void main( String  [] args ) {

          Collection<String> listOne = Arrays.asList("milan","iga",
                                                    "dingo","iga",
                                                    "elpha","iga",
                                                    "hafil","iga",
                                                    "meat","iga", 
                                                    "neeta.peeta","iga");

          Collection<String> listTwo = Arrays.asList("hafil",
                                                     "iga",
                                                     "binga", 
                                                     "mike", 
                                                     "dingo","dingo","dingo");

          Collection<String> similar = new HashSet<String>( listOne );
          Collection<String> different = new HashSet<String>();
          different.addAll( listOne );
          different.addAll( listTwo );

          similar.retainAll( listTwo );
          different.removeAll( similar );

          System.out.printf("One:%s%nTwo:%s%nSimilar:%s%nDifferent:%s%n", listOne, listTwo, similar, different);
      }
}

Output:

$ java Repeated
One:[milan, iga, dingo, iga, elpha, iga, hafil, iga, meat, iga, neeta.peeta, iga]

Two:[hafil, iga, binga, mike, dingo, dingo, dingo]

Similar:[dingo, iga, hafil]

Different:[mike, binga, milan, meat, elpha, neeta.peeta]

If it doesn't do exactly what you need, it gives you a good start so you can take it from here.

Question for the reader: How would you include all the repeated values?
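One possible answer, as a rough sketch rather than the only way: walk the first list in order and keep every element that also occurs in the second list, using a HashSet of the second list only for the lookup. The class and method names (`RepeatedKept`, `similarWithRepeats`) are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RepeatedKept {
    // Returns the elements of listOne that also occur in listTwo,
    // keeping listOne's order and all of its repeated values.
    public static List<String> similarWithRepeats(List<String> listOne, List<String> listTwo) {
        Set<String> lookup = new HashSet<String>(listTwo); // fast membership test
        List<String> result = new ArrayList<String>();
        for (String s : listOne) {
            if (lookup.contains(s)) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listOne = Arrays.asList("milan", "iga", "dingo", "iga",
                                             "elpha", "iga", "hafil", "iga",
                                             "meat", "iga", "neeta.peeta", "iga");
        List<String> listTwo = Arrays.asList("hafil", "iga", "binga", "mike",
                                             "dingo", "dingo", "dingo");

        List<String> similar = similarWithRepeats(listOne, listTwo);
        System.out.println(similar.size()); // prints "8"
        System.out.println(similar);
        // prints "[iga, dingo, iga, iga, hafil, iga, iga, iga]"
    }
}
```

Note that this counts repeats from the first list only; the three "dingo" entries in the second list contribute a single match.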

OscarRyz
@Oscar, my exact thought, though I wasn't sure whether we were allowed to modify the contents of `listOne`. +1 anyway!
Anthony Forloney
You shouldn't use raw types.
polygenelubricants
@polygenelubricants what do you mean by *raw types*, not using generics? Why not?
OscarRyz
Oscar, did you see my updated question? Does it support repeated values?
@Oscar: http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.8 "The use of raw types in code written after the introduction of genericity into the Java programming language is strongly discouraged. It is possible that future versions of the Java programming language will disallow the use of raw types."
polygenelubricants
@agazerboy, I'm updating the answer right now to also include the "similar" values
OscarRyz
@polygenelubricants answer updated to handle duplicates and raw types. BTW, the *..future version of Java...* is never going to happen. ;)
OscarRyz
Hi Oscar, thanks for your help. I think your first version works well. In your second version, repeated values are skipped and the result shows 3 similar values, even though iga appears 3 times in list one. So the total should be 5 or more similar values. I hope my point is clear. What do you say?
So, do you mean that repeated values should appear? iga is repeated 6 times in the first list; should it show up those 6 times? That's easy, just replace `HashSet` with `ArrayList` and you're done. BTW, **I think that with this help you can figure out the rest.**
OscarRyz
With that change ( s/HashSet/ArrayList ) you should get this output: http://pastebin.com/kxTWNeUx
OscarRyz
Oscar, thanks for your help. It is a very good start for solving a BIG PROBLEM :)
+1  A: 

Did you try the intersection() and subtract() methods from CollectionUtils (Apache Commons Collections)?

intersection() gives you a collection containing the common elements, and subtract() gives you the elements of one collection that are left after removing those of the other.

They should also take care of similar elements
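To illustrate the idea without pulling in the library, here is a JDK-only sketch of duplicate-aware intersection and subtraction. It is my own approximation of the behaviour described above, not the CollectionUtils implementation, and the element order of the real library results may differ. `BagOps` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BagOps {
    // Counts how often each value occurs.
    private static Map<String, Integer> counts(List<String> values) {
        Map<String, Integer> result = new HashMap<String, Integer>();
        for (String v : values) {
            Integer old = result.get(v);
            result.put(v, old == null ? 1 : old + 1);
        }
        return result;
    }

    // Each common value appears min(count in a, count in b) times.
    public static List<String> intersection(List<String> a, List<String> b) {
        Map<String, Integer> remaining = counts(b);
        List<String> result = new ArrayList<String>();
        for (String s : a) {
            Integer n = remaining.get(s);
            if (n != null && n > 0) {
                result.add(s);
                remaining.put(s, n - 1); // this occurrence in b is used up
            }
        }
        return result;
    }

    // a minus b: every occurrence in b cancels one occurrence in a.
    public static List<String> subtract(List<String> a, List<String> b) {
        Map<String, Integer> remaining = counts(b);
        List<String> result = new ArrayList<String>();
        for (String s : a) {
            Integer n = remaining.get(s);
            if (n != null && n > 0) {
                remaining.put(s, n - 1); // cancelled, not kept
            } else {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("iga", "iga", "dingo", "milan");
        List<String> b = Arrays.asList("iga", "dingo", "dingo");
        System.out.println(intersection(a, b)); // prints "[iga, dingo]"
        System.out.println(subtract(a, b));     // prints "[iga, milan]"
    }
}
```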

Mihir Mathuria