views:

196

answers:

2

I need a suggestion for an efficient data structure for the following problem.

I have two lists of students (male and female) with their respective classes (sorted by date) they have already taken. The list is already alphabetized with their last name, first name.

A user will give a student name, ie. student X, and what the program needs to do is to find which male or female student shares the most classes with student X.

+2  A: 

It sounds like you need Associative Arrays. An associative array maps a key object of some type to another object, possibly of a different type. It is often just called a "map" since that is what it does. A map can be implemented by a Hash Table, which will mean that you have constant time look up for your key -> object mappings. I'd actually suggest 2 hash tables for this problem. The first to map student names to lists of classes taken. The second to map class names to students who took the class. Your student -> class lookups will be really fast, and your class -> list-of-student lookups will be fast as well. In addition, when you are processing a particular student X, you can use a third to map a student name to an integer counting how many times they shared a class with student X. This will end up being a pretty efficient implementation.

Best of all, it may end up being a really easy implementation. Relational mapping is such a common task and associative arrays are so useful, that many languages have them built in, or in the standard library. Python has its dictionary, Perl has its hash, Java has a HashMap (and many other types of maps), C++ has std::map, though that is not backed by a hash table and doesn't have constant time access. Unless you are forbidden to use your languages standard library for this exercise, you shouldn't have too much trouble getting an efficient solution running.

A. Levy
+1 for the description of the hash table, but I'm not sure I'd use the second map for this problem.
joeslice
Any specific reason why you'd not the second map? You'd need an efficient lookup of who is taking a particular class, so using a map for that is a logical choice.
aberrant80
A: 

I would do this:

class Clazz {
    Date when;
    String name;
} 
class Student {
    boolean isFemale;
    String name;
    Map<Date, Clazz> classesTaken;
}
List<Student> students = ...

public Student findMostShared(Student who) {
    Student max = null;
    int maxCount = 0;
    for (Student s : students) {
        if (s == who) { continue; }
        int c = 0;
        for (Clazz z : s.classesTaken.values()) {
            if (who.classesTaken.containsKey(z.when)) {
                c++;
            }
        }
        if (c > maxCount) {
            maxCount = c;
            max = s;
        }
    }
    return max;
}
kd304