views:

178

answers:

6

Not quite sure how to word this question. I am wondering if there is a method to check certain parts of a custom java class to see if it matches a certain criteria. Such as this

public Name(String forename, String middlename, String surname)

And then when an array of instances of that class are created say,

Name[] applicants = new Name[4];

applicants[0] = new Name("john","bob", "rush");
applicants[1] = new Name("joe","bob", "rushden");
applicants[2] = new Name("jack","bob", "rushden");
applicants[3] = new Name("jake","bob", "rushden");

Is it possible to do a search over the instances of the class for person with

midddlename.equals("bob") && surname.equals("rush")

I am not really looking for a solution that is if(surname.equals("bob")) then else,etc

But more a in-built java class that allows for rapid searching over the array. the speed of this is very important.

A: 

If you need to search based on the object equality over array check apache common ArrayUtils, You basically have to override your equals and hascode for name object and use it, but if you want to use custom search criteria, I guess you have to implement your own way and there is no built in java language support

Teja Kantamneni
+9  A: 

There isn't built in support, but Apache Collections and Google Collections both provide Predicate support over collections.

You may find this question and its answers helpful. Same with this developer.com article.

e.g. Using Google Collections:

final Predicate<name> bobRushPredicate = new Predicate<name>() {
   public boolean apply(name n) {
      return "bob".equals(n.getMiddlename()) && "rush".equal(n.getSurname());
   }
}

final List<name> results = Iterables.filter(applicants, bobRushPredicate));
Mike
is there anywhere to get tutorials or more examples on using predicates from google colletions?
pie154
A: 

Use an in memory database like Apache Derby or hsqldb. Take advantage of JDBC, JPA, or Hibernate, which can all do what you want.

Profile your code. Then optimize.

Rob Spieldenner
A: 

Searching through an array and "speed is very important" don't really go together. Unless if your array will be very small then searching through an array will never be quick. This is the equivalent of a full table scan in a database, performance no matter how you go about it will be poor. The key to finding things quickly is to use an indexed structure. You can still have an array if you absolutely need it but the searching should be done using another data structure. Check out a Hash or Tree based collection since they organize data in a way that make it very fast to retrieve. TreeSet, TreeMap, HashSet, HashMap, etc. Hashes index data on a hashed key, Trees are similar but also store their data in a sorted order.

Just a minor note, but the structures you are referring to, Trees and Maps, are still based on arrays. They just use specific methods to search the array.
Matt
There are ways to search array faster, first - sorting an array once will speed up search from linear to logarithmic, second - it is one of the 'embarassingly parralelizable' problems (see my answer above).
ddimitrov
A: 

The faster way I can think of, is to create a data structure which mirrors this objects property values and hold the internal index for each value has.

When a value is searched, this internal data structure will return the index using binary search.

The only requirement is your object must register and update this structure.

Something like the following imaginary UML/Python like code:

 // Holds the index number of a given value
 // for instance, name="Oscar" may be at index 42...
 IndexValuePair
     index : Int
     value : String 

     +_ new( value: String, index: Int ) 
          return IndexValuePair( value, index )

 ValuePairComparator --> Comparator 

     + compareTo( a: IndexValuePair, b: IndexValuePair ) : Int 

         return a.value.compareTo( b.value )

 SearchStructure
     - data = Object[] // The original array which contains your applicants
      // a list of arrays each one containing the property value, and the index on "data" where that value appears 
     - dataIndexes =  List(IndexValuePair)[String] // Map<List<IndexValuePair>> 
     - dataIndexexInitialized = false

     // Add an object to this structure
     + addObject( o: Object ) 
          if( ! dataIndexesInitialized, 
              initIndexesWith( o )
          )

          index = data.add( o ) // returns the index at which "o" was inserted
          addToIndexes( o, index ) 

     // Register all the properties values of the given object 
     // along with the index where they appear in the original array 
     - addToIndexes( object: Object, index: Int ) 
           forEach( property in Object , 
              list = dataIndexes[property]
              list.add( IndexValuePair.new( property.value, index ) ) 
           )
     // Create empty array for each property .. 
     - initIndexesWith( object : Object ) 
          forEach( property in object , 
                comparator = ValuePairComparator()
                list = List<IndexValuePair>()
                list.setComparator(  ) 
                dataIndexes[property] =  list
          )
          dataIndexesInitialized = true 


     // Search an object using the given criteria ( a Map<String, String> = key=value ) 
     + search( criteria: String[String] ) : List<Object>

        result = Set<Object>()

        // let's say criteria has:
        // ["name":"Oscar", "lastName"="Reyes"]
       forEach( key in criteria, 
            list = dataIndexes[key]  // "name", "lastname" ..etc. 
            valuePair = list.binarySearch( criteria[key] ) // first Oscar, later Reyes 
            result.add( data[valuePair.index] )
       ) 

       return result

Oops

I hope this is understandable.

The point is, if you really what to have this really fast, you have to hold the indexes by property

  1. An array for the data
  2. An array for each property, which in turn would have the index of data

For instance if you have the following array:

 a = [ Object(name="Mike", lastName="Z" )
       Object(name="Oscar", lastName="Reyes" ) , 
       Object(name="Rahul", lastName="G" ) , 
       Object(name="Pie", lastName="154" )  ]

They would have the positions:

0 = Mike ... 
1 = Oscar ...
2 = Rahul ...
3 = Pie ...

And you'll have two ( in this case ) separate arrays which after being sorted would be:

nameArray =  ["Mike=0", "Oscar=1", "Pie=3", "Rahul=2"]

and

lastNameArray =   ["154=3", "G=2", "Reyes=1", "Z=0"]

When you search for a given attribute, you take the corresponding array, for instance, if you wan to search the last name "Reyes" you'll take "lastName" array

 ["154=3", "G=2", "Reyes=1", "Z=0"]

And will perform binarySearch on it for "Reyes" which will return the element at position 2, which in turn will return the index = 1 whih is the position "Oscar" has in the original array.

This should keep things under O(log n)

OscarRyz
A: 

Look at ParallelArray class, it satisfies your requirements, but you need to learn a bit of functional programming concepts to use it efficiently.

The class does not come with JDK 6, but might come with JDK 7 (under discussion). Meanwhile you can use it as a library - download the JSR166y package from: http://gee.cs.oswego.edu/dl/concurrency-interest/

See this tutorial for detailed explanation: http://www.ibm.com/developerworks/java/library/j-jtp03048.html

It might sound complicated and it is (if you arew just digging in high performance multi-threaded algorithms). There is a Groovy project which tries to wrap a more user-friendly API around Parallel Array, so you might want to ttake a look at it as well: http://gpars.codehaus.org/ , http://gpars.codehaus.org/Parallelizer

ddimitrov