views:

53

answers:

2

Hello Friends,

I have an app that is a little bit slow. I thoutght it could be faster using threads.

So, here is my plan: My program have a list of objects of type X and each object X has a very big list of Integers (let's consider Integer for the sake of simplicity).

I have a static method (called getSubsetOfX) that receives a object X from the list of X's and return a list of Integers of the object X the list returned is a subset of all the Integers contained in X.

This method is called for every X contained in the list. Then I insert the returned List in a List of Integer Lists.

This is the code I explained in a compact version:

// Class of object X
public class X{
    public List<Integer> listX;
    ...
}

// Utility class
public class Util{
    // Return a sub-set of Integer contained in X
    public static List<Integer> getSubsetOfX(X x){...}
}


public class exec{
    public static void main(String args[]){
        // Let's suppose that lx is already filled with data!
        List<X> lx = new ArrayList<X>();

        // List of the subsets of integer
        List<List<Integer>> li = new ArrayList<ArrayList<Integer>>();

        for(X x : lx){
            // I want to turn this step "threadrized"
            li.add(getSubsetOfX(x));
        }
    }
}

I don't know if a List allow concurrent insertions. I don't know how to apply threads in it too. I read some about Threads, but, as the run() method doesn't return anything, how can turn the method getSubsetOfX(X x) parallel?

Can you help me doing this?

+1  A: 

I don't know if a List allow concurrent insertions.

See Class ArrayList:

Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:

List list = Collections.synchronizedList(new ArrayList(...));

But be careful: Synchronization comes with a significant performance cost. This could relativity the performance you get by using multiple threads (especially when the calculations are quite fast do do).
Thus, avoid accessing those synchronized collections wherever possible. Prefer thread-local lists instead which you can then merge with your shared list using AddAll.

winSharp93
Hi @winSharp93. Do you have some tip to help me turn this step in the program parallel? Regards
marionmaiden
Well - that really depends on your getSubsetOfX. However, using David Zaslavsky's example should be fine in many cases.
winSharp93
+3  A: 

Just to be clear, getSubsetOfX() is the call that takes a long time, right?

For this sort of task, I'd suggest you look at Java's Executors. The first step would be to create a Callable that runs getSubsetOfX(x) on a given instance of X. Something like this:

public class SubsetCallable implements Callable<List<Integer>> {
    X x;
    public SubsetCallable(X x) {
        this.x = x;
    }
    public List<Integer> call() {
        return Util.getSubsetOfX(x);
    }
}

Then you can create an ExecutorService using one of the methods in Executors. Which method to use depends on your available resources and your desired execution model - they're all described in the documentation. Once you create the ExecutorService, just create a SubsetCallable for each instance of X that you have and pass it off to the service to run it. I think it could go something like this:

ExecutorService exec = ...;
List<SubsetCallable> callables = new LinkedList<SubsetCallable>();
for (X x : lx) {
    callables.append(new SubsetCallable(x));
}
List<Future<List<Integer>>> futures = exec.invokeAll(lc);
for (Future<List<Integer>> f : futures) {
    li.add(f.get());
}

This way you can delegate the intense computation to other threads, but you still only access the list of results in one thread, so you don't have to worry about synchronization. (As winsharp93 pointed out, ArrayList, like most of Java's standard collections, is unsynchronized and thus not safe for concurrent access.)

David Zaslavsky
Yes @David, the slow method is getSubsetOfX(X x). Does this exec.invokeAll() runs the Callables parallel or linnear? Thanks
marionmaiden
It depends on which `ExecutorService` implementation you use. In general, I would expect it to invoke them in parallel, up to the number of concurrent tasks the `ExecutorService` allows. For instance, if you use a fixed thread pool with 3 threads, it should invoke the first 3 Callables in parallel, and whenever one of them finishes, the thread that was executing it takes the next Callable from the list. So there would always be 3 tasks running at any given time, as long as there are 3 or more tasks left to run.
David Zaslavsky