views: 414 · answers: 6

Hi all,
Just a quick question, really. Say I have a 10x10 array in Java, some of whose elements are unused, and I need to go through every element as part of a method. Would it be better to:

  1. Go through all elements with two for loops and check for null to avoid errors, e.g.

    for(int y=0;y<10;y++){
        for(int x=0;x<10;x++){
            if(array[x][y]!=null){
                //perform task here
            }
        }
    }
    
  2. Or would it be better to keep a list of all the used addresses, say an ArrayList of points?

  3. Something different I haven't mentioned.

Thanks all, I look forward to any answers :)

+1  A: 

You're better off using a List rather than an array, especially since you may not use the whole set of data. This has several advantages:

  1. You're not checking for nulls, and you can't accidentally try to use a null object.
  2. It's more memory efficient, in that you're not allocating memory which may never be used.
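
For illustration, a minimal sketch of what that might look like (Foo stands in for whatever element type you're storing; java.util.List and java.util.ArrayList imports assumed):

    List<Foo> cells = new ArrayList<Foo>();
    // add only the cells that are actually in use
    cells.add(new Foo());

    for (Foo cell : cells) {
        // perform task here - no null check needed,
        // since only live cells were ever added
    }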
AlbertoPL
I opted for a multi-dimensional array as all places will most likely be used. The work I'm doing is for the Android platform, and object creation is quite expensive, so it's best to reserve the space up front for real-time performance.
Laurence Dawson
With a modern JVM, null checks are very fast, so this is not a problem. He only needs to remember to check for them.
quant_dev
The code will be run on the Dalvik virtual machine.
Laurence Dawson
OK, I see. In that case you're dealing with the classic space-versus-performance trade-off. If you need the performance, just swallow the memory hit.
AlbertoPL
On the other hand, code branches may decrease performance, so he may be better off with a solution which avoids conditional execution of the code.
quant_dev
Real-time performance is key here; space is not an issue.
Laurence Dawson
I recommend editing the question to say you're optimizing for performance, just so people don't have to dig through these comments to see the type of optimization you're looking for.
AlbertoPL
A: 

Holding an ArrayList of points would be "over-engineering" the problem. You have a multi-dimensional array; the best way to iterate over it is with two nested for loops. Unless you can change the representation of the data, that's roughly as efficient as it gets.

Just make sure you go in row order, not column order: keep the inner loop on the second index, so consecutive accesses stay within the same row's backing array.

Malaxeur
My thoughts exactly, especially as an ArrayList copies its contents to a new backing array whenever it outgrows its capacity, and removals shift the remaining elements...
Laurence Dawson
A: 

Depends on how sparse/dense your matrix is.

If it is sparse, you're better off storing a list of points; if it is dense, go with the 2D array. If it's somewhere in between, you can use a hybrid solution storing a list of sub-matrices.

This implementation detail should be hidden within a class anyway, so your code can switch between any of these representations at any time.
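
A rough sketch of that hiding, with made-up names (Grid and Cell are illustrative, not from the question):

    // Hypothetical wrapper; callers never see the representation.
    class Grid {
        private final Cell[][] dense;   // for mostly-full grids
        // an alternative field for mostly-empty grids:
        // private List<Point> sparse;

        Grid(int width, int height) {
            dense = new Cell[width][height];
        }

        Cell get(int x, int y) { return dense[x][y]; }

        void set(int x, int y, Cell value) { dense[x][y] = value; }
    }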

I would discourage you from settling on any of these solutions without profiling your real application.

Zed
Most likely it's going to be dense; it's for a Puzzle Bobble clone for the Android platform.
Laurence Dawson
+1  A: 

For a hundred elements, it's probably not worth using any of the classic sparse array implementations. However, you don't say how sparse your array is, so profile it and see how much time you spend skipping null items compared to whatever processing you're doing.

(As Tom Hawtin - tackline mentions) when using an array of arrays, you should try to loop over the members of each array rather than looping over the same index of different arrays. Not all algorithms allow you to do that, though.

for ( int x = 0; x < 10; ++x ) {
    for ( int y = 0; y < 10; ++y ) {
        if ( array[x][y] != null ) {
            // perform task here
        }
    }
}

or

for ( Foo[] row : array ) {
    for ( Foo item : row ) {
        if ( item != null ) {
            // perform task here
        }
    }
}

You may also find it better to use a null object rather than testing for null, depending on the complexity of the operation you're performing. Don't use the polymorphic version of the pattern - a polymorphic dispatch will cost at least as much as a test and branch - but if you were, say, summing a property, an object holding a zero is probably faster on many CPUs; a sketch of such a null object follows the loop below.

double sum = 0;

for ( Foo[] row : array ) {
    for ( Foo item : row ) {
       sum += item.value();
    }
}
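
For instance, a minimal non-polymorphic null object might look like this (Foo and value() are just the placeholder names used above):

    class Foo {
        // Shared sentinel that contributes nothing to the sum,
        // so the loop needs no null test and no branch.
        static final Foo NULL = new Foo(0.0);

        private final double value;

        Foo(double value) { this.value = value; }

        double value() { return value; }
    }

Empty slots are then initialised to Foo.NULL instead of null.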

As to what applies to Android, I'm not sure; again, you need to test and profile for any optimisation.

Pete Kirkham
+1 for telling Dawson to profile :)
Soonil
A: 

I agree an array with a null test is the best approach unless you expect sparsely populated arrays.

Reasons for this:

1. More memory efficient for dense arrays (a list needs to store the indices as well as the values).

2. More computationally efficient for dense arrays (you need only compare the value you just retrieved to null, instead of also fetching the index from memory).

Also, a small suggestion, but in Java especially you are often better off faking a multi-dimensional array with a 1D array where possible (square/rectangular arrays in 2D); see the sketch below. Bounds checking then happens only once per iteration, instead of twice. Not sure if this still applies in the Android VM, but it has traditionally been an issue. Regardless, you can ignore it if the loop is not a bottleneck.
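
A quick sketch of the 1D trick, assuming a rectangular grid (WIDTH and HEIGHT are illustrative names):

    final int WIDTH = 10, HEIGHT = 10;
    Foo[] grid = new Foo[WIDTH * HEIGHT];   // instead of new Foo[WIDTH][HEIGHT]

    // one loop, one bounds check per access;
    // element (x, y) lives at index x * HEIGHT + y
    for (int i = 0; i < grid.length; i++) {
        if (grid[i] != null) {
            // perform task here
        }
    }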

patros
+3  A: 

Any solution you try needs to be tested in controlled conditions that resemble the production environment as closely as possible. Because of the nature of Java, you need to exercise your code a bit to get reliable performance stats, but I'm sure you know that already.

That said, there are several things you may try, which I've used to optimize my Java code with success (though not on the Android VM).

for(int y=0;y<10;y++){
    for(int x=0;x<10;x++){
        if(array[x][y]!=null){
            //perform task here
        }
    }
}

should in any case be reworked into

for(int x=0;x<10;x++){
    for(int y=0;y<10;y++){
        if(array[x][y]!=null){
            //perform task here
        }
    }
}

so that the inner loop walks a single row array (array[x]) sequentially, which improves locality of reference.

Often you will get a performance improvement from caching the row reference. Let us assume the array is of type Foo[][]:

for(int x=0;x<10;x++){
    final Foo[] row = array[x];
    for(int y=0;y<10;y++){
        if(row[y]!=null){
            //perform task here
        }
    }
}

Using final with variables was supposed to help the JVM optimize the code, but I think modern JIT compilers can in many cases figure out on their own whether a variable is changed in the code. On the other hand, sometimes this may be more efficient, although it definitely takes us into the realm of micro-optimizations:

Foo[] row;
for(int x=0;x<10;x++){
    row = array[x];
    for(int y=0;y<10;y++){
        if(row[y]!=null){
            //perform task here
        }
    }
}

If you don't need to know the element's indices in order to perform the task, you can write this as:

for(final Foo[] row: array){
    for(final Foo elem: row){
        if(elem!=null){
            //perform task here
        }
    }
}

Another thing you may try is to flatten the array and store the elements in a Foo[] array, ensuring maximum locality of reference. You have no inner loop to worry about, but you need to do some index arithmetic when referencing particular array elements (as opposed to looping over the whole array); see the sketch below. Depending on how often you do that, it may or may not be beneficial.
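
For example (flat, width, height, x and y are assumed names, not from the question):

    // element (x, y) of the flattened array lives at index x * height + y
    final Foo elem = flat[x * height + y];

    // a full pass needs no index arithmetic at all:
    for(final Foo e: flat){
        if(e!=null){
            //perform task here
        }
    }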

Since most of the elements will be non-null, keeping them in a sparse structure is not beneficial for you, as you lose locality of reference.

Another problem is the null test. The null test itself doesn't cost much, but the conditional statement following it does, as you get a branch in the code and lose time on branch mispredictions. What you can do is use a "null object", on which the task can still be performed but amounts to a no-op or something equally benign; one way to set that up is sketched below. Depending on the task you want to perform, it may or may not work for you.
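
For instance, assuming a shared benign sentinel NULL_FOO and a hypothetical performTask() method (both names are illustrative):

    // initialise every cell with the sentinel instead of null
    for(final Foo[] row: array){
        java.util.Arrays.fill(row, NULL_FOO);
    }

    // the hot loop then runs branch-free:
    for(final Foo[] row: array){
        for(final Foo elem: row){
            elem.performTask();   // a no-op on NULL_FOO
        }
    }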

Hope this helps.

quant_dev
Cheers, always good to see another opinion on this sorta thing just to be sure. Thanks to everyone else too!
Laurence Dawson
I consider it somewhat dangerous to give this advice without any profiling. Making broad assertions about the optimizing behavior of the JVM is a recipe for being wrong. My understanding is that 'final' is no longer a significant optimization; the JVM is capable of the same optimization even without the keyword. I am also skeptical that Dawson will realize a gain by removing the if-null check. It would really have to be bare-bones code to overcome the cost of a BNE, especially since the element will -not- usually be null. Bottom line: profile first, assertions second :)
Soonil
Did you read the first two sentences of my answer, Soonil?
quant_dev
PS. Same question about the last paragraph... I wrote that the null check itself should be cheap.
quant_dev