views:

112

answers:

7

I'm trying to create 5 ArrayLists of various size, fill them with random numbers between 0 and 1, and then time (and print) how long it takes to iterate through each.

I think I've initialized and filled them correctly. But when I iterate through and check the time, the numbers are off. I get something like (0, 0, 2, 0, 0 seconds) when they should be increasing.

Here's the code, any ideas?

    import java.util.*;    // Imports all java.util subclasses.
public class Lab2 {

  public static void main (String[] args)
  {
    System.out.println("#2");

    int size1 = 100;
    int size2 = 1000;
    int size3 = 10000;
    int size4 = 100000;
    int size5 = 1000000;

    // ArrayList 100.
    ArrayList<Double> arrList1 = new ArrayList<Double>();
    for (int i = 0; i < size1; i++)
    {
        arrList1.add(Math.random());
    }

    long startTime11 = System.currentTimeMillis();
    for (int i = 0; i < arrList1.size(); i++);
    long total11 = System.currentTimeMillis() - startTime11;
    System.out.println("arrList1 Elapsed Time:" + total11);

    // ArrayList 1000.
    ArrayList<Double> arrList2 = new ArrayList<Double>();
    for (int i = 0; i < size2; i++)
    {
        arrList2.add(Math.random());
    }
    long startTime12 = System.currentTimeMillis();
    for (int i = 0; i < arrList2.size(); i++);
    long total12 = System.currentTimeMillis() - startTime12;
    System.out.println("arrList2 Elapsed Time:" + total12);

    // ArrayList 10000.
    ArrayList<Double> arrList3 = new ArrayList<Double>();
    for (int i = 0; i < size3; i++)
    {
        arrList3.add(Math.random());
    }

    long startTime13 = System.currentTimeMillis();
    for (int i = 0; i < arrList3.size(); i++);
    long total13 = System.currentTimeMillis() - startTime13;
    System.out.println("arrList3 Elapsed Time:" + total13);

    // ArrayList 100000.
    ArrayList<Double> arrList4 = new ArrayList<Double>();
    for (int i = 0; i < size4; i++)
    {
        arrList4.add(Math.random());
    }

    long startTime14 = System.currentTimeMillis();
    for (int i = 0; i < arrList4.size(); i++);
    long total14 = System.currentTimeMillis() - startTime14;
    System.out.println("arrList4 Elapsed Time:" + total14);

    // ArrayList 1000000.
    ArrayList<Double> arrList5 = new ArrayList<Double>();
    for (int i = 0; i < size5; i++)
    {
        arrList5.add(Math.random());
    }

    long startTime15 = System.currentTimeMillis();
    for (int i = 0; i < arrList5.size(); i++);
    long total15 = System.currentTimeMillis() - startTime15;
    System.out.println("arrList5 Elapsed Time:" + total15);
+3  A: 

You aren't doing anything in your loops, so they are likely being optimized out by the compiler and not being executed at all.

ColinD
What do you recommend I do? The directions were just to iterate through the loop and check how long it took.
Slizzard
Maybe the intent was that you discover this! I don't really know. You can also do some arbitrary but relatively fast computation in there to force the loops to execute (for example, compute the sum of the contents of each list). Obviously, the time for that computation would be included in the result, but there isn't really any way to determine how long it takes to JUST iterate through a list.
ColinD
+3  A: 

Not sure why some of them complete before others unless it has to do with number caching by the JVM, but I can say that using nanoTime will yield better metrics. http://ideone.com/ktCXP

Mondain
GC could explain some counter-intuitive results. If the GC kicks in while doing the 10,000 element iteration, then it might push the time to 2.000001ms. The 1000000 iteration (without GC) might take 0.9999999999ms. The tests might have been run on a multi-user computer system. Because of background noise, to get meaningful results, you need repeated measurements and statistical analysis.
emory
+2  A: 

Yeah, your loops are being optimized out--be careful Java is smart and may even optimize out things like:

{
    int a;
    for(int b=0;b<10000;b++)
        a++;
}

because it can know that you never use the value in a!

Also, please change the first part of your program to this:

int[] sizes = new int[] {100,1000,10000,100000,1000000};

and then use in your code:

for(int size:sizes) {
    ArrayList<Double> arrList = new ArrayList<Double>();
    for (int i = 0; i < size; i++)
    {
        arrList.add(Math.random());
    }
    long startTime = System.currentTimeMillis();

    double sum=0;

    for (int i = 0; i < arrList.size(); i++)
        sum=sum+arrList.get(i); // may need casting here

    long total = System.currentTimeMillis() - startTime;
    System.out.println("Loop for size:"+size+"  Elapsed Time:" + total);
}

The way you copied and pasted and edited all that crap hurts me all over. It's so wrong to even do it once--even just as a first pass.

If you ever copy and paste when you are coding, there is a 95% chance you are doing it wrong. If you ever have variables named with numbers at the end, you're almost certainly doing it wrong. If you find yourself editing in some kind of a pattern, like changing 1's to 2's on the end of a variable in sets of code that look similar you are absolutely positively beyond a shadow of a doubt doing it wrong.

You'll see this when you try to fix the code you wrote vs doing it the way I specified above.

---Another edit--- Modified it to have it do something with the resultant value--in this case it sums them up as it iterates over it. Sorry about before, I'm too quick to scan over thing, but this should at least have something happen.

Bill K
It is not obvious to me that the semi-colon was a mistake. The assignment was to measure the iteration time. Without the semi-colon, the program is measuring iteration, assignment, and arithmetic time.
emory
Oh crap, you're right. Sorry. I'll patch it to something that will work.
Bill K
+1  A: 
Bert F
javac is not optimizing away the loop. At runtime, the byte code may be compiled to native. At this point, it may be optimized away. I have no idea how you would test for that, other than by measuring execution time.
emory
Ah - I didn't think about the Just-In-Time compiler. Excellent point. Still, I believe the results Mondain and I got from using `nanoTime()` instead of `currentTimeMillis()` seem to suggest the loops are there.
Bert F
A: 

The goal of the exercise may be to observe the vagaries of micro-benchmarking, as discussed in Anatomy of a flawed microbenchmark and Writing a Java Micro-benchmark.

trashgod
+1  A: 

U should

  1. Write a method to the iteration. This method can take the size as a parameter. Create the proper sized list and time its iteration. As Bill K sez copy/paste bad, methods good.
  2. Measure in nanoseconds (link text)
  3. Statistical analysis. Let Y be the iteration time. Let X be the size of the array. Make many measurements of Y for different values of X. Graph X versus Y. What is the shape of this graph? You might want to run the simple linear regression Y=b0+b1X to test your assumption that iteration time is correlated with size of array.
emory
+3  A: 

Since this is a homework exercise, here are some hints about things you need to take into account when writing Java micro-benchmarks (like this):

1 - The Java compilers are smart. If they figure out that some part of your code does not contribute to the visible results of that code, they will optimize it away.

To prevent this from distorting your benchmark results, you typically need to ensure that your benchmarking loops affect the outcome of the program in some way. For example, have them do some cheap calculation that contributes to a result and print out that result at the end of the becnchmark.

2 - JVMs take a time to "warm up". When you start an application in a typical Sun JVM (and others), your code is initially interpreted by a bytecode interpreter. After a bit, the JVM decides it is worth JIT compiling your code to native code ... and one that has happened your code starts running a lot faster. (But during the JIT compilation, it may appear to run really slowly.)

To prevent this from distorting your benchmark results, you typically need to arrange that your loops (in fact your entire benchmark) are run multiple times. I typically, make the entire benchmark a method, and run it a number of times in a loop. I then look at the numbers to see the point at which they stabilize, and only pay attention to the numbers after that point.

3 - Benchmarks that involve repeated creation of objects / arrays have a tendency to trigger garbage collection at "random" points. If the GC runs (depending on the GC settings) in the middle of a benchmark loop, the time taken to run that loop may be inflated. Depending on what it is you are trying to measure, this may be give you anomalous results.

If you are trying to exclude garbage collection overheads, you could set the initial JVM heap size to a very large number so that the GC never has to run. Alternatively, you could manually remove the anomalous results by eye, or by correlating with GC runs reported in the GC log file.

OTOH, if you want to include GC overheads, use a typical heap size, run the loops a large number of time and calculate the average time per loop.

Stephen C