tags:

views:

118

answers:

4
A: 

There are at least two possible error sources. Can you try to differentiate between 'tic/toc' and 'fprintf' by just looking at the computed values without formatting them.

I don't understand the braces around 'toc' but they shouldn't do any harm.

zellus
Yes, the averageX really does get the large value. So, the problem is not in the fprintf functions. Braces are just remains of something I've tried myself, after which I forgot to delete them. They do no harm.
Rook
What happens if you increase the number of iterations (maxcount). What platform are you working on?
zellus
@Rook: You may also have a look at the following post. Then end of the post as well as the comments are interesting. http://undocumentedmatlab.com/blog/tic-toc-undocumented-option/
zellus
xp sp2. Interesting, increasing maxcount by a factor of 100 (all three of them) avoids this behaviour. Is it possible that maybe a second program was starting execution before the first one finished, and that maybe MATLAB didn't have enough time to "clean out" the old values? Just guessing here.
Rook
@Rook: Could you try a 'pause' in front of the tic, in order to prove your assumption? With the 'original' values for 'maxcount'.
zellus
A: 

That is indeed crazy. Don't know what could cause it, and was unable to reproduce on my own Matlab R2010a copy over several runs, invoked by name or via F5.

Here's an idea for debugging it.

When using tic/toc inside a script or function, use the "tstart = tic" form that captures the output. This makes it safe to use nested tic/toc calls (e.g. inside called functions), and lets you hold on to multiple start and elapsed times and examine them programmatically.

t0 = tic;
% ... do some work ...
te = toc(t0); % "te" for "time elapsed"

You can use different "t0_label" suffixes for each of the tic and toc returns, or store them in a vector, so you preserve them until the end of your script.

t0_uninit = tic;
% ... do the uninitialized-array test ...
te_uninit = toc(t0_uninit);

t0_prealloc = tic;
% ... test the preallocated array ...
te_prealloc = toc(t0_prealloc);

Have the script break in to the debugger when it finds one of the large values.

if any([te_uninit te_prealloc te_vector] > 5)
    keyboard
end

Then you can examine the workspace and the return values from tic, which might provide some clues.


EDIT: You could also try testing tic() on its own to see if there's something odd with your system clock, or whatever tic/toc is calling. tic()'s return value looks like a native timestamp of some sort. Try calling it many times in a row and comparing the subsequent values. If it ever goes backwards, that would be surprising.

function test_tic

t0 = tic;
for i = 1:1000000
    t1 = tic;
    if t1 <= t0
        fprintf('tic went backwards: %s to %s\n', num2str(t0), num2str(t1));
    end
    t0 = t1;
end

On Matlab R2010b (prerelease), which has int64 math, you can reproduce a similar ridiculous toc result by jiggering the reference tic value to be "in the future". Looks like an int rollover effect, as suggested by gary comtois.

>> t0 = tic; toc(t0+999999)
Elapsed time is 6148914691.236258 seconds.

This suggests that if there were some jitter in the timer that toc were using, you might get rollover if it occurs while you're timing very short operations. (I assume toc() internally does something like tic() to get a value to compare the input to.) Increasing the number of iterations could make the effect go away because a small amount of clock jitter would be less significant as part of longer tic/toc periods. Would also explain why you don't see this in your non-preallocated test, which takes longer.


UPDATE: I was able to reproduce this behavior. I was working on some unrelated code and found that on one particular desktop with a CPU model we haven't used before, a Core 2 Q8400 2.66GHz quad core, tic was giving inaccurate results. Looks like a system-dependent bug in tic/toc.

On this particular machine, tic/toc will regularly report bizarrely high values like yours.

>> for i = 1:50000; t0 = tic; te = toc(t0); if te > 1; fprintf('elapsed: %.9f\n', te); end; end
elapsed: 6934787980.471930500
elapsed: 6934787980.471931500
elapsed: 6934787980.471899000
>> for i = 1:50000; t0 = tic; te = toc(t0); if te > 1; fprintf('elapsed: %.9f\n', te); end; end
>> for i = 1:50000; t0 = tic; te = toc(t0); if te > 1; fprintf('elapsed: %.9f\n', te); end; end
elapsed: 6934787980.471928600
elapsed: 6934787980.471913300
>> 

It goes past that. On this machine, tic/toc will regularly under-report elapsed time for operations, especially for low CPU usage tasks.

>> t0 = tic; c0 = clock; pause(4); toc(t0); fprintf('Wall    time is %.6f seconds.\n', etime(clock, c0));
Elapsed time is 0.183467 seconds.
Wall    time is 4.000000 seconds.

So it looks like this is a bug in tic/toc that is related to particular CPU models (or something else specific to the system configuration). I've reported the bug to MathWorks.

This means that tic/toc is probably giving you inaccurate results even when it doesn't produce those insanely large numbers. As a workaround, on this machine, use etime() instead, and time only longer chunks of work to compensate for etime's lower resolution. You could wrap it in your own tick/tock functions that use the for i=1:50000 test to detect when tic is broken on the current machine, use tic/toc normally, and have them warn and fall back to using etime() on broken-tic systems.

Andrew Janke
Although still not sure as to the cause, it seems that increasing ... uhmm; if you could read my comment to zellus as not to repeat it here. I'm off to try this as well now.
Rook
+1  A: 

Here's my theory about what might be happening, based on these two pieces of data I found:

  • There is a function maxNumCompThreads which controls the maximum number of computational threads used by MATLAB to perform tasks. Quoting the documentation:

    By default, MATLAB makes use of the multithreading capabilities of the computer on which it is running.

    Which leads me to think that perhaps multiple copies of your script are running at the same time.

  • This newsgroup thread discusses a bug in an older version of MATLAB (R14) "in the way that MATLAB accelerates M-code with global structure variables", which it appears the TIC/TOC functions may use. The solution there was to disable the accelerator using the undocumented FEATURE function:

    feature accel off
    

Putting these two things together, I'm wondering if the multiple versions of your script that are running in the workspace may be simultaneously resetting global variables used by the TIC/TOC functions and screwing one another up. Maybe this isn't a problem when converting your script to a function as Amro did since this would separate the workspaces that the two programs are running in (i.e. they wouldn't both be running in the main workspace).

This could also explain the exceedingly large numbers you get. As gary and Andrew have pointed out, these numbers appear to be due to an integer roll-over effect (i.e. an integer overflow) whereby the starting time (from TIC) is larger than the ending time (from TOC). This would result in a huge number that is still positive because TIC/TOC are internally using unsigned 64-bit integers as time measures. Consider the following possible scenario with two scripts running at the same time on different threads:

  1. The first thread calls TIC, initializing a global variable to a starting time measure (i.e. the current time).
  2. The first thread then calls TOC, and the immediate action the TOC function is likely to make is to get the current time measure.
  3. The second thread calls TIC, resetting the global starting time measure to the current time, which is later than the time just measured by the TOC function for the first thread.
  4. The TOC function for the first thread accesses the global starting time measure to get the difference between it and the measure it previously took. This difference would result in a negative number, except that the time measures are unsigned integers. This results in integer overflow, giving a huge positive number for the time difference.

So, how might you avoid this problem? Changing your scripts to functions like Amro did is probably the best choice, as that seems to circumvent the problem and keeps the workspace from becoming cluttered. An alternative work-around you could try is to set the maximum number of computational threads to one:

maxNumCompThreads(1);

This should keep multiple copies of your script from running at the same time in the main workspace.

gnovice
I considered the effect of script vs function, and also the effect of JIT when running things live on command line vs running a saved m-file. Still I was unable to reproduce the problem (I'm running the same MATLAB version and operating system as the OP).. This could be a race condition bug indeed
Amro
@Amro: Unfortunately, I don't have the same MATLAB version as the OP, so I can't try to reproduce it. I *can* say that I haven't been able to reproduce it in R2009a. Without R2010a, the above is my best guess. ;)
gnovice
A: 

Here is a hypothesis which is testable. Matlab's tic()/toc() have to be using some high-resolution timer. On Windows, because their return value looks like clock cycles, I think they're using the Win32 QueryPerformanceCounter() call, or maybe something else hitting the CPU's RDTSC time stamp counter. These apparently have glitches on some multiprocessor systems, mentioned in the linked articles. Perhaps your machine is one of those, getting different results if the Matlab process is moved from core to core by the process scheduler.

http://msdn.microsoft.com/en-us/library/ms644904(VS.85).aspx

http://www.virtualdub.org/blog/pivot/entry.php?id=106

This would be hardware and system configuration dependent, which would explain why other posters haven't been able to reproduce it.

Try using Windows Task Manager to set the affinity on your Matlab.exe process to a single CPU. (On the Processes tab, right-click MATLAB.exe, "Set affinity...", un-check all but CPU 0.) If the crazy timing goes away while affinity is set, looks like you found the cause.

Regardless, the workaround looks like to just increase maxcount so you're timing longer pieces of work, and the noise you're apparently getting in tic()/toc() is small compared to the measured value. (You don't want to have to muck around with CPU affinity; Matlab is supposed to be easy to run.) If there's a problem in there that's causing int overflow, the other small positive numbers are a bit suspect too. Besides, hi-res timing in a high level language like Matlab is a bit problematic. Timing workloads down to a couple hundred microseconds subjects them to noise from other transient conditions in your machine's state.

Andrew Janke

related questions