When looping through a very large language (say in C# 1.0), which is, for example, 5,000,000 loop iterations, what problems may this cause (performance/GC etc). This is only for the loop body.
The GC operates outside of your application, when you hit a threshold it'll automatically clean up what it can. Anything defined inside the scope of the loop body and not maintained at a lower scope will be marked for finalizing/collection whenever the GC feels like it's appropriate.
Other than that, a large loop (5 million) has not Real performance issues (at least not without more information). It'll just iterate and do what it does.
As everyone is saying, what happens in the body of the loop is the important part (and if that's your question, please post code/pseduo code).
I've written applications (mostly test drivers) which allocate, use, and dispose objects within loops that iterate millions of times. I believe the largest loop I've used was at least 15 million cycles.
The end state of the machine was the same as the start state. "Good performance" is subjective, but I was pleased with the results.
FWIW, there are processes that loop indefinitely within Windows, demonstrating that the number of iterations can be considered irrelevant in many cases. It's what happens inside the loop that counts, but modern memory allocation/disposition routines (whether manual or GC-automated) are very capable of keeping up with a sensible loop.
Of course, you could write a busy loop to purposely detract from the performance of the system, such as opening millions of handles to unmanaged objects and never disposing of those handles in .Net.
EDIT: after seeing Jon's IL/assembly, it appears that the IL compiler does not optimize away a loop, even if it's empty, so I have removed that theory.
In a large loop, try to avoid allocating more objects than you need, or the garbage collector will jump in repeatedly and pause the loop. Normally the garbage collector can jump in and work when your application is waiting for something else, so it's not even noticable, but in a long loop there is no idle time that the garbage collector can take advantage of.
If you are accessing a large array in the loop, try to access it in a linear fashion. That way the data that you work with is in the memory cache most of the time. If you for example loop over a two dimensional array and have the first index as inner loop, you will be jumping back and forth in the array, causing a lot more cache misses.
And of course, using +=
to create a string in a loop performs very badly, as you will be shuffling larger and larger strings each iterations. For example creating a 10000 character string by adding one character at a time takes 40 ms on my computer, but creating a 100000 character string doesn't take 400 ms, but 8000 ms.
In addition to the other answers posted, you should avoid putting a long-running loop in the UI thread / main thread of a GUI app. This doesn't matter much if you're writing a console app, but it does if you're writing a WinForms or WPF app.
Windows apps' UI responsiveness is driven by message processing, and if you get into a long-running for loop without "coming up for air" to process window messages every now and then, your UI will freeze and your user will think your app has crashed.
You can add message processing to the body of your long-running loop (Application.ProcessMessages in Delphi/C++ VCL) but that leads to new problems with reentrancy - user selecting a menu item while in your for loop leading to recursion.
Your best bet for a GUI app is to put long-running loops like this in a background thread so that your UI will remain responsive.
The following method does nothing 5,000,000 times:
public static void TestTest()
{
for(int i = 0; i != 5000000; ++i);
}
Compiling produces the following IL:
.method public hidebysig static void TestTest() cil managed
{
.maxstack 2
.locals init (
[0] int32 num)
L_0000: ldc.i4.0
L_0001: stloc.0
L_0002: br.s L_0008
L_0004: ldloc.0
L_0005: ldc.i4.1
L_0006: add
L_0007: stloc.0
L_0008: ldloc.0
L_0009: ldc.i4 0x4c4b40
L_000e: bne.un.s L_0004
L_0010: ret
}
Running on a Pentium, I find it's JITted to:
00000000 push ebp
00000001 mov ebp,esp
00000003 push eax
00000004 cmp dword ptr ds:[002030E4h],0
0000000b je 00000012
0000000d call 6CE7A839
00000012 xor edx,edx
00000014 mov dword ptr [ebp-4],edx
00000017 xor edx,edx
00000019 mov dword ptr [ebp-4],edx
0000001c nop
0000001d jmp 00000022
0000001f inc dword ptr [ebp-4]
00000022 cmp dword ptr [ebp-4],4C4B40h
00000029 jne 0000001F
0000002b nop
0000002c mov esp,ebp
0000002e pop ebp
0000002f ret
Mostly this just spends its time incrementing the value, comparing it with 5000000 and then jumping a few bytes back if it isn't equal. Most likely all of this will be in the L1 cache. It could perhaps be a bit more efficient than this if hand-coded assembly, though of course the real way that it should be optimised is to ignore the whole thing, but barring that this is pretty much as good as you could expect with any language.
All in all, it's neither the size of the language, nor the loop itself that matter.