265 views · 2 answers
When it comes to virtualization, I have been deliberating on the relationship between physical cores and virtual cores, especially in how it affects applications that employ parallelism. For example, in a VM scenario where there are fewer physical cores than virtual cores (if that's possible), what effect or limit does that place on the application's parallel processing? I'm asking because in my environment, the physical architecture is not disclosed. Is there still much advantage to parallelizing if the application lives on a dual-core VM hosted on a single-core physical machine?

+4  A: 

Is there still much advantage to parallelizing if the application lives on a dual core VM hosted on a single core physical machine?

Always.

OS-level parallel processing (e.g., Linux pipelines) will improve performance dramatically, irrespective of how many cores -- real or virtual -- you have.

Indeed, you have to create fairly contrived problems or really dumb solutions to not see performance improvements from simply breaking a big problem into lots of smaller problems along a pipeline.

Once you've got a pipelined solution, and it ties up 100% of your virtual resources, you have something you can measure.

Start trying different variations on logical and physical resources.

But only after you have an OS-level pipeline that uses up every available resource. Until then, you've got fundamental work to do just creating a pipeline solution.

S.Lott
There are actually some algorithms that are inherently single-threaded and cannot be sped up by breaking them into smaller chunks. In fact, in these cases, you'll often see a slowdown. Knowing when to attempt to parallelize your code and when not to is the key.
jer
@jer: While true, a practical application often involves combining the core algorithm with input, output, processing and other stuff that can be nicely pipelined. While *an* algorithm may not parallelize, an application almost always pipelines nicely with some measurable speedup.
S.Lott
Sure, all I was trying to say is throwing everything in a pipeline is not necessarily a wise idea. There's no sense in slowing down some parts of your program just because it all averages out. But I know what you are saying, and you are indeed mostly correct.
jer
@jer: What's wrong with "slowing down some parts of your program just because it all averages out"? If the average is an actual improvement, what's the problem?
S.Lott
The problem may be that that particular spot that slows down creates a pause or visual sluggishness -- something the user can see or feel. Again, I'm not advocating not doing it because of this, just to think before you start abusing parallelism. Use it where it makes sense only, and not everywhere.
jer
+1  A: 

Since you included the F# tag, and you're interested in parallel performance, I'll assume that you're using F# asynchronous IO, so threads never block; they just swap between CPU-bound tasks.

In this case it's ideal to have the same number of threads as the number of virtual cores (at least based on my experiments with F# in Ubuntu under VirtualBox hosted by Windows 7). Having more threads than that decreases performance slightly; having fewer decreases performance quite a bit.

Also, having more virtual cores than physical cores decreases performance a little. But if this is something you can't control, just make sure you have an active worker thread for each virtual core.
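One shell-level way to follow this advice without knowing the physical layout (a sketch, not F#-specific): `nproc` reports the number of cores the VM exposes (i.e., the virtual cores), and `xargs -P` caps the number of concurrent workers at that count.

```shell
# nproc sees only what the VM exposes, so this sizes the worker pool
# to the virtual cores, as recommended above.
# -P "$(nproc)" : at most one worker process per virtual core
# -I{}          : substitute each input line into the command
seq 1 8 | xargs -P "$(nproc)" -I{} sh -c 'echo "task {} done"'
```

All eight tasks complete, but the completion order varies from run to run because the workers genuinely execute in parallel.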

RD1