views:

260

answers:

9

While I like the intellectual challenge that comes with designing multicore systems, I realize that most of them were just unnecessary premature optimization.

But on the other hand, most systems have some performance requirement, and refactoring them later into thread-safe operations is hard, or even economically impossible, because it would amount to a complete rewrite with a different algorithm.

What is your way to keep a balance between optimization and getting things done?

+2  A: 

Introducing Threading does not automatically increase performance.

fuzzy lollipop
Indeed. If you look at the history of web servers, removing threading in favor of multiplexed I/O was one of the first great milestones in improving performance.
slebetman
Here's some proof as to what slebetman was referring to: http://www.kegel.com/c10k.html
Polaris878
slebetman, can you link to that? I'm curious now.
Paul Nathan
+3  A: 

If you are doing anything somewhat complex and multithreaded, you had better think about it and design it well beforehand. Otherwise, your program will either be a complete disaster, or it will work perfectly MOST of the time and do crazy things the rest of the time. It is hard to design something provably correct with multithreading, but it is extremely important. So no, I don't think good multithreaded design is premature optimization.

Polaris878
+1  A: 

Maybe the right approach is to design systems with certain characteristics so that, if you later want to introduce multithreading, you can do it gracefully.

I'm not sure exactly what those characteristics are, but one analogous example comes to mind: scaling. If you design around small operations that a stateless system can perform, you will be able to scale more naturally.

That kind of thing seems important to me.

If it's designing for multithreading... then an up-front approach is important.

If it's simply ensuring some characteristics that allow scaling or multithreading in the future, then it's not so important :)

EDIT: Oops, I read it again: premature optimization?

Optimization: I don't think it's good until you have the system working (and free of the vices that come from trying to optimize things too early). Do a clean design, something as flexible and simple as possible. Then you can optimize when you see what's really needed.

helios
+1  A: 

I believe threading also obeys the laws of optimization.
That is, don't waste your time making quick operations parallel.
Instead, apply threads to tasks that take a long time to execute.

Of course, if systems start having 1000+ cores, then this answer might become obsolete and need to be revised. But then again, if you're going to "get things done", then you'll definitely want to ship your product before then.
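The "apply threads to long tasks" advice can be sketched in Python; here `slow_io` is a hypothetical stand-in for a real blocking call such as a network request:

```python
import threading
import time

def slow_io(results, i):
    # Stand-in for a real blocking call (network request, disk read).
    time.sleep(0.1)
    results[i] = i * i  # each thread writes its own slot, so no locking needed

results = [None] * 4
start = time.monotonic()
threads = [threading.Thread(target=slow_io, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# The four 0.1 s waits overlap, so elapsed is ~0.1 s instead of ~0.4 s.
```

For a 1 ms operation, the thread start/join overhead would eat most of the gain, which is the point being made about "quick" operations.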

luiscubal
I agree with this... use threading for things like I/O or really heavy computation. You don't want to block the main thread while it waits on some heavy I/O.
Polaris878
-1: "if systems start having 1000+ cores"? What about systems that already have more than one core? "Don't waste your time making quick operations parallel": what does "quick" mean? Why is parallelism a waste of time? How else would you take advantage of today's processors, which come with 8 cores and more even in the consumer market?
Alex
The issue I'm mentioning here is the existence of bottlenecks. There are certainly parts of the code which take longer to execute than others. THOSE are the parts that really matter to optimize, and those are the parts that could probably use parallelism. If you could use parallelism to speed up a common yet long operation 10x, then go for it. By "waste of time", I mean a waste of development time. Also, last time I checked, most computers came with 2 cores; even 4 cores are relatively rare, and 8 cores are truly unusual.
luiscubal
"Quick" means that users don't even notice it. If an operation already looks instantaneous, why bother making it faster? And regarding your 8 cores, I would be interested in statistics on the current average number of cores per user, from a reliable source. I'm not saying parallelism is useless. I'm saying parallelism is useless *for short operations*. For long operations, it is useful. If an operation is *quick*, then making it parallel won't have any advantage. Not even 8-core users will see any difference.
luiscubal
:))) Help!!! Don't shoot!!! I think we're simply coming from two different backgrounds. I agree with you regarding a funky little end-user UI. I completely disagree when it comes to server applications, which are more my background. (Our minimum server spec is 8 cores, up to 16.)
Alex
+7  A: 

If you follow the Pipeline and Map-Reduce design patterns, that should be enough.
Decompose things so that they could run in an OS-level multi-processing pipeline.

You can then run in an actual pipeline. No additional work; the OS handles everything. Huge speedup opportunity.

You could also switch to threads. A little work. OS handles some parts of this, thread libraries handle the rest. However, since you were thinking "process" at design time, your threads don't have any confusing data sharing issues. Big win for a little thinking.

S.Lott
Exactly right - don't design "for multiple threads". Make a high-level abstraction for where data can be processed in parallel.
Mark Bessey
+2  A: 

They say that days of coding can save hours of design.

Not all problems nor frameworks are multi-threadable. The libraries you depend upon, for example, may not be thread-safe. A lot of processes are naturally sequential, and cannot be broken into parallelizable parts.

And going multi-threaded/multi-processed is just one way to parallelize. You can also use, for example, asynchronous IO.

In my experience, going asynchronous from a single thread is far saner than going multi-threaded. But then, the programs I write solve different problems from, well, pretty much everyone else's.
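The single-threaded asynchronous style looks roughly like this in modern Python; `fetch` is a hypothetical stand-in for a real network call:

```python
import asyncio

async def fetch(i):
    # Stand-in for real asynchronous IO; while this coroutine is
    # suspended, the event loop runs the others on the same thread.
    await asyncio.sleep(0.1)
    return i * 2

async def main():
    # All three "requests" overlap on one thread: ~0.1 s total, not ~0.3 s,
    # and no locks are needed because nothing runs concurrently in memory.
    return await asyncio.gather(fetch(1), fetch(2), fetch(3))

results = asyncio.run(main())
```

Since only one coroutine executes at a time, the usual multithreading hazards (races, deadlocks) largely disappear, which is the "far saner" part.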

Will
+1  A: 

I would never consider designing for multithreading in an application purely for speculative performance reasons. That's because, with a few techniques that are good for any application, it is easy to make an operation multi-threaded later. The techniques I'm thinking of are:

  • Hard const contracts
    • In C++, you can mark a method as const, meaning it doesn't change the value of any instance variables. You can also mark an input parameter to a method as const, meaning that only const methods may be called on that parameter. With these two techniques (and by not using "tricks" to get around these compiler enforcements), you can pare down the operations that need to be multi-threading aware.
  • Dependency Inversion
    • This is a general technique where any external objects needed by an object are passed to it at construction/initialization time or as part of the method signature for the particular method. With this technique, it is 100% clear which objects could possibly be changed by an operation (the non-const instance variables plus the non-const parameters to the operation.) Knowing that, you know the scope of the non-functional aspects of an operation and you can add mutexes, etc to objects that could be shared between parallel operations. You can then design your parallelism to be correct and efficient.
  • Favor functional over procedural
    • Ironically, this means: don't prematurely optimize. Make value objects immutable. For example, in C#, strings are immutable, meaning that any operations on them return new instances of string objects, not modified instances of the existing string. The only objects that shouldn't be immutable are unbounded arrays, or objects that contain unbounded arrays, if those arrays are likely to be modified often. I'd argue that immutable objects are easier to understand. Many programmers were taught procedural techniques, so this is somewhat foreign to us, but once you start thinking in immutable terms, horrible aspects of procedural programming such as order-of-operation dependence and side effects go away. Those aspects are even more horrible in multi-threaded programming, so using a functional style in class design helps in many ways. As machines grow faster, the higher cost of immutable objects becomes easier and easier to justify. Today, it's a balance.
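The immutable-value-object idea translates to many languages. A minimal Python sketch, assuming a hypothetical `Point` value type (the answer's own examples are C++/C#):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:
    # frozen=True makes instances immutable, like C#'s string:
    # "modifying" operations return a new instance instead of
    # mutating in place, so instances can be shared across threads.
    x: float
    y: float

    def moved(self, dx, dy):
        # Returns a new Point; self is untouched.
        return replace(self, x=self.x + dx, y=self.y + dy)

p = Point(1.0, 2.0)
q = p.moved(3.0, 0.0)
# p is unchanged; there is no shared mutable state to guard with a mutex.
```

Any attempt to assign `p.x = 5.0` raises `FrozenInstanceError`, which is the compiler-style enforcement the answer argues for.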
David Gladfelter
A: 

Threads exist to make the employment of multiple agents easier to program.

  • If the agents are users, like if you have a thread-per-user, they make it easier to write the program. This is not a performance issue, it is an ease-of-writing issue.

  • If the agents are I/O devices, they make it easy to write a program that does I/O in parallel. This may or may not be done for performance.

  • If the agents are CPU cores, they make it easy to write programs that get multiple cores cranking in parallel. That is when threads correlate with performance.

In other words, if you think threads==parallelism==performance, that's only hitting one of the uses of threads.

Mike Dunlavey
A: 

There are three basic design choices: sync, async, or sync + multi-threaded. Pick one, or more than one if you're an insane genius.

Yes, you need to understand your customers' acceptable performance expectations during the design phase of your application to be able to make the right choices up front. For any non-trivial project, it can be quite dangerous and time-consuming to treat high-level performance expectations as an afterthought.

If sync does not meet customer requirements:

CPU-limited systems require multi-threading or multi-processing.

IO-limited systems (the most common case) can often go either async or MT.

For IO-limited systems, leveraging technologies such as State Threads can let you have your cake and eat it too (a sync design with async execution).
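For the IO-limited case, a thread pool is often the lowest-effort MT choice. A sketch in Python, where the 50 ms sleep is a stand-in for real IO:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def io_task(i):
    # IO-limited work (simulated by a 50 ms wait): threads overlap the
    # waits, so a thread pool fits. A CPU-limited task would instead use
    # ProcessPoolExecutor so the work spreads across OS processes.
    time.sleep(0.05)
    return i * 10

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as ex:
    results = list(ex.map(io_task, range(4)))  # fan out, collect in order
elapsed = time.monotonic() - start
# Four 50 ms waits overlap: elapsed is ~0.05 s rather than ~0.2 s.
```

Swapping the executor class is the only change needed to move between the MT and multi-process columns of the decision above, which keeps the up-front commitment small.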

Einstein