210 views · 10 answers

As multi-processor and multi-core computers become more and more ubiquitous, is simply firing off a new thread a (relatively) painless way of simplifying code? For instance, in a current personal project, I have a network server listening on a port. Since it's just a personal project, it's a desktop app with a GUI integrated into it for configuration. So the app reads something like this:

Main()
    Read configuration
    Start listener thread
    Run GUI

Listener Thread
    While the app is running
        Wait for a new connection
        Run a client thread for the new connection

Client Thread
    Write synchronously
    Read synchronously
    ad infinitum, or until they disconnect
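The pseudocode above maps fairly directly onto real code. A minimal sketch in Python (Python rather than the project's own language, and all names, the port, and the echo protocol are invented for illustration):

```python
import socket
import threading

def client_thread(conn):
    # Write synchronously, read synchronously, until the peer disconnects.
    with conn:
        conn.sendall(b"hello\n")
        while True:
            data = conn.recv(1024)
            if not data:           # peer closed the connection
                break
            conn.sendall(data)     # echo it back

def listener_thread(server):
    # While the app is running: wait for a connection, hand it to a thread.
    while True:
        conn, _addr = server.accept()
        threading.Thread(target=client_thread, args=(conn,), daemon=True).start()

def main():
    # "Read configuration" would go here.
    server = socket.create_server(("127.0.0.1", 0))  # port 0 = any free port
    threading.Thread(target=listener_thread, args=(server,), daemon=True).start()
    # "Run GUI" would go here; the listener keeps accepting in the background.
    return server
```

The GUI thread never blocks on the network, and each connection's code stays a straight-line read/write sequence.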

This approach means that while I have to worry about a lot of locking, with the potential issues that involves, I avoid a lot of spaghetti code from asynchronous calls, etc.

A slightly more insidious version of this came up today while I was working on the startup code. Startup was quick, but it lazy-loaded a lot of the configuration, which meant that while startup was quick, actually connecting to and using the service was painful because of the lag while it loaded different sections (this was actually measurable in real time, sometimes 3-10 seconds). So I moved to a different strategy: on startup, loop through everything and force the lazy loading to kick in... but that made startup prohibitively slow; get up, go get a coffee slow. The final solution: throw the loop into a separate thread, with feedback in the system tray while it's still loading.
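That "force the lazy loading on a background thread, with progress feedback" strategy can be sketched like this (a Python sketch; the section names, the load function, and the `progress` counter are all invented stand-ins):

```python
import threading

class Config:
    def __init__(self, sections):
        self._sections = sections   # names of lazily-loaded sections
        self._loaded = {}
        self._lock = threading.Lock()
        self.progress = 0           # sections loaded so far, for tray feedback

    def _load_section(self, name):
        # Stand-in for the expensive load (e.g. compiling a script).
        return name.upper()

    def get(self, name):
        # Lazy load on first access, exactly as before.
        with self._lock:
            if name not in self._loaded:
                self._loaded[name] = self._load_section(name)
            return self._loaded[name]

    def preload_in_background(self):
        # Loop through everything on a worker thread, so startup stays
        # fast but the first real request doesn't pay the multi-second lag.
        def worker():
            for name in self._sections:
                self.get(name)
                self.progress += 1  # the GUI/tray can poll this
        t = threading.Thread(target=worker, daemon=True)
        t.start()
        return t
```

Any section requested before the preloader reaches it still loads lazily on demand; the worker just warms the cache.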

Is this "Meh, throw it in another thread, it'll be fine" attitude ok? At what point do you start getting diminishing returns and/or even reduced performance?

+18  A: 

Multithreading does a lot of things, but I don't think "simplification" is ever one of them.

mgroves
If you have to do two or more things concurrently, the only alternative would be to entangle them in some way in sequential code. So in that sense multithreading simplifies the code, though it still ain't simple.
starblue
Which is my argument for threads + synchronous calls for, e.g., implementing a synchronous network protocol. Yes, locks and synchronization aren't simple topics... I think the word I was looking for when I wrote this question was probably "consolidating" or something similar.
Matthew Scharley
+7  A: 

It's a great way to introduce bugs into code.

Using multiple threads properly is not easy. It should not be attempted by new developers.
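One of the classic bugs this answer alludes to is deadlock, and the standard defence is a fixed global lock-acquisition order. A minimal Python sketch (the locks and workers are invented for illustration):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def worker_one():
    # Both workers take the locks in the same global order: a, then b.
    with lock_a:
        with lock_b:
            return "done"

def worker_two():
    # If this worker took b first and a second, the two threads could
    # deadlock: each holding one lock, waiting forever for the other.
    with lock_a:
        with lock_b:
            return "done"
```

The subtlety, and the reason this trips up new developers, is that the broken version usually works in testing; the interleaving that deadlocks may only show up under load.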

John Saunders
Why the downvote? Do I need to explain synchronization, or deadlock?
John Saunders
For the record, it wasn't me. And personally, I don't consider myself a new developer, though that could possibly be personal pride jumping in. I do understand deadlocks and synchronization, if not always the tools used to achieve/avoid them.
Matthew Scharley
I wasn't talking about you as a new developer. Your question seemed to be about making the "add new thread" process ubiquitous, which means it would also be used by new developers.
John Saunders
+4  A: 

In my opinion, multi-threaded programming is pretty high up on the difficulty (and complexity) scale, along with memory management. To me, the "Meh, throw it in another thread, it'll be fine" attitude is a bit too casual. Think long and hard you must, before forking threads you do.

Alan
+2  A: 

This gives you the extra job of debugging race conditions and handling locks and synchronisation issues.

I would not use this unless there was a real need.
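The race condition being warned about here is easy to demonstrate: an unprotected `counter += 1` is a read-modify-write, so two threads can interleave and lose updates. A small Python sketch of the locked version (names are mine):

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(n):
    # Without counter_lock, two threads can both read the same old value,
    # both add one, and both write it back -- silently losing an update.
    global counter
    for _ in range(n):
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, the total is deterministic: 4 * 100_000.
```

Remove the `with counter_lock:` line and the final total becomes nondeterministic, which is exactly the kind of bug that only shows up intermittently.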

Shiraz Bhaiji
Lock more than necessary, then whittle the locks away where they cause performance issues, after taking a good long look at whether you can?
Matthew Scharley
A: 

I think you have no choice but to deal with threads, especially with networking and concurrent connections. Do threads make code simpler? I don't think so. But without them, how would you program a server that can handle more than one client at the same time?

Peter D
Yes, but you don't add them in order to make code simpler. You add them when you _need_ them, and then you do so carefully.
John Saunders
Also, using purely asynchronous calls you CAN achieve this, at least as far as your own code is concerned. Of course, these calls still use separate threads behind the scenes...
Matthew Scharley
And you still have synchronization issues. Also, you'd be surprised but a lot of people don't really get async code.
John Saunders
Heck, the way event handlers get called in C# once you throw threads into the mix still makes my mind go numb sometimes... and you are right, async calls are a great way to trip up and think 'yay, I'm not using threads, I don't need to worry'. Or even be completely oblivious to the need.
Matthew Scharley
+4  A: 

No.

Plainly and simply, multithreading increases complexity and is a nearly trivial way to add bugs to code. There are concurrency issues such as synchronization, deadlock, race conditions, and priority inversion to name a few.

Secondly, the performance gains are not automatic. Recently, there was an excellent article in MSDN Magazine along these lines. The salient details are that a certain operation was taking 46 seconds per ten iterations coded as a single-threaded operation. The author parallelized the operation naively (one thread per four cores) and the operation dropped to 30 seconds per ten iterations. Sounds great until you take into consideration that the operation now eats 300% more processing power but only experienced a 34% gain in efficiency. It's not worth consuming all available processing power for a gain like that.
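The arithmetic behind those figures, spelled out as a quick back-of-the-envelope calculation (the numbers are from the MSDN example the answer cites; the variable names are mine):

```python
# 46 s per ten iterations single-threaded, 30 s on 4 cores.
serial_time = 46.0
parallel_time = 30.0
cores = 4

speedup = serial_time / parallel_time         # ~1.53x faster overall
time_saved = 1 - parallel_time / serial_time  # ~35% less wall time
                                              # (the answer rounds this to 34%)
efficiency = speedup / cores                  # ~38%: fraction of the 4 cores'
                                              # combined power actually used
```

So the job finishes about a third faster while burning four times the CPU, which is the trade-off the answer is objecting to.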

Jason
I'd give you another +1 if I could for the link on priority inversion... never heard of that particular one before.
Matthew Scharley
+2  A: 

Read up on Amdahl's law, best summarized by "The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program."

As it turns out, if only a small part of your app can run in parallel you won't get much gains, but potentially many hard-to-debug bugs.
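Amdahl's law is short enough to write down directly; a tiny Python helper (function and parameter names are mine):

```python
def amdahl_speedup(parallel_fraction, processors):
    # Amdahl's law: the serial fraction is a hard ceiling on speedup,
    # no matter how many processors you throw at the parallel part.
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processors)

# A program that is only 50% parallelizable gets 1.6x from 4 cores,
# and can never exceed 2x even with infinitely many cores.
```

That ceiling is why "throw it in another thread" often buys less than it seems to.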

Giovanni Galbo
More interesting info by following a few links: http://www.cilk.com/multicore-blog/bid/5365/What-the-is-Parallelism-Anyhow
Matthew Scharley
+1  A: 

I don't mean to be flip but what's in that configuration file that it takes so long to load? That's the origin of your problem, right?

Before spawning another thread to handle it, perhaps it can be pared down? Reduced, perhaps put in another data format that would be quicker to load, etc.?

How often does it change? Is it something you can parse once at the beginning of the day and put the variables in shared memory so subsequent runs of your main program can just attach and get the needed values from there?

Duck
Compiling a bunch of short C# scripts. Yes, it's possible to cache them between runs, and in the long run I probably will, but for the sake of argument (and my own brain) I'm not at the moment. Currently the scripts don't have any guaranteed unique data attached that would connect them to the rest of the data associated with them. As for how often they change: usually at least once per run.
Matthew Scharley
Two thoughts then. (1) That is probably worth the added complexity of additional threads. (2) You might as well start on some caching scheme now. While there might be some overlap between the short and long term solutions you will probably end up doing most of the work twice.
Duck
For what it's worth, I ended up managing to ThreadPool each of the compilations so now the whole configuration is loading in seconds instead of over a minute. Caching will probably still be on the agenda at some point... but it's no longer a 'need to do' issue by any stretch.
Matthew Scharley
Sounds good. Glad to hear things worked out.
Duck
+1  A: 

While I agree with everyone else here in saying that multithreading does not simplify code, it can be used to greatly simplify the user experience of your application.

Consider an application that has a lot of interactive widgets (I am currently developing one where this helps). In the workflow of my application, a user can "build" the current project they are working on. This requires disabling the interactive widgets my application presents to the user and presenting a dialog with an indeterminate progress bar and a friendly "please wait" message.

The "build" occurs on a background thread; if it were to happen on the UI thread it would make the user experience less enjoyable - after all, it's no fun not being able to tell whether or not you are able to click on a widget in an application while a background task is running (cough, Visual Studio). Not to say that VS doesn't use background threads, I'm just saying their user experience could use some improvement. But I digress.

The one thing I take issue with in the title of your post is the idea of firing off a new thread whenever you need to perform a task. I generally prefer to reuse threads: in .NET, I favor using the system thread pool over creating a new thread each time I want to do something, for the sake of performance.
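The same reuse pattern exists outside .NET; here is the idea sketched with Python's `concurrent.futures` (standing in for `ThreadPool.QueueUserWorkItem`, with an invented task):

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # Stand-in for a real background task.
    return n * n

# One pool, created once; its worker threads are reused across tasks
# instead of paying thread-creation overhead per task.
pool = ThreadPoolExecutor(max_workers=4)
futures = [pool.submit(work, n) for n in range(10)]
results = [f.result() for f in futures]
pool.shutdown()
```

Ten tasks, at most four threads, and no per-task thread creation, which is the performance argument being made here.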

unforgiven3
Using the thread pool, or otherwise, you're still creating a new thread, so I don't really see the difference, other than that .NET is managing the creation and running of it, when and how it sees fit.
Matthew Scharley
No - per MSDN: "There is only one ThreadPool object per process. The thread pool is created the first time you call ThreadPool.QueueUserWorkItem, or when a timer or registered wait operation queues a callback method" - there is only a one-time cost in creating the thread pool. After that, the pool of threads is warmed up and lives until your process ends - you simply submit asynchronous jobs to it and it takes care of the rest, as opposed to you creating a thread (lots of overhead) each time you want to do something.
unforgiven3
I think he meant the creation and running of the thread, not of the pool.
John Saunders
+1  A: 

I'm going to provide some balance against the unanimous "no".

DISCLAIMER: Yes, threads are complicated and can cause a whole bunch of problems. Everyone else has pointed this out.

From experience, a sequence of blocking reads/writes to a socket (which requires a separate thread) is much simpler than non-blocking ones. With blocking calls, you can tell the state of the connection just by looking at where you are in the function. With non-blocking calls, you need a bunch of variables to record the state of the connection, and to check and modify them every time you interact with the connection. With blocking calls, you can just say "read the next X bytes" or "read until you find X" and it will actually do it (or fail). With non-blocking calls, you have to deal with fragmented data, which usually requires keeping temporary buffers and filling them as necessary. You also end up checking whether you've received enough data every time you receive a little more. Plus you have to keep a list of open connections and handle unexpected closes for all of them.

It doesn't get much simpler than this:

void WorkerThreadMain(Connection connection) {
    Request request = ReadRequest(connection);
    if(!request) return;
    Reply reply = ProcessRequest(request);
    if(!connection.isOpen) return;
    SendReply(reply, connection);
    connection.close();
}

I'd like to note that this "listener spawns off a worker thread per connection" pattern is how web servers are designed, and I assume it's how a lot of request/response sorts of server applications are designed.

So in conclusion, I have experienced the asynchronous socket spaghetti code you mentioned, and spawning off worker threads for every connection ended up being a good solution. Having said all this, throwing threads at a problem should usually be your last resort.

Tom Dalling