Why is it that threads were never included as part of the C++ standard originally? Did they not exist when C++ standard was first created?
Because the authors didn't want to force a particular behaviour on implementers.
Because they are OS dependent. I.E. unix/linux/macosx use pthread
API, Windows use its own API, and so on and so forth...
They could have been included into libstdc++
, but I guess it is not easy to include all current and future thread features on a common API. The same way, DB access is not included in libstdc++
either.
Threads are an OS thing, so they aren't really a function of the programming language as much as they are the libraries provided by the system. POSIX threads for example, can be used in C++ but aren't really "part" of it.
The current Standard is from 1998. There were different thread implementations, and there wasn't the body of experience with using threads that there is twelve years later. If C++ had had a standardized thread library, it would likely have worked poorly with some common thread implementations, and might well have been difficult to adapt in the future.
It's twelve years later now, and we know a whole lot more about how threads are used, and the more widespread use created more interest in standardizing them, so the upcoming C++ standard (which I hope will be official in 2011) will have a section on threads in the library.
Threads certainly did exist when C++ was being standardised during the 1990s. However, Windows and Posix have very different threading APIs. Designing a library of the quality you'd want from a standard language library, giving all the threading primitives you need and mapping well onto both popular APIs (and others), required a large effort. Including it in the initial standard would have required either delaying standardisation, possibly for years, or including a specification that may well have had significant shortcomings.
That work has been done over the last decade or so (initially as the Boost.Thread library), and will be included as the standard thread support library in the next version of the standard, in addition to language-level features such as thread-local storage.
I think the main reasons are
- specifying threading behavior into the language needs a lot of work and understanding, which nobody had available back then
- nobody had a great idea about a good threading API, and there was no one existing library that seemed good enough to be used as a base for further work
- the standardization committee was swamped with enough of other work (like incorporating the STL into the standard library)
- that standard was late as it was; it took more than ten years for the first version to emerge, with quite a lot delay due to "last-minute changes" (
std::auto_ptr
, STL), and I think the main feeling was to better have something out sooner than to keep waiting for an infinitively delayed perfect standard; I think back then most people didn't think it would take so long for the next version to get finalized
After the standard was ratified, boost was founded by members of the library working group as a testbed for libraries which were desirable to have in the std lib but for which there wasn't enough time to make it for the final version. There, much of the work needed for adding threading support to C++ (namely inventing a good threading library) was done.
There is a lot of work involved in creating a thread class and C++0x has largely addressed this by adding the thread, mutex and atomic libraries but it took a lot of work from a lot of folks.
Orginazationally, remember that C++ is a very large language and changes happen quite slowly to it because of the complexity of the language and the amount of code and industry that rely on it; it takes a long time to get ratify changes into the standard because of this.
Also threading and synchronization has typically been an OS provided functionality so any additions needed to be compatible with the common implementations and possible without massive changes to the platforms (or noone would be able to implement the standard).
Technically, it isn't sufficient to just add a thread API, C++ was also missing a cohesive memory model, i.e. how do variables interact across thread and how do we allow for the wide range of memory models to be expressed in code succinctly (and performantly). Most of us are fortunate enough to work on primarily single-threaded x86 based software which has a very forgiving memory model, but there is other hardware out there that is not as forgiving from a memory model perspective and where performance penalties can be quite harsh.
The library addresses the memory model issue by providing atomic variables with forgiving defaults and explicit control.
The library provides another key piece of functionality for portable threading by providing synchronization classes.
Finally was added and if you haven't read the history on the working group site, it's interesting, but simply replacing CreateThread, QueueUserWorkItem or a pthread invocation was a thread object isn't quite enough. Thread lifetime, state management and thread local storage all had to be thought through.
All of this took a long time to get right and as others have mentioned most of it was in boost for quite awhile to ensure that major issues were worked through and it was cohesive before making it into the new standard.
One of the challenges that was faced by the standard committee 12 years ago is the differences in the parallel hardware. A key distinguishing feature between various high performance computing architectures is the way the permit parallel computation, and only one of those many models is really similar to the posix threads abstraction, SMP.
If you compare this situation to the one faced by the Fortran language, which targets this sort of hardware specifically, you see that each vendor supplies a Fortran compiler that they have extended to support the special parallel computing features the hardware provides.
This can be seen as relevant in todays terms by comparing typical x86/x64 white box compute nodes, which usually have up to 4 sockets, which translates into 24 cores with a single shared memory to a Gaming PC with multiple nVidia or AMD GPU's. Both are capable of startling compute throughput, but to get the most out of either requires a very different programming style. In fact, different workloads will work significantly better on one architecture than the other, depending on the exact nature of the specific algorithm.
That being said, it was probably impossible, 12 years ago, for the standard committee to identify how it should address each of these issues. Now, however, the answer is much simpler. C++ is not aimed at anything besides SMP hardware; those are served by other languages and tool-chains such as OpenCL or VHDL.