I like operating systems and would eventually like to become an OS developer, mostly working on kernels. Will C still be the language of choice in the future, and what else should I be trying to learn?

+2  A: 

C is pretty much it, with a fair amount of assembler. Important topics for OS kernel work include:

  • Principles of caching and cache management
  • Virtual memory, TLB management
  • CPU and system architecture
  • Storage hierarchies
  • Concurrent programming techniques (mutual exclusion, locking, ...)
  • Algorithms and data structures
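
As a rough illustration of the mutual exclusion item above, here is a minimal spinlock sketch using C11 atomics. The names `toy_spinlock`, `toy_spin_lock` and friends are made up for this example and are not taken from any real kernel; production kernels add back-off, fairness and interrupt handling on top of this idea.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical toy spinlock: one atomic flag, set while the lock is held. */
struct toy_spinlock {
    atomic_flag locked;
};

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns 1 if the lock was acquired, 0 if it was already held. */
int toy_spin_trylock(struct toy_spinlock *l)
{
    assert(l != NULL);
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void toy_spin_lock(struct toy_spinlock *l)
{
    while (!toy_spin_trylock(l))
        ; /* spin */
}

/* Release the lock; earlier writes become visible to the next holder. */
void toy_spin_unlock(struct toy_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A critical section is then wrapped as `toy_spin_lock(&l); /* touch shared data */ toy_spin_unlock(&l);`. The acquire/release ordering is what keeps the protected data consistent between CPUs.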
Lance Richardson
Well, I know that much; I am just wondering about the future of OS's in general. Ya know, like cloud OS's and stuff: is that still just going to be C?
Recursion
@unknown google, I guess Lance is talking about the kernel and not the OS. The kernel will be written mostly in C, even for your so-called Cloud.
Alphaneo
Don't worry. Cloud OS's are still written in C (at least the low levels). The C programmers just aren't allowed to speak at the press conferences.
Matthew Flaschen
A: 

Most definitely! You should also learn at least one assembly language/hardware architecture.

Nikolai N Fetissov
I agree that assembly knowledge is critical, but not for direct use. It's required to understand what your kernel is managing, but it doesn't have to be written directly. In managed OSes, the only part of the kernel that directly touches assembly (well, machine code) is the compiler (which guarantees memory safety, amongst other things), for instance.
Cody Brocious
@Cody, so you would write the boot-loader in managed code, is that it?
Alphaneo
You most certainly could, but there's simply no point -- if your bootloader is bigger than a few thousand lines of code, you've got a problem. The goal of the bootloader is to do the least number of things before passing control to the kernel, so doing it in anything but assembly is asinine -- this is not at /all/ indicative of the kernel at large.
Cody Brocious
The OS is written in C for portability. Assembler is still used for atomic operations, traps/syscalls, bootstrap, etc. - anything that requires platform-specific instructions. And compilers are NOT part of any OS I know. These are userland programs. They do not guarantee memory safety - the kernel does. You do not have to know assembler to understand an OS, but it most certainly helps.
Nikolai N Fetissov
@nickf3, Look at any managed OS on Chris's list and you'll see what I'm talking about. The compiler is a part of the OS by necessity, and rather than relying on antiquated process separation mechanisms, the compiler guarantees that the code is safe.
Cody Brocious
I hate to get into a semantic argument, but the compiler is still not part of a running kernel, unless you mean something like a JIT :) The compiler builds you a kernel for a particular instruction set (abstract in these cases). And I agree, there's a ton of room for research in operating systems, but one still needs to understand C and asm (and compilers, and linkers, and caches, and locks, and interrupts, etc., etc.) to get anywhere in that research.
Nikolai N Fetissov
@nickf3: Yes, Cody is referring to the dynamic compiler (JIT is such an old-fashioned term, in my opinion). Static compilation is just so old-school. :-)
Chris Jester-Young
Pardon my ignorance, but how many of these new-school OSs do you have running in production? :)
Nikolai N Fetissov
+1  A: 

Microsoft is in the process of rewriting some of Windows in .NET; however, I doubt that much of the kernel will be touched.

However there are projects like Cosmos ( http://www.gocosmos.org/index.en.aspx ) which give us hope.

Jonathan Parker
what kind of hope?
Alphaneo
Hope of breaking out of the C/assembly hegemony, of course. :-)
Chris Jester-Young
The first part of the first sentence above isn't true. There has been no rewrite of parts of Windows in .NET.
Foredecker
Hope that we can move past old-style OSes where process separation is the norm (resulting in security risks and extreme performance penalties) and the security of the OS relies on millions of lines of code.
Cody Brocious
@Cody, are those security risks due to the language?
Alphaneo
Actually, yes. Entirely so. When you move to managed code, the only way one bit of code can access another is via the objects it's given. In unmanaged code, the only protection is via process separation, which we've seen time and again to fail, primarily at syscall time where the kernel has to juggle address spaces.
Cody Brocious
@Cody, most OS-kernel developers do not develop the Linux or Windows kernel, but some tiny embedded kernel. And I assumed we were talking about the latter.
Alphaneo
@Alphaneo, Every kernel faces the same underlying problems, really. Whether you're hacking a random custom embedded OS or BSD, you have the same tradeoffs with respect to the flexibility of the language.
Cody Brocious
+5  A: 

Cody didn't want to be bothered answering this, so I'm passing this on on his behalf. :-P For some examples of OSs written in managed languages, as opposed to C or assembly, look at:

Of course, Cody also didn't want to mention this:

Chris Jester-Young
Doesn't address the question. Those are OSes, not programming languages.
T.E.D.
Except, they demonstrate that (by virtue of all of them being in a managed language) it's not necessary to write an OS in C or assembly, and that it can be written (in some cases including the initial booting stage) entirely in a managed language.
Chris Jester-Young
SharpOS and Cosmos are written in C#. Renraku is, last I heard, written in Boo. The other ones, you'll have to find out yourself. :-P
Chris Jester-Young
OK, I see what you are getting at now. Move those comments into the answer and I'll take back my downvote.
T.E.D.
It's probably too late to undo the downvote (you only have a smallish window of time to undo a vote, to prevent people from gaming the system), but I'll edit the post anyway.
Chris Jester-Young
Bottom line is that all except Singularity aim for a fully managed pipeline and codebase based on the CIL standard. Singularity does, as far as I know, use a C++-based kernel.
Dykam
+1  A: 

No, it is not "it". Kernels are generally written in C with a bit of assembler sprinkled in. But the OS is written in all sorts of languages. And even in the kernel, C++ can be used without too much trouble. So can many other languages. Linux is written by C fanatics who fear and loathe everything else, which is their problem. Windows is written in a big mix of C and C++, and probably with some bits of old Pascal code as well. And these days, chunks of .NET are turning up as well. OS X uses Objective-C for much of the OS code.

The same advice applies as in all other areas of programming:

  • Know your stuff
  • Don't limit yourself to the One True Language.

The kernel is the only area where somewhat "special" rules apply. But the kernel is tiny. The vast majority of the OS can be written in any language.

You'll certainly need to know C, yes, but just knowing C is nowhere near enough.

jalf
+1  A: 

You might want to have a look at the Singularity project from Microsoft (also on Wikipedia):

Singularity is an experimental operating system being built by Microsoft Research since 2003. It is intended as a highly-dependable OS in which the kernel, device drivers, and applications are all written in managed code.

Only an extremely small part of this OS is actually written in C, and the rest is written in higher level languages (Sing#, an extension of C#). In the future I believe you can expect to see much more of this kind of thing becoming available.

Greg Hewgill
Are they intending to remove the C eventually?
Matthew Flaschen
I don't know. Wikipedia claims that the existing C and assembly language code is responsible for the x86 interrupt handling, which makes sense. Since this is very low-level code, it's difficult to have a general purpose compiler generate the correct code, unless you build that specific capability directly into the compiler. Or, perhaps some other platforms (other than x86) offer an architecture such that assembly language code is not required for interrupt handling. In any case, interrupt handling is by far not the most interesting part of the kernel. :)
Greg Hewgill
+2  A: 

Actually, there is quite a bit of room in the core of a modern OS for C++ code. I just looked, and the Win7 core kernel tree has quite a bit of C++ code. Note that many sub-systems remain in simple C. There are a few reasons for this:

  1. C is the original language of the NT based OS
  2. C is very, very well understood by key people
  3. Well written C can be the most straightforward code to debug - especially in kernel mode.

That being said, many teams and people have found well written C++ to be an effective tool for core OS work.

There is nothing about C++ that prevents it from being used to write core resource management code like a scheduler, memory manager, I/O subsystem, graphics sub-system, etc.

As others have pointed out - any kernel work will always require some bit of assembly language.

Foredecker
Moving from C to C++ in the kernel is a tiny, tiny change. It cleans some things up (in theory), but in the end, it's really the same code. Look to managed OSes to see real innovation -- that's where we're going.
Cody Brocious
Where can I see that, if you don't mind?
Recursion
See the post I made above. It's full of neat links from Cody.
Chris Jester-Young
A: 

If it is the kernel you are talking about, then you need to learn a language that gives easy, fast access to the underlying hardware. I can only think of:

  • C language and
  • Assembly

AFAIK, some parts of the boot-loader will be written in assembly, and from then on, C. There are many open-source easy-to-understand operating systems available, like for example the latest TOPPERS. Try to look into it.

I guess, as an OS-kernel developer, you will worry more about the ways to efficiently access the underlying hardware (like processor and memory) than about the choice of language. I bet that, most of the time, we will be tempted to use assembly.

Alphaneo
Want to optimize a kernel? Reduce the cost of syscalls and task switching, optimize the algorithms used. Singularity has proven that a managed mode kernel can be /faster/ than an old-school kernel, because of the safety provided.
Cody Brocious
@Cody, theoretically, managed code can *NEVER* be faster than an assembly code, or a well optimized C code. And if you say it is faster, then all I can say is **vow**
Alphaneo
@Alphaneo: Wrong, managed code can be compiled dynamically, whereas usually C or assembly code is too static for that. By "dynamically", I mean the code could be optimised based on runtime metrics. Look at how Java's performance is (or .NET's for that matter) compared to more static systems.
Chris Jester-Young
@Chris, does that mean .NET is faster than assembly, at least in some cases?
Alphaneo
Yes, unless your assembly code has some way to recompile itself based on runtime metrics. I'm talking about long-running processes, by the way (which kernels are), where you can collect statistics along the way.
Chris Jester-Young
@Chris, Assembly code, re-compile? what are you talking about?
Alphaneo
Managed code can never be faster? Really? Ok, let's try an experiment. Assemble some code for 8086, then move it onto an x64 system and run it. Is it using all 16 64-bit registers? You may want to look into dynamic code generation and/or recompilation before answering that.
Cody Brocious
@Cody, some people are talking about things that exist in theory (C#), and some people here are talking about the practical near-term future (C and assembly). Anyway, I have not seen any boot-loader written in C#, and could not imagine one being written in C# ...
Alphaneo
@Alphaneo, This doesn't exist just in theory. All of the OSes linked by Chris (with the exception of Singularity) use managed code for every single piece of the OS after the bootloader. The only reason a managed bootloader hasn't been written is because it's unnecessary. GRUB does the job just fine. You've yet to give a single reason a bootloader couldn't be written in pure managed code.
Cody Brocious
@Cody-wrote "you've yet to give a single reason a bootloader couldn't be written in pure managed code" ... VOW, nice question.
Alphaneo
@Alphaneo, I don't understand if you're being sarcastic or not, but I'd really love for you to put forth a reason why it can't be done. There are lots of misconceptions around managed code at this level, and I'd love to help clear them up.
Cody Brocious
@Cody, I will have to admit that I was being sarcastic, probably due to my misconceptions or, to some extent, ignorance about managed code. And since you seem to know a lot about managed code, I promise to study more about "managed code". BTW, I have written BSP packages for a number of OSes, entirely in assembly and C, and I could not believe (or even think, for that matter) that I could write those BSPs in .NET.
Alphaneo
@Alphaneo, Depending on the constraints you were working under, it's possible you couldn't. It's very easy to generate /damn/ fast code from CIL, but generating small code is very difficult. In terms of actual capability, the only thing I can think of that could cause problems for you is if you required precise timing, e.g. you have to throw pixels at the PPU every X cycles (like the NES).
Cody Brocious
@Cody ...I blogged on this issue, refer to http://indcricket.blogspot.com. I will continue to study; even now, in many places like DSP programming, assembly still rules.
Alphaneo
+4  A: 

I think it's safe to say that low-level parts of operating systems (e.g. the kernel) will continue to be written in C because of its speed. As mentioned elsewhere, you will need to know assembler for certain parts of the kernel (something needs to load the kernel into memory). But you can work on the kernel with little or no assembly knowledge. A good example would be if you're implementing a file system.

Don't worry about what language the operating system is implemented in. What's important is how operating systems are used, and what can be done to improve them. A good example is when Unix first came out. The file system had the inodes at the front of the disk, and data in the remaining space. This didn't perform very well, as you were seeking to different parts of the disk for all files. Then the Berkeley Fast File System was created to make a disk-aware file system. This means having inodes near their corresponding data. I'm leaving out a lot of details, but I hope this illustrates that it's more important to think about how an operating system can be improved rather than what language it will be programmed in.
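
To make the inode placement point a bit more concrete, here is a deliberately crude C sketch. It assumes the disk is a line of numbered blocks, that seek cost is simply the distance between block numbers, and that files are read one after another. The constants and function names are invented for the illustration and do not correspond to real Unix or FFS parameters.

```c
#include <assert.h>
#include <stdlib.h>

/* Made-up illustration values, not real file system parameters. */
enum { NFILES = 100, DATA_START = 1000, FILE_SPAN = 100 };

/* Head travel to read one file: move to its inode, then to its data. */
static long read_cost(long head, long inode_blk, long data_blk)
{
    assert(inode_blk >= 0 && data_blk >= 0);
    return labs(inode_blk - head) + labs(data_blk - inode_blk);
}

/* Old layout: inode i sits at block i at the front of the disk,
 * while the data of file i starts at DATA_START + i * FILE_SPAN. */
long travel_inodes_at_front(void)
{
    long head = 0, total = 0;
    for (long i = 0; i < NFILES; i++) {
        long inode = i;
        long data = DATA_START + i * FILE_SPAN;
        total += read_cost(head, inode, data);
        head = data;
    }
    return total;
}

/* FFS-like layout: each inode sits in the block just before its data. */
long travel_inodes_near_data(void)
{
    long head = 0, total = 0;
    for (long i = 0; i < NFILES; i++) {
        long data = DATA_START + i * FILE_SPAN;
        long inode = data - 1;
        total += read_cost(head, inode, data);
        head = data;
    }
    return total;
}
```

In this toy model the second layout moves the head far less, because every file access in the first layout pays for a round trip to the front of the disk. Real disks and the real FFS (cylinder groups, rotational layout) are much more involved.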

Some recent trends in operating systems are virtualization and distributed computing (see Google's paper on MapReduce). File systems, security, scheduling (especially with multi-core processors), etc. remain areas of interest even though these problems are not new.

Here are some resources if you want to learn more about kernel development:

  • Linux Kernel Newbies - Resource for those who want to get started on modifying the Linux kernel.
  • xv6 source - x86 port of Unix version 6. Used by MIT to teach an operating systems class. Simple, and easy to extend (more info).
  • Linux Kernel Map - Call chain of system calls in Linux. Useful in visualizing what a system call does.

Bottom line: Start getting familiar with the kernel and read papers on what researchers are writing about (USENIX is useful for this). This knowledge is much more valuable than learning a new language, as most concepts from one language can easily be transferred to another if there does happen to be a shift in what operating systems are written in. Hope this helps!

mgriepentrog
This ignores all the research being done on managed kernels, as Chris has linked. In addition, sticking with C has nothing to do with speed (Singularity has proven a managed OS to be much, much faster due to the safety of managed code) and everything to do with being the standard.
Cody Brocious
I mentioned C because it's still fairly common, and it doesn't hurt to know if he doesn't already. I don't try to claim it's faster or better. The main goal of my post is to emphasize learning about what is being researched (including managed kernels), and learning how the kernel works.
mgriepentrog
Why do people keep saying that "Singularity has proven a managed OS to be much, much faster"? It seems that very limited performance testing was done. Run a full-fledged app like an Oracle RDBMS through all the popular benchmarks on an OS like that, and it will be more convincing. Some apps do things like align data to cache lines to mitigate false sharing. How would you do that when running on an OS like Singularity?
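
For reference, the cache-line alignment trick mentioned above looks roughly like this in C11. This is only a sketch: the 64-byte line size is an assumption (common on x86-64, not universal), and `padded_counter`, `per_cpu_counters` and `counter_inc` are hypothetical names.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed line size; check the target CPU */
#define NCPUS 4       /* arbitrary number of counters for the example */

/* Aligning the member to a cache line also pads the struct to a full
 * line, so two counters never share one line. */
struct padded_counter {
    alignas(CACHE_LINE) unsigned long value;
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "each counter should occupy exactly one cache line");

struct padded_counter per_cpu_counters[NCPUS];

/* Each CPU only ever touches its own slot, so an update on one CPU does
 * not invalidate the line holding another CPU's counter. */
void counter_inc(unsigned cpu)
{
    assert(cpu < NCPUS);
    per_cpu_counters[cpu].value++;
}
```

Without the alignment, several counters would usually be packed into the same line and every increment would bounce that line between cores.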
RussellH
+20  A: 
Norman Ramsey
+1. I would upvote this a dozen times if given the opportunity.
Cody Brocious
Managed code requires a virtual machine to run on, which must be produced in a language that can compile to object code. In essence, Singularity is analogous to a standard microkernel design, with the VM taking the place of the kernel and the servers implemented in managed code.
Jay Dubya
There's no reason the managed code VM can't be written in managed code itself, from the ground up. You can compile the managed code to machine code ahead of time, then bootstrap up from there. That's the approach taken by SharpOS, Cosmos, MOSA, and Renraku.
Cody Brocious
@Jay: you might check out Dan Grossman's language 'Cyclone', which is a Frankenstein monster but does provide a kind of 'managed' code with no VM --- it compiles directly to assembly code (maybe typed assembly language). It's a language, not an OS, and it's a bit of a monster, but it does give one a sense of the possibilities going beyond just a VM.
Norman Ramsey
@Cody, Norman: Just had a look at the Wikipedia articles for SharpOS and Cyclone. They compile their kernel bytecode down to object code to allow the machine to be booted from it, which is the point I was trying to make - you still end up with a core module in object code that supports the remainder of the OS.
Jay Dubya
@Jay: I think I see your point. Dunno about SharpOS, but Cyclone compiles *everything* to assembly code. There is no VM, so code is "managed" without "requiring a VM to run on". In the case of Cyclone, the type checker provides the guarantees that in other systems are provided by a VM.
Norman Ramsey
@Norman: My understanding was that "managed code" referred to code that was compiled to target Microsoft's virtual machine. The point I was trying to make is that once compiled to object code this does not apply. Any additional security or reliability provided by the VM environment is moot. Do you know if the compile-time type checking is any superior to that performed by C++?
Jay Dubya
I agree with the first bullet, not with the second. I wish I could agree. C sucks, and is probably single-handedly responsible for most security issues. But I see no evidence of anyone moving to use anything better for serious OS development.
T.E.D.
@Jay: Yes, the compile-time type checking is superior to C++. For example, it can guarantee absence of pointer errors.
Norman Ramsey
A: 

You should definitely be fluent in C.

As others have pointed out, there is no reason that an operating system has to be written in C, and there is a lot to be gained by using more sophisticated languages. But if you're going to work on operating systems in the real world (i.e., not in academia or a research lab) there are a couple of realities that you have to live with:

  1. Existing operating systems are huge, often many millions of lines of code, and are written in C or C-derivatives, such as Objective-C or C++.
  2. New operating systems take hundreds of engineer-years (and many calendar years) to reach and match the functionality and robustness of existing operating systems.

As a result, it's hard for me to see how and when the world will move away from C-based operating system kernels. Yes, it's technically possible. But the cost may be too high. If anything, the trend seems to be toward consolidation on a small number of OS families---Windows, Linux, and BSD---all C-based.

It would be interesting to know what research has been done, or what tools and techniques might be available to evolve an existing code-base (such as Linux) to a better language. I think this would be a much more viable approach than getting the world to adopt a completely new OS.

Keith Smith
+1  A: 

I think it's a pretty safe bet that serious (non-experimental) OS development will remain in C (and assembly) for the foreseeable future.

The proof I submit is Ada. It can get as bare-metal as C, provides better control over data placement, and has safer default behavior for just about everything (eg: array bounds checking). From the point of view of an OS developer, it is either equal or superior to C in any technical parameter you can think up. It has been available for over 20 years now (ok...reasonably-priced for perhaps only 15).

So if people were looking for a technically superior language to C, you should see OSes all over the place written in Ada instead, right? What I actually see is one serious OS implemented in Ada. It is no longer supported in favor of a reimplementation in C.

The barriers to other languages in OS development are not and never have been technical. I don't see C's non-technical benefits going away any time soon, and nobody is ever going to overcome them by simply designing a better language.

T.E.D.
A: 

I've done extensive programming in both the Windows NT and Linux kernels. And I can assure you that as long as these two OSes are around, C will be used in the kernel. I think there is a multitude of reasons, but the easiest answer is time. As previous posters mentioned, the amount of time it would take to rewrite the kernel in a different language is not worth it. And it wouldn't just be porting the code. The kernel would need some serious design modifications.

Personally, I think C is the most suitable language for a kernel. Being able to manage your own memory, dynamically allocating and freeing it yourself, is crucial when you are working in the kernel, especially if you are working with paged memory. The stack size you are allotted in kernel mode is also generally smaller than in user mode, so again memory efficiency is crucial. C also allows programmers to build beautiful data structures that don't contain all the bloated overhead that managed languages have. In my opinion a struct can also be used just as effectively as an object, but again without all the bloated overhead.

Managed languages also need to be "managed." In the kernel you don't have anything cleaning up your messes. Don't get me wrong, I love C# and I think the .NET framework is beautiful, but if you are in the kernel, C is and will continue to be it.

Ian
A: 

C++ is supported for kernel mode development on Windows, but you can't use exceptions and RTTI easily. I believe that there is no reason to write code in C today, since the overhead of C++ is negligible (any tracing/debugging infrastructure will be far more costly than the extra dereference for a virtual function call). In fact, most of the Windows DDK implements object-oriented patterns in C, which is just inconvenient compared to C++.
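
To show what "object oriented patterns with C" usually means, here is a small generic sketch of a hand-rolled virtual table. It is not taken from the Windows DDK; `device`, `device_ops`, `null_device` and the functions below are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* The "vtable": a struct of function pointers shared by all instances
 * of one device type. */
struct device_ops {
    int (*read)(struct device *dev, char *buf, size_t len);
};

/* The "base class": every concrete device embeds this as first member. */
struct device {
    const struct device_ops *ops;
};

/* A "derived class" that returns zeros and counts its reads. */
struct null_device {
    struct device base; /* must stay first so the cast below is valid */
    size_t reads;
};

static int null_read(struct device *dev, char *buf, size_t len)
{
    struct null_device *nd = (struct null_device *)dev;
    nd->reads++;
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
    return (int)len;
}

static const struct device_ops null_ops = { .read = null_read };

/* The "constructor": wires the instance to its table of operations. */
void null_device_init(struct null_device *nd)
{
    nd->base.ops = &null_ops;
    nd->reads = 0;
}

/* The "virtual call": one extra pointer dereference, as in C++. */
int device_read(struct device *dev, char *buf, size_t len)
{
    assert(dev != NULL && dev->ops != NULL && dev->ops->read != NULL);
    return dev->ops->read(dev, buf, len);
}
```

Everything a C++ compiler would generate for a class with one virtual method is written out by hand here, which is the inconvenience being referred to.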

If you decide to use C++ for kernel mode development, you will need to override the new operator to choose whether to allocate a class in pageable or non-pageable memory. Some nice macros might come in handy there.

Vladimir Lifliand