This is a pretty-much theoretical question, but..

How much of an operating system could be written in a language like Python, Ruby, Perl, Lisp, or Haskell?

It seems like a lot of the stuff like init.d could trivially be done in a scripting language. One of the firewall-device-OS's (m0n0wall) uses PHP for its system-configuration (including on boot). And one could argue that "emacs is an OS, mostly written in Lisp"..

Of course there are bits that would have to be assembly/C, but how much could be regular .py/.rb/.pl/.el/.hs files? It might not have the best performance, but it would be, by far, the easiest-to-modify OS ever.

+3  A: 

House - Haskell User's Operating System and Environment. It is even bootable inside a VM, so you can play with it.

Sources are very readable, IMO.

ADEpt
+1  A: 

Python does not natively provide constructs to talk directly to the hardware, like raw pointers for memory-mapped I/O and many other constructs provided by C/ASM. However, there is proof that most everything in an OS can be written in a more abstracted language; the Singularity OS from Microsoft is written almost exclusively in variants of C#. There's an extremely small amount of C/ASM to do interrupt handlers and such, but everything else, including what most of us consider to be "the kernel" can be done in essentially any Turing-complete language.

It should be noted that Singularity's choice to implement these low-level constructs in C/ASM should not be interpreted as a fundamental limitation of the syntax or other aspects of high-level languages. One could certainly make a variant of Python that emitted and dealt appropriately with the necessary assembly code.

Matt J
There's no reason that even the interrupt handlers and such have to be C/ASM. For instance, Cosmos and SharpOS both handle every bit of this via C# (and their custom way of emitting inline asm, X# (Cosmos)). Just need to make the compiler emit this code.
Cody Brocious
I would argue that Cosmos and SharpOS do nothing more than wrap ASM in a more portable, verbose (but type-safe) syntax. The code here (http://www.gocosmos.org/Blog/20080428.en.aspx) looks a lot like assembly. Even if you could write this in Python, it would *not* be at all Pythonic at that point.
Matt J
I think Cosmos hasn't gone far enough with abstraction. The compiler really should be handling these low-level details and exposing it in a logical way to the high level. That said, there's a looong way to go for managed kernels.
Cody Brocious
The argument that any program of the type we're used to reasoning about can be implemented in any Turing-complete language is of course correct, but I find it to be a bit disingenuous, unless that's really what the OP was asking. No, ASM doesn't do any magic, but the hardware executes *instructions*
Matt J
.. and to say you could do it in Python isn't totally satisfactory, as you'd really have to bastardize the language in some fundamental ways to get it to emit ASM and put it in the interrupt vector table, for instance.
Matt J
Sure, anything can be implemented in anything, but it doesn't change the fact that there are high level solutions to these problems. It's akin to writing C off because it compiles to assembly, IMO.
Cody Brocious
I suppose you could infer that I suggested that Singularity's choices were out of absolute necessity, but to clarify: this is not a fundamental limitation, it was just a more natural choice given the goals of the research project.
Matt J
+15  A: 

Technically, any of it could be, if you write a compiler to do so. OSes have been done in Java (JNode), .NET (MOSA, Singularity, SharpOS, Cosmos), Haskell (HOUSE), Python (Unununium), etc.

Edit: I see a lot of people talking about the very lowest level being an area where this couldn't be done; this isn't true.

There's no reason that the compiler for language X can't be extended to handle any low-level operation and expose it to the language. All functionality can be achieved from any language; it's simply a matter of picking the right tool for the job. Sometimes this is Python, sometimes this is C, sometimes this is assembly.

Look to projects like Cosmos and SharpOS to see a pure high-level OS Done Right (TM).

Cody Brocious
Hahah, thanks for the catch. It's way past my bedtime ;)
Cody Brocious
JNode's microkernel is written in assembler I believe - wouldn't that be required for a language that requires a VM? At the very lowest level any usable OS contains at least a little assembler. C minimizes the assembler since it allows low level operations natively.
tloach
Thanks for the correction on JNode, but no, there's no fundamental requirement for your OS to have any assembly. You extend the compiler to emit the proper machine code to get the job done. You can build a subset of the functionality of your platform in your platform's language then bootstrap up.
Cody Brocious
For instance, a .NET (Nemerle and C#) kernel of mine was done in pure .NET. The compiler's small runtime had no handling of objects, just exposed memory, interrupts, etc. The object system itself was built up in Nemerle/C#. Completely self-bootstrapping.
Cody Brocious
C is, essentially, intended to be portable assembly plus a couple not-completely-low-level features thrown into the mix. So by default it's a good language for writing operating systems. But it's a bad language for writing business applications. Conversely, a language with a lot of useful high-level constructs and a lot of useful layers of abstractions would be great for writing business applications, but bad for writing operating systems. Nevertheless, you can write both operating systems and business applications in both kinds of languages, if you had to.
Justice
@tloach - Let's define what assembly is: it is a language that maps one-to-one onto machine instructions. Any other language that compiles to machine code can be used in place of assembler. I've seen boot loaders in Delphi (which, by the way, was made to work on Windows only) and of course C. At most, you might have to use inline assembly in the target language (such as Pascal).
Christian Sciberras
+2  A: 

Beyond the kernel (and by this, I mean kernel, microkernel style), and something to compile the runtimes for each of said dynamic languages, just about anything and everything COULD be if you were building your own operating system. It's just not practical. Heck, the init.d scripts are written primarily in sh as far as I'm aware. But sh, while not powerful, is VERY lightweight and, as far as I know, efficient in what it does. Higher-level languages like Python, Perl, etc. could handle it fine, but it'd be a lot slower, and each interpreter instance would take a lot more memory.

It's possible, it's just not practical.

Matthew Scharley
+1  A: 

It's difficult to imagine kernels, device drivers, etc. written in, e.g., Python - the memory management would be a bit of a headache.

On the other hand, almost all the userspace code could be. Under Linux, there is no requirement that "init" be a native machine-code binary - it can be a Python script or something, no problem.
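To make the "init as a script" idea a bit more concrete, here is a minimal sketch of the two jobs an init process cannot skip: starting other programs and reaping children that exit. The function names (`spawn`, `reap`, `run_init`) and the service list are hypothetical, chosen only for illustration; a real init also mounts filesystems, handles signals, and manages shutdown, none of which is shown here.

```python
import os
import time


def spawn(argv):
    """Fork and exec a program; return the child's PID to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into the parent's code.
            os._exit(127)
    return pid


def reap():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, code); code is negative if a signal killed the child.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped


def run_init(services, poll_interval=1.0):
    """Start each service once, then reap forever (PID 1 must never exit)."""
    for argv in services:
        spawn(argv)
    while True:
        for pid, code in reap():
            print(f"child {pid} exited with {code}")
        time.sleep(poll_interval)
```

Something like `run_init([["/sbin/getty", "tty1"]])` would be the entry point, where the getty path is only a placeholder. This assumes the interpreter and its standard library are already available on the root filesystem when the kernel starts init.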

MarkR
+1  A: 

One interesting outcome of Singularity is that you no longer need an MMU (memory management unit) in the CPU, since all userland code is "managed". I could see this being beneficial in embedded scenarios: non-MMU Linux with scripted applications on top of it.

akauppi
+3  A: 

See Genera / OpenGenera for an example of an OS written in Lisp that was actually in use for quite a while on Lisp Machines.
Haskell has House.

Galghamon
+1  A: 

As long as the programming language has the ability to manipulate binary files, you could write a complete OS in the particular language. This is not to say that it is easy, or practical. It just makes sense that, if your chosen language can manipulate binary, then you can go as low-level as you need.
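As a small illustration of "manipulating binary" from a high-level language, the sketch below builds a legacy BIOS boot sector as raw bytes. It relies on two facts: a boot sector is 512 bytes ending in the signature 0x55 0xAA, and the x86 bytes EB FE encode a short jump to itself. The function and constant names are made up for this example, and the resulting "OS" does nothing but spin forever.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # marks the sector as bootable for a legacy BIOS


def build_boot_sector() -> bytes:
    """Return a 512-byte boot sector whose code is a single infinite loop."""
    code = b"\xeb\xfe"  # x86: jmp short to this same instruction
    padding = bytes(SECTOR_SIZE - len(code) - len(BOOT_SIGNATURE))  # zero fill
    return code + padding + BOOT_SIGNATURE


sector = build_boot_sector()
print(len(sector), sector[-2:].hex())
```

Writing `sector` to a file in binary mode would give an image that an emulator could presumably boot; everything beyond that (loading a kernel, switching CPU modes) is the hard part this answer calls "not easy, or practical".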

+1  A: 

I would say this is not possible. Responses to this question keep referring to changing the language or using the language to generate low-level (kernel) code. That is just using one language to write another language. While I agree that both approaches would then allow you to write an operating system, I would argue that it is no longer the same language. So an operating system could be written in many different languages, but not every language (without changes or bypassing the language) can be used to write an operating system.

The final answer to the original question is almost all, but not all. The only exceptions are languages which can access CPU-level instructions.

Jim Kramer
You do realize that every language gets translated to another, right? C is compiled to assembly, which is then assembled to machine code. This is like saying that writing a kernel in C is really writing it in assembly...
Cody Brocious
Yes, but C# is compiled to IL, not native code (unless you use Mono's static compilation feature). So strictly speaking, using C# to create an OS is not possible. Using a modified version of the C# compiler that generates asm is. It's debatable whether that would still be C#.
gbjbaanb
A: 

"cleese" - an operating system written almost entirely in Python

dbr
I get a 404 on the link
André
Oops, fixed - thanks!
dbr
+1  A: 

I'm surprised nobody has mentioned Java hardware. It should be an inspiration for us to further the evolution of hardware by creating an even higher level processor.

There's another project I just found called "Pycorn".

If there were a Python bytecode processor, it would be feasible to make a fast operating system in 100% Python. The processor could implement the entire CPython bytecode, or anything that is compatible with the Python language (but not C modules!). The processor would have to handle reference counting, classes, and objects. Native hashing for dicts would be very helpful; all the complex data structure manipulations that Python currently does in software would be done purely in hardware. There would be no concept of pointers, which I see as a prime motivation for building such a processor, as it would be impossible to smash the stack.

Everything would be objects! The kernel itself would call methods on the memory object, although you wouldn't need to touch it much since the hardware would handle allocation and garbage collection anyway. Interrupt handlers could simply be set to Python methods. MSRs, caches, debug registers, and I/O ports would be objects.

There's an interesting discussion about implementing Python on an FPGA here.

On another note (pertaining to a Python OS on a non-Python processor), to the people claiming you can't make inline assembly Pythonic: it's pretty simple to just emit assembly from an abstraction, e.g.:

asm = MetaASM()
asm.r1 = 1234
asm.r2 = asm.r1 + 5
asm.io.out(asm.r1)

You could switch to architecture-specific assembly when you need performance or architecture-specific operations and registers:

asm = ASM("IA-32")
asm.xor(asm.eax, asm.eax)
asm.cr0 = asm.eax
asm.invtlb()
asm.fs[0x00123456] = asm.eax
asm.al = 123
asm.dword.ptr.eax = 1234 # mov dword ptr [eax], 1234
asm.push(asm.eax)
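For what it's worth, the generic MetaASM idea can be made to run with nothing more than attribute hooks. The following is only a sketch under assumptions: the classes `Reg`, `Add`, and `MetaASM` and the emitted pseudo-mnemonics are invented here, the output is text rather than machine code, and `asm.out(...)` stands in for the `asm.io.out(...)` call to keep the example short.

```python
class Reg:
    """A symbolic register; arithmetic builds an expression instead of computing."""

    def __init__(self, name):
        self.name = name

    def __add__(self, imm):
        return Add(self, imm)


class Add:
    """Deferred 'register + immediate' expression."""

    def __init__(self, reg, imm):
        self.reg = reg
        self.imm = imm


class MetaASM:
    def __init__(self):
        # Bypass our own __setattr__ so 'lines' is stored as a normal attribute.
        object.__setattr__(self, "lines", [])

    def __getattr__(self, name):
        # Only called when normal lookup fails: unknown names become registers.
        return Reg(name)

    def __setattr__(self, name, value):
        # Assignment to a register attribute emits pseudo-assembly text.
        if isinstance(value, Add):
            self.lines.append(f"mov {name}, {value.reg.name}")
            self.lines.append(f"add {name}, {value.imm}")
        elif isinstance(value, Reg):
            self.lines.append(f"mov {name}, {value.name}")
        else:
            self.lines.append(f"mov {name}, {value}")

    def out(self, reg):
        self.lines.append(f"out {reg.name}")


asm = MetaASM()
asm.r1 = 1234
asm.r2 = asm.r1 + 5
asm.out(asm.r1)
print("\n".join(asm.lines))
```

A real backend would map these abstract registers and operations onto a concrete instruction set and encode bytes, which is where most of the work lies.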

CorePy is also of interest on this topic.

Longpoke
On second thought, we should probably stick to Java processors, as Java code is much more deterministic than Python and will therefore always be much cheaper to make fast. In any case, languages that do insecure things just from accessing memory out of bounds have got to go.
Longpoke