views:

1361

answers:

4

So I'm working on designing an OS, and I'm working on designing (not coding yet) the kernel. This is going to be for an x86 operating system, and my target is for more modern computers, so RAM can be considered to be at least 256M or more.

My question is this: What is a good size to make the stack for each thread run on the system? Or better yet, should I try to design the system in such a way that the stack can be extended automatically if the max length is reached?

I think if I remember correctly that a page in RAM is 4k (4096 bytes) and that just doesn't seem like a lot to me. I can definitely see times, especially when using lots of recursion, that I would want to have more than 1000 ints in RAM at once. Now, the real solution would be to have the program doing this to use malloc and manage its own memory resources a bit, but really I would like to know the user opinion on this.

Is 4k big enough for a stack with modern PC programs? Should the stack be bigger than that? Should the stack be auto-expanding to accomodate any (reasonable) size? I'm interested in this both from a practical developer's standpoint, and a security standpoint.

Or, on the flip side, is 4k to big for a stack? Considering normal program execution (especially from the point of view of classes in C++) I notice that good code tends to malloc/new the data it needs when classes are created, to minimize the data being thrown around in a function call.

What might also be important here, and I haven't even gotten into this, is the size of the processor's cache memory. Ideally, I think the stack would reside in the cache to speed things up, and quite frankly I'm not sure if I need to achieve this, or if the processor can handle it for me? I was just planning on using regular boring old RAM for testing purposes.

I can't decide, so I ask you.

Thanks

A: 

Why not make the stack size a configurable item, either stored with the program or specified when a process creates another process?

There are any number of ways you can make this configurable.

There's a guideline that states "0, 1 or n", meaning you should allow zero, one or any number (limited by other constraints such as memory) of an object - this applies to sizes of objects as well.

paxdiablo
+4  A: 

Stack size depends on what your threads are doing. My advice:

  • make the stack size a parameter at thread creation time (different threads will do different things, and hence will need different stack sizes)
  • provide a reasonable default for those who don't want to be bothered with specifying a stack size (4K appeals to the control freak in me, as it will cause the stack-profligate to, er, get the signal pretty quickly)
  • consider how you will detect and deal with stack overflow. Detection can be tricky. You can put guard pages--empty--at the ends of your stack, and that will generally work. But you are relying on the behavior of the Bad Thread not to leap over that moat and start polluting what lays beyond. Generally that won't happen...but then, that's what makes the really tough bugs tough. An airtight mechanism involves hacking your compiler to generate stack checking code. As for dealing with a stack overflow, you will need a dedicated stack somewhere else on which the offending thread (or its guardian angel, whoever you decide that is--you're the OS designer, after all) will run.
  • I would strongly recommend marking the ends of your stack with a distinctive pattern, so that when your threads run over the ends (and they always do), you can at least go in post-mortem and see that something did in fact run off its stack. A page of 0xDEADBEEF or something like that is handy.

By the way, x86 page sizes are generally 4k, but they do not have to be. You can go with a 64k size or even larger. The usual reason for larger pages is to avoid TLB misses. Again, I would make it a kernel configuration or run-time parameter.

bog
I never thought about making it something to configure at compile time. I also seem to like 4k, the control freak tells me that if you really need to use more than that much memory, you should be doing it with malloc. ^_^ That said, there are crazy AI programmers out there that love their recursion.
Nicholas Flynt
if you are allocating many of the same-size thing over and over again, you should consider a fixed-block allocator. It can be much more efficient than the general-purpose malloc.
bog
+1  A: 

I'll throw my two cents in to get the ball rolling:

  • I'm not sure what a "typical" stack size would be. I would guess maybe 8 KB per thread, and if a thread exceeds this amount, just throw an exception. However, according to this, Windows has a default reserved stack size of 1MB per thread, but it isn't committed all at once (pages are committed as they are needed). Additionally, you can request a different stack size for a given EXE at compile-time with a compiler directive. Not sure what Linux does, but I've seen references to 4 KB stacks (although I think this can be changed when you compile the kernel and I'm not sure what the default stack size is...)

  • This ties in with the first point. You probably want a fixed limit on how much stack each thread can get. Thus, you probably don't want to automatically allocate more stack space every time a thread exceeds its current stack space, because a buggy program that gets stuck in an infinite recursion is going to eat up all available memory.

Mike Spross
+1  A: 

If you are using virtual memory, you do want to make the stack growable. Forcing static allocation of stack sized, like is common in user-level threading like Qthreads and Windows Fibers is a mess. Hard to use, easy to crash. All modern OSes do grow the stack dynamically, I think usually by having a write-protected guard page or two below the current stack pointer. Writes there then tell the OS that the stack has stepped below its allocated space, and you allocate a new guard page below that and make the page that got hit writable. As long as no single function allocates more than a page of data, this works fine. Or you can use two or four guard pages to allow larger stack frames.

If you want a way to control stack size and your goal is a really controlled and efficient environment, but do not care about programming in the same style as Linux etc., go for a single-shot execution model where a task is started each time a relevant event is detected, runs to completion, and then stores any persistent data in its task data structure. In this way, all threads can share a single stack. Used in many slim real-time operating systems for automotive control and similar.

jakobengblom2
I can see two flaws here. The main one is allowing the system to overflow the stack on its own. If I do that all it takes is a single recursive infinite loop and blamo, the stack for 1 process suddenly consumes the entire PC... As for the one shot deal, this is a multi-tasking OS, so no beans there.
Nicholas Flynt