I'm trying my hand at working with threads at an assembly language level. I'd like to store information about the current thread, including things like the thread ID. My current thought is to place a pointer to the thread information structure at an aligned address, such as (ESP & 0xFFF00000) for a 1MB stack that is itself aligned to a 1MB boundary.
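To make the masking idea concrete, here is a minimal C sketch. It assumes every stack is exactly 1MB and aligned to a 1MB boundary, and that the pointer to the thread information lives in the first word of the stack region. The names `STACK_SIZE`, `stack_base`, `thread_info_slot`, and `struct thread_info` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: all stacks are STACK_SIZE bytes and STACK_SIZE-aligned. */
#define STACK_SIZE ((uintptr_t)1 << 20)   /* 1 MB */
#define STACK_MASK (~(STACK_SIZE - 1))    /* 0xFFF00000 when pointers are 32 bits */

/* The mask trick only works for power-of-two sizes. */
static_assert((STACK_SIZE & (STACK_SIZE - 1)) == 0,
              "STACK_SIZE must be a power of two");

/* Hypothetical per-thread record. */
struct thread_info {
    uint32_t thread_id;
};

/* Lowest address of the aligned region that contains sp. */
uintptr_t stack_base(uintptr_t sp)
{
    return sp & STACK_MASK;
}

/* Location where the pointer to the thread record is assumed to be stored. */
struct thread_info **thread_info_slot(uintptr_t sp)
{
    return (struct thread_info **)stack_base(sp);
}
```

In assembly this amounts to one `and` on a copy of ESP followed by one load, which is where the "no synchronization, direct access" advantage comes from.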

The immediate problem I see is it would require special handling if I ever wanted the stack to exceed 1MB, so I came up with another option: Store all thread data structures in a linked list (or array), and include the start/end stack range for the thread as part of the structure. I'd keep a pointer to the head of the list (or array) at a known address - such as a fixed offset from the image base.
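A rough C sketch of this second option, using an array rather than a linked list, might look like the following. The structure layout and the name `find_thread` are hypothetical, and the sketch deliberately leaves out locking: the caller is assumed to hold whatever lock protects the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread record that carries its own stack range. */
struct thread_info {
    uint32_t  thread_id;
    uintptr_t stack_lo;   /* lowest stack address, inclusive */
    uintptr_t stack_hi;   /* one past the highest stack address, exclusive */
};

/*
 * Find the thread whose stack contains sp by linear scan.
 * Assumption: sp points at a pushed item, so it is strictly below stack_hi.
 * Returns NULL if no registered stack contains sp.
 */
struct thread_info *find_thread(struct thread_info *table, size_t count,
                                uintptr_t sp)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (sp >= table[i].stack_lo && sp < table[i].stack_hi)
            return &table[i];
    }
    return NULL;
}
```

The scan is O(n) in the number of threads. Keeping the array sorted by `stack_lo` would allow a binary search instead, at the cost of more work when threads are created or destroyed.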

Here are the obvious advantages and disadvantages of each:

  • Method 1 (aligned address)
    • Faster access to the information and no synchronization required
    • But stacks have to all be the same size and aligned to that boundary
  • Method 2 (common storage)
    • Stacks don't have to be a fixed size and they don't have to be aligned to such a large boundary
    • But I must synchronize access to the data
    • But it won't have O(1) access time without a hash table (or a fixed cap on the number of threads with a binary search - sneaky), and a hash table is still slower than direct access

Should I use one of these methods, or is there a better way to have access to this information in my threads?

A: 

Carving out an area of the stack that isn't otherwise tied to a call frame or push/pop actions is a recipe for disaster. Sooner or later the stack will get bigger than you expect, and then either you'll start using call frames as thread IDs, or the thread ID will get used as a return address.

You don't say what threading package you're using, but if it's PThreads take a look at pthread_setspecific. Other threading packages should provide similar functionality.
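For reference, a minimal sketch of the PThreads approach could look like this. `pthread_key_create`, `pthread_once`, `pthread_setspecific`, and `pthread_getspecific` are standard POSIX calls; `struct thread_info`, `set_current_info`, and `get_current_info` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-thread record. */
struct thread_info {
    uint32_t thread_id;
};

static pthread_key_t  info_key;
static pthread_once_t info_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once, no matter which thread gets here first. */
static void make_key(void)
{
    int rc = pthread_key_create(&info_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Associate a record with the calling thread. */
void set_current_info(struct thread_info *ti)
{
    pthread_once(&info_once, make_key);
    int rc = pthread_setspecific(info_key, ti);
    assert(rc == 0);
    (void)rc;
}

/* Fetch the calling thread's record, or NULL if none was set. */
struct thread_info *get_current_info(void)
{
    pthread_once(&info_once, make_key);
    return pthread_getspecific(info_key);
}
```

Each thread calls `set_current_info` once at startup and `get_current_info` afterwards; no locking is needed because every thread only sees its own value for the key.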

But in a larger sense, what information does your thread need to know about itself? Multi-threaded programming becomes a lot easier if you separate (in your mind) the executable code from the environment (thread / process / whatever) where that code is being executed.

kdgregory
I'm not using a threading package. That's part of the game. :) For now, it needs to know its thread ID, but later it'll need a few more pieces of information. Each method has a statically known maximum frame size (counting the evaluation stack), and each call checks the available remaining stack so it can crash but it won't run past the end.
280Z28
Huh? So you're writing your own threading package that's independent of whatever your OS (if any) provides? In that case, it should be a simple matter of taking the current SP (or perhaps SS) and identifying which thread owns that stack. You'll need to maintain that information somewhere in order to schedule your threads.
kdgregory
That goes in line with the second idea I posted. Other things I have to store are the thread name, the culture, exception information, and allocator information since I'd like to use a multi-threaded allocator like Hoard (http://www.hoard.org/).
280Z28
A: 

Threading libraries on at least two operating systems that I know of use a segment register to store the pointer to thread-specific data. This is mostly an x86 mechanism, but since you said your stack pointer is ESP, I'll bet that's the architecture you're using. Compilers don't typically generate code that uses the segment registers for ordinary data, and the OS you are using probably saves them on a context switch. Pthreads on BSD uses the %fs register.

boiler96