Hi.

I'm having some trouble with a program using pthreads: it crashes occasionally, and the crashes could be related to how the threads operate on data.

So I have some basic questions about how to program using threads, and memory layout:

Assume that a public class function performs some operations on some strings, and returns the result as a string. The prototype of the function could be like this:

std::string SomeClass::somefunc(const std::string &strOne, const std::string &strTwo)
{
 //Error checking of the strings has been omitted
 std::string result = strOne.substr(0,5) + strTwo.substr(0,5);
 return result;
}
  1. Is it correct to assume that strings, being dynamic, are stored on the heap, but that a reference to the string is allocated on the stack at runtime?

Stack: [Some mem addr] pointer address to where the string is on the heap

Heap: [Some mem addr] memory allocated for the initial string which may grow or shrink

To make the function thread safe, it is extended with locking on the following mutex (which is declared as a private member of "SomeClass"):

std::string SomeClass::somefunc(const std::string &strOne, const std::string &strTwo)
{
 pthread_mutex_lock(&someclasslock);

 //Error checking of the strings has been omitted
 std::string result = strOne.substr(0,5) + strTwo.substr(0,5);

 pthread_mutex_unlock(&someclasslock); 

 return result;
}
  1. Is this a safe way of locking down the operations being done on the strings (all three), or could a thread be stopped by the scheduler in the following cases, which I'd assume would mess up the intended logic:

    a. Right after the function is called and the parameters strOne and strTwo have been set in the two reference pointers that the function has on the stack, the scheduler takes processing time away from the thread and lets a new thread in. That thread overwrites the function's reference pointers and is then stopped by the scheduler in turn, letting the first thread back in?

    b. Can the same occur with the "result" string: the first thread builds the result and unlocks the mutex, but before it returns, the scheduler lets in another thread which performs all of its work, overwriting the result, etc.?

Or are the reference parameters / result string pushed onto the stack while another thread is performing its task?

  1. Is the safe / correct way of doing this in threads, and "returning" a result, to pass a reference to a string that will be filled with the result instead:

    void SomeClass::somefunc(const std::string &strOne, const std::string &strTwo, std::string &result)
    {
     pthread_mutex_lock(&someclasslock);

     //Error checking of the strings has been omitted
     result = strOne.substr(0,5) + strTwo.substr(0,5);

     pthread_mutex_unlock(&someclasslock);
    }

The intended logic is that several objects of the "SomeClass" class create new threads, pass themselves as the thread parameter, and the new thread then calls the function "somefunc":

int SomeClass::startNewThread()
{

 pthread_attr_t attr;
 pthread_t pThreadID;

 if(pthread_attr_init(&attr) != 0)
  return -1;

 if(pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0)
  return -2;

 if(pthread_create(&pThreadID, &attr, proxyThreadFunc, this) != 0)
  return -3;

 if(pthread_attr_destroy(&attr) != 0)
  return -4;

 return 0;
}

void* proxyThreadFunc(void* someClassObjPtr)
{
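 //somefunc returns a std::string, which cannot be returned as a void*, so the result is discarded here
 static_cast<SomeClass*> (someClassObjPtr)->somefunc("long string","long string");
 return NULL;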
}

Sorry for the long description. I hope the questions and the intended purpose are clear; if not, let me know and I'll elaborate.

Best regards. Chris

A: 

General advice: Try to minimize the places where access to shared data can happen. By shared data I mean data that can be accessed at any time by any thread.

There are some general patterns for approaching multithreaded programming:

Producer-consumer

Readers-writers

There are of course other ways but these two are the most used - at least by me (especially the first one).
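
To give an idea of the first pattern, here is a rough sketch using a pthread mutex and a condition variable. The names WorkQueue, ConsumerArgs, consumer and runProducerConsumer are made up for this example and are not from the question's code. The only shared data is the queue, and every access to it goes through push, close or pop.

```cpp
#include <pthread.h>
#include <queue>
#include <string>

// Hypothetical queue for illustration. Plain lock/unlock is used for brevity;
// a production version would release the mutex via a scope guard so that an
// exception (e.g. std::bad_alloc) cannot leave it held.
class WorkQueue {
public:
    WorkQueue() : closed_(false) {
        pthread_mutex_init(&lock_, NULL);
        pthread_cond_init(&ready_, NULL);
    }
    ~WorkQueue() {
        pthread_cond_destroy(&ready_);
        pthread_mutex_destroy(&lock_);
    }

    // Producer side: add one item and wake a waiting consumer.
    void push(const std::string &item) {
        pthread_mutex_lock(&lock_);
        items_.push(item);
        pthread_cond_signal(&ready_);
        pthread_mutex_unlock(&lock_);
    }

    // Producer side: announce that no more items will arrive.
    void close() {
        pthread_mutex_lock(&lock_);
        closed_ = true;
        pthread_cond_broadcast(&ready_);
        pthread_mutex_unlock(&lock_);
    }

    // Consumer side: block until an item is available.
    // Returns false once the queue is closed and drained.
    bool pop(std::string &out) {
        pthread_mutex_lock(&lock_);
        while (items_.empty() && !closed_)
            pthread_cond_wait(&ready_, &lock_);
        bool got = !items_.empty();
        if (got) {
            out = items_.front();
            items_.pop();
        }
        pthread_mutex_unlock(&lock_);
        return got;
    }

private:
    WorkQueue(const WorkQueue &);             // non-copyable
    WorkQueue &operator=(const WorkQueue &);

    pthread_mutex_t lock_;
    pthread_cond_t ready_;
    std::queue<std::string> items_;
    bool closed_;
};

struct ConsumerArgs {
    WorkQueue *queue;
    int consumed;  // written by the consumer, read by the caller after the join
};

// Consumer thread: counts the items it receives until the queue is closed.
void *consumer(void *arg) {
    ConsumerArgs *args = static_cast<ConsumerArgs *>(arg);
    std::string item;
    while (args->queue->pop(item))
        ++args->consumed;
    return NULL;
}

// The calling thread acts as the producer; returns how many items were consumed.
int runProducerConsumer(int n) {
    WorkQueue queue;
    ConsumerArgs args = { &queue, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, consumer, &args) != 0)
        return -1;
    for (int i = 0; i < n; ++i)
        queue.push("item");
    queue.close();
    pthread_join(tid, NULL);
    return args.consumed;
}
```

The consumer is joined rather than detached here so that the caller can safely read the count afterwards.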

Iulian Şerbănoiu
+1  A: 

1 a/b: No, neither one can happen. The parameters to functions and their return values are located on the stack, and each thread has its own stack. However, other things can certainly go wrong:

  • one of the string operations could throw an exception, in which case someclasslock is never unlocked and your application will hang the next time a thread tries to take the lock.
  • assuming that the strings passed into the function are shared between threads (if they are not, the lock is unnecessary), another thread could destroy them just after the function is called and before the lock is acquired. In that case the string operations would lead to undefined behavior.
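
For the first point, one common approach is to let a small guard object do the unlocking in its destructor, so the mutex is released on every exit path. The following is only a sketch: ScopedLock and lockIsFree are hypothetical names I made up, and since the question omits its error checking, the length check that throws std::invalid_argument is my own stand-in for it.

```cpp
#include <pthread.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: locks in the constructor, unlocks in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t &m) : mutex_(m) { pthread_mutex_lock(&mutex_); }
    ~ScopedLock() { pthread_mutex_unlock(&mutex_); }

private:
    ScopedLock(const ScopedLock &);             // non-copyable
    ScopedLock &operator=(const ScopedLock &);

    pthread_mutex_t &mutex_;
};

class SomeClass {
public:
    SomeClass() { pthread_mutex_init(&someclasslock, NULL); }
    ~SomeClass() { pthread_mutex_destroy(&someclasslock); }

    std::string somefunc(const std::string &strOne, const std::string &strTwo) {
        ScopedLock guard(someclasslock);  // released when guard goes out of scope

        // Assumed error check (the question leaves its checks out).
        if (strOne.size() < 5 || strTwo.size() < 5)
            throw std::invalid_argument("both strings need at least 5 characters");

        return strOne.substr(0, 5) + strTwo.substr(0, 5);
        // guard's destructor runs here, and also if anything above throws
    }

    // Demonstration only: reports whether the mutex can currently be taken.
    bool lockIsFree() {
        if (pthread_mutex_trylock(&someclasslock) != 0)
            return false;
        pthread_mutex_unlock(&someclasslock);
        return true;
    }

private:
    pthread_mutex_t someclasslock;
};
```

With the manual lock/unlock pair from the question, a throwing call would leave someclasslock held. With the guard, lockIsFree() still returns true after somefunc has thrown.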

I recommend that you create a new SomeClass object for each thread. In that case, all the members of each object are accessed by only one thread and don't need to be protected by a lock. The drawback is that you can't access them from your main thread anymore after starting the new thread. If that's required, then you do have to protect them with a lock (the lock would be a member of that class too).
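
A minimal sketch of the one-object-per-thread idea follows. Note the assumptions: SomeClass is cut down to the one function plus a result member I added, runWorkers is a made-up helper, and the threads are joinable rather than detached as in the question, so that the main thread has a well-defined point (after pthread_join) from which it may touch the objects again.

```cpp
#include <pthread.h>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the question's class. Each instance is used by
// exactly one thread, so it needs no mutex.
class SomeClass {
public:
    std::string somefunc(const std::string &strOne, const std::string &strTwo) {
        return strOne.substr(0, 5) + strTwo.substr(0, 5);
    }

    std::string result;  // written only by the thread that owns this object
};

void *proxyThreadFunc(void *someClassObjPtr) {
    SomeClass *self = static_cast<SomeClass *>(someClassObjPtr);
    self->result = self->somefunc("long string", "long string");
    return NULL;
}

// Starts up to `count` threads, each with its own SomeClass object, and
// collects the results once the threads have been joined.
std::vector<std::string> runWorkers(std::size_t count) {
    std::vector<SomeClass> objects(count);  // sized up front, never resized while threads run
    std::vector<pthread_t> threads(count);

    std::size_t started = 0;
    for (; started < count; ++started)
        if (pthread_create(&threads[started], NULL, proxyThreadFunc, &objects[started]) != 0)
            break;

    std::vector<std::string> results;
    for (std::size_t i = 0; i < started; ++i) {
        pthread_join(threads[i], NULL);  // after this, objects[i] is safe to read
        results.push_back(objects[i].result);
    }
    return results;
}
```

Between pthread_create and pthread_join the main thread never touches objects[i], which is what makes the lock unnecessary.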

Having said that, the function somefunc does not seem to affect any members of the object at all, and therefore doesn't need protection. Think about the granularity of sharing between threads. It looks to me as though the protective lock should be in the function that calls somefunc.
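
Roughly what I have in mind, as a sketch: somefunc stays lock-free because it only reads its arguments and writes locals, and the caller takes a lock only around the data that really is shared. SharedResults, Guard, worker and runCallers are hypothetical names for this example, and the shared vector is my assumption about what the calling code might share.

```cpp
#include <pthread.h>
#include <cstddef>
#include <string>
#include <vector>

// Stateless, as in the question: reads only its arguments, writes only locals.
class SomeClass {
public:
    std::string somefunc(const std::string &strOne, const std::string &strTwo) const {
        return strOne.substr(0, 5) + strTwo.substr(0, 5);
    }
};

// Hypothetical shared state: this is what actually needs the lock.
struct SharedResults {
    pthread_mutex_t lock;
    std::vector<std::string> values;

    SharedResults() { pthread_mutex_init(&lock, NULL); }
    ~SharedResults() { pthread_mutex_destroy(&lock); }
};

// Minimal scope guard so the mutex is released even if push_back throws.
struct Guard {
    pthread_mutex_t &m;
    explicit Guard(pthread_mutex_t &mutex) : m(mutex) { pthread_mutex_lock(&m); }
    ~Guard() { pthread_mutex_unlock(&m); }
};

void *worker(void *arg) {
    SharedResults *shared = static_cast<SharedResults *>(arg);
    SomeClass obj;

    // No lock needed: nothing shared is touched by this call.
    std::string result = obj.somefunc("long string", "long string");

    {
        Guard guard(shared->lock);  // lock held only while touching shared data
        shared->values.push_back(result);
    }
    return NULL;
}

// Starts up to `count` threads and returns how many results were stored.
std::size_t runCallers(std::size_t count) {
    SharedResults shared;
    std::vector<pthread_t> threads(count);

    std::size_t started = 0;
    for (; started < count; ++started)
        if (pthread_create(&threads[started], NULL, worker, &shared) != 0)
            break;

    for (std::size_t i = 0; i < started; ++i)
        pthread_join(threads[i], NULL);

    return shared.values.size();  // safe: all workers have been joined
}
```

The lock is held only for the push_back, so the threads run the string work in parallel and serialize only on the shared container.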

Thank you, both of you for taking the time to reply.
ChrisCphDK