What is the "critical section" of a thread (in Python)?

A thread enters the critical section by calling the acquire() method, which can either be blocking or non-blocking. A thread exits the critical section, by calling the release() method.

- Understanding Threading in Python, Linux Gazette

Also, what is the purpose of a lock?

+7  A: 

A critical section of code is one that can only be executed by one thread at a time. Take a chat server for instance. If you have a thread for each connection (i.e., each end user), one "critical section" is the spooling code (sending an incoming message to all the clients). If more than one thread tries to spool a message at once, you'll get BfrIToS mANtwD PIoEmesCEsaSges intertwined, which is obviously no good at all.

A lock is something that can be used to synchronize access to a critical section (or resources in general). In our chat server example, the lock is like a locked room with a typewriter in it. If one thread is in there (to type a message out), no other thread can get into the room. Once the first thread is done, it unlocks the room and leaves. Then another thread can go into the room (locking it behind itself). "Acquiring" the lock just means "I get the room."
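Here is a minimal sketch of the typewriter room in Python. The names spool_lock, clients, and spool are hypothetical, and in-memory buffers stand in for real client sockets, so treat it as an illustration rather than a real chat server:

```python
import io
import threading

spool_lock = threading.Lock()                 # the "room" with the typewriter
clients = [io.StringIO(), io.StringIO()]      # hypothetical stand-ins for client sockets
messages = ["hello from alice", "hello from bob", "hello from carol"]

def spool(message):
    # Only one thread at a time may be inside this block (the critical section).
    with spool_lock:
        for client in clients:
            # Writing one character at a time mimics a slow send that
            # could otherwise interleave with another thread's message.
            for ch in message:
                client.write(ch)
            client.write("\n")

threads = [threading.Thread(target=spool, args=(m,)) for m in messages]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each buffer ends up holding three whole lines. The order of the lines can vary from run to run, but no line is a mix of two messages.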

zenazn
-1 for spreading a very wrong and bad design choice: the ugly terrible *one thread per connection* approach that is common but wrong.
nosklo
Try telling that to the Erlang guys. It may be wrong in many programming languages, but since it's so common (and provides such a useful example), I decided to go with it. In a question about connection pooling, I would have said something else :)
zenazn
A: 

A "critical section" is a chunk of code in which, for correctness, it is necessary to ensure that only one thread of control can be in that section at a time. In general, you need a critical section around any code that writes values into memory that can be shared among more than one concurrent thread or process.

Charlie Martin
A newbie could be confused by your advice. It sounds like you are saying it's ok to read from memory that is shared by multiple threads without locking, which of course it isn't (unless you are sure your writes are atomic).
mhenry1384
I'm not sure we're disagreeing. The reads are fine *as long as* the writes are atomic. Everyone can read a const with no need for a critical section.
Charlie Martin
+3  A: 

Other people have given very nice definitions. Here's the classic example:

import threading
account_balance = 0 # The "resource" that zenazn mentions.
account_balance_lock = threading.Lock()

def change_account_balance(delta):
    global account_balance
    with account_balance_lock:
        # Critical section is within this block.
        account_balance += delta

Let's say that the += operator consists of three subcomponents:

  • Read the current value
  • Add the RHS to that value
  • Write the accumulated value back to the LHS (technically bind it in Python terms)

If you don't have the with account_balance_lock statement and you execute two change_account_balance calls in parallel you can end up interleaving the three subcomponent operations in a hazardous manner. Let's say you simultaneously call change_account_balance(100) (AKA pos) and change_account_balance(-100) (AKA neg). This could happen:

pos = threading.Thread(target=change_account_balance, args=[100])
neg = threading.Thread(target=change_account_balance, args=[-100])
pos.start(), neg.start()
  • pos: read current value -> 0
  • neg: read current value -> 0
  • pos: add delta to the value it read -> 100
  • neg: add delta to the value it read -> -100
  • pos: write its result -> account_balance = 100
  • neg: write its result -> account_balance = -100

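Because you didn't force the operations to happen in discrete chunks, you can have three possible outcomes (-100, 0, 100) depending on how the steps interleave. Only 0 is correct; in the interleaving shown here, the update made by pos is lost and the balance ends up at -100.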
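A real race is timing-dependent and hard to reproduce on demand, so here is a single-threaded replay of that exact interleaving. The variable names (pos_read, neg_read, and so on) are made up for illustration; each assignment stands for one of the three subcomponents of +=:

```python
account_balance = 0

# Both "threads" read before either one writes.
pos_read = account_balance        # pos: read current value -> 0
neg_read = account_balance        # neg: read current value -> 0

# Each adds its own delta to the (now stale) value it read.
pos_result = pos_read + 100       # pos: -> 100
neg_result = neg_read + (-100)    # neg: -> -100

# The writes land one after the other; the last one wins.
account_balance = pos_result      # account_balance = 100
account_balance = neg_result      # account_balance = -100

print(account_balance)  # -100: the +100 update was lost
```

Swapping the order of the last two assignments gives 100 instead, which is the other wrong outcome.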

The with [lock] statement makes the block it guards behave as a single, indivisible operation with respect to every other thread that uses the same lock. It says, "Let me be the only thread executing this block of code. If something else is executing, it's cool -- I'll wait." This ensures that the updates to the account_balance are "thread-safe" (parallelism-safe).
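For a threading.Lock, the with block is roughly shorthand for an explicit acquire/release pair wrapped in try/finally. The function name change_account_balance_explicit below is hypothetical, used only to set the long form next to the original:

```python
import threading

account_balance = 0
account_balance_lock = threading.Lock()

def change_account_balance_explicit(delta):
    global account_balance
    account_balance_lock.acquire()          # blocks until the lock is free
    try:
        # Critical section.
        account_balance += delta
    finally:
        # Always release, even if the critical section raises.
        account_balance_lock.release()

change_account_balance_explicit(100)
change_account_balance_explicit(-100)
```

The with form is usually preferable because it is impossible to forget the release.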

Note: There is a caveat to this scheme: you have to remember to acquire the account_balance_lock (via with) every time you want to manipulate the account_balance for the code to remain thread-safe. There are ways to make this less fragile, but that's the answer to a whole other question.
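One possible way to make it less fragile (a sketch only, not the one true answer) is to hide both the balance and its lock inside an object, so callers cannot touch the balance without going through the lock. The Account class and its method names are hypothetical:

```python
import threading

class Account:
    """Hypothetical wrapper that keeps the balance and its lock together."""

    def __init__(self, balance=0):
        self._balance = balance
        self._lock = threading.Lock()

    def change(self, delta):
        # Every modification goes through the lock.
        with self._lock:
            self._balance += delta

    @property
    def balance(self):
        # Reads take the lock too, so they never overlap a write.
        with self._lock:
            return self._balance

account = Account()
deltas = [100, -100] * 50
threads = [threading.Thread(target=account.change, args=(d,)) for d in deltas]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the deltas cancel out and every update is serialized, account.balance is 0 once all threads have been joined.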

Edit: In retrospect, it's probably important to mention that the with statement implicitly calls a blocking acquire on the lock -- this is the "I'll wait" part of the above thread dialog. In contrast, a non-blocking acquire says, "If I can't acquire the lock right away, let me know," and then relies on you to check whether you got the lock or not.

import logging # This module is thread safe.
import threading

LOCK = threading.Lock()

def run():
    if LOCK.acquire(False): # Non-blocking -- return whether we got it
        logging.info('Got the lock!')
        LOCK.release()
    else:
        logging.info("Couldn't get the lock. Maybe next time")

logging.basicConfig(level=logging.INFO)
threads = [threading.Thread(target=run) for _ in range(100)]
for thread in threads:
    thread.start()

I also want to add that the lock's primary purpose is to guarantee the atomicity of acquisition (the indivisibility of the acquire across threads), which a simple boolean flag will not guarantee. The semantics of atomic operations are probably also the content of another question.
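To make the atomicity point concrete, here is a small sketch in which many threads try a non-blocking acquire on the same lock and nobody releases it during the contest. The names contend, results, and winners are hypothetical, and the sketch assumes list.append is safe to call from several threads, as it is in CPython:

```python
import threading

lock = threading.Lock()
results = []

def contend():
    # Non-blocking attempt: True if this thread got the lock, else False.
    results.append(lock.acquire(blocking=False))

threads = [threading.Thread(target=contend) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = results.count(True)
lock.release()  # a threading.Lock may be released from any thread
```

Exactly one of the 50 attempts succeeds. With a plain boolean flag ("if not busy: busy = True"), two threads could both see the flag as False before either sets it, and both would believe they had won.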

cdleary