Let's say we have a system like this:
                                                                    ______
                            { application instances ---network--- (______)
                            { application instances ---network--- |      |
requests ---> load balancer { application instances ---network--- | data |
                            { application instances ---network--- | base |
                            { application instances ---network--- \______/
A request comes in, the load balancer sends it to an application server instance, and that instance talks to a database (elsewhere on the LAN). The application instances can be either separate processes or separate threads. Just to cover all the bases, let's say there are several identical processes, each with a pool of identical application service threads.
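In code, I picture each worker looking roughly like this -- a sketch only, where query_database, the pool size, and the queue hand-off are placeholders for whatever the real stack does:

    import concurrent.futures
    import multiprocessing
    import time


    def query_database(request):
        # Stand-in for the real database call: in the real system this blocks
        # on a socket waiting for the reply from the DB host on the LAN.
        time.sleep(0.05)
        return "result for %r" % (request,)


    def handle_request(request):
        # What one application service thread does per request: a little
        # parsing/formatting, and a blocking wait on the database.
        return query_database(request)


    def worker_process(request_queue):
        # One of the several identical processes, each owning a pool of
        # identical application service threads.
        with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
            while True:
                request = request_queue.get()   # handed over by the load balancer
                pool.submit(handle_request, request)


    if __name__ == "__main__":
        requests = multiprocessing.Queue()
        for _ in range(4):   # several identical processes
            multiprocessing.Process(target=worker_process,
                                    args=(requests,), daemon=True).start()
        for i in range(100):
            requests.put(i)
        time.sleep(2.0)      # let the workers drain the queue, then exit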
If the database is performing slowly, or the network gets bogged down, clearly the throughput of request servicing is going to get worse.
Now, in all my pre-Python experience, this would be accompanied by a corresponding drop in CPU usage by the application instances -- they'd be spending more time blocking on I/O and less time doing CPU-intensive things.
However, I'm being told that with Python this is not the case -- that under certain circumstances, this situation can cause Python's CPU usage to go up, perhaps all the way to 100%. Something about the Global Interpreter Lock and the multiple threads supposedly causes Python to spend all its time switching between threads, checking to see whether any of them have an answer yet from the database. "Hence the rise in single-process event-driven libraries of late."
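For what it's worth, this is roughly the experiment I imagine would check the claim, with time.sleep() standing in for the database round trip (the latencies and thread count are arbitrary):

    import threading
    import time


    def fake_db_call(latency):
        # Stand-in for one slow database round trip; in the real app this
        # would be a blocking read on a socket to the DB host.
        time.sleep(latency)


    def measure(num_threads, latency):
        # Compare the CPU time the process burns against the wall-clock time
        # that passes while every service thread is waiting on the "database".
        cpu_start = time.process_time()
        wall_start = time.monotonic()

        threads = [threading.Thread(target=fake_db_call, args=(latency,))
                   for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        cpu_used = time.process_time() - cpu_start
        wall_used = time.monotonic() - wall_start
        print("latency=%.1fs threads=%d wall=%.2fs cpu=%.3fs"
              % (latency, num_threads, wall_used, cpu_used))


    if __name__ == "__main__":
        # If the claim is right, cpu should climb as latency grows; if not,
        # it should stay near zero no matter how slow the "database" gets.
        for latency in (0.1, 0.5, 2.0):
            measure(num_threads=50, latency=latency)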
Is that correct? Do Python application service threads actually use more CPU when their I/O latency increases?