The case:
- A scheduled task running on Windows.
- The program's job is to extract some information from text documents. It uses a lot of regular expressions to do this, so the program is CPU bound.
- The service takes every document it finds in a folder and converts them one after another.
Normally, everything is OK, the service finds a few documents every now and then and converts them.
But on some occasions, due to a misconfiguration on the system that delivers the documents, a few thousand documents have been queued for conversion at the same time. When this happens, the service process stays at 100% CPU for a long time, causing the server to become unresponsive to other types of requests.
On Linux, I could probably use nice or cpulimit to limit the CPU usage, but all customers run this on Windows.
There are obvious ways I can work around this. For instance, I could let the service sleep a bit between every file it handles (but this feels a bit clunky), or I could let the extraction process sleep a bit between every regexp match (which feels even clunkier).
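To make the first workaround concrete, here is a minimal sketch of what I mean by sleeping between files. It is written in Python purely for illustration, and the names `process_throttled`, `convert` and `max_cpu_fraction` are hypothetical, not taken from my actual program. After each file it pauses in proportion to how long the conversion took, so the process stays busy for roughly the given fraction of wall-clock time on one core.

```python
import time
from pathlib import Path
from typing import Callable, Iterable


def process_throttled(
    paths: Iterable[Path],
    convert: Callable[[Path], None],        # hypothetical per-file conversion function
    max_cpu_fraction: float = 0.5,          # assumed target: busy about 50% of the time
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Convert files one after another, sleeping after each one.

    Returns the total number of seconds spent sleeping.
    """
    if not 0.0 < max_cpu_fraction <= 1.0:
        raise ValueError("max_cpu_fraction must be in (0, 1]")

    total_slept = 0.0
    for path in paths:
        started = time.perf_counter()
        convert(path)
        busy = time.perf_counter() - started

        # If the work took `busy` seconds, idle long enough that
        # busy / (busy + pause) equals max_cpu_fraction.
        pause = busy * (1.0 - max_cpu_fraction) / max_cpu_fraction
        if pause > 0:
            sleep(pause)
            total_slept += pause
    return total_slept
```

Something like `process_throttled(sorted(Path("inbox").glob("*.txt")), convert)` would then handle a backlog at about half speed, with `inbox` being a made-up folder name. This is exactly the "slower by design" behaviour I would like to avoid, since it also slows down the normal case where only a few documents arrive.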
Questions:
- Why does the application developer (me) even need to worry about this? I thought that assigning resources to processes was the operating system's job? I mean, seriously: Do I really need to make my program slower by design (by introducing some sleep() calls)?
- Is there a way I can configure Windows to give my scheduled task some CPU limit?
- Is this a general problem with CPU bound programs? (or am I doing something fundamentally wrong)
Edit:
- Some answers and comments suggest that I should check my assumption that my process is CPU bound, and that I/O load might be the problem. I want to check the I/O load. Is there something simple I can look for in the Task Manager? Would quickly increasing I/O Reads/Writes values be an indication?