I'm trying to build a Tetris AI algorithm that can scale over multiple cores.
In my tests it turns out that using multiple threads is slower than using a single thread.
After some research I found that my threads spend most of their time waiting for _Lockit _Lock(_LOCK_DEBUG). Here's a screenshot.
As you can see, the lock is taken on a local variable, which shouldn't require any locking anyway!
My questions are:
- Why does STL lock this vector?
- How can I make my program faster? (Use arrays?)
Update
I eliminated the lock by setting these command line options in my Visual Studio projects:
/D "_HAS_ITERATOR_DEBUGGING=0" /D "_SECURE_SCL=0"
It's important to apply this to all projects in the solution; otherwise errors will occur at runtime (conflicting iterators, etc.).
The second change was replacing std::vector<bool> with std::vector<char>. I wasn't aware that std::vector<bool> was so slow.
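For anyone making the same change, here is a minimal sketch of what a grid backed by std::vector<char> could look like. The names Grid, at and countFilled are hypothetical and not taken from my actual code. The idea is that std::vector<bool> is a bit-packed specialization whose element access goes through a proxy object, while std::vector<char> stores one plain byte per cell and hands back ordinary references.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Tetris grid; the class and member names are illustrative only.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // Returns a real char&, which std::vector<bool> cannot provide
    // (it returns a proxy object instead).
    char& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return cells_[row * cols_ + col];
    }

    char at(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return cells_[row * cols_ + col];
    }

    // Counts the occupied cells by scanning the flat byte array.
    std::size_t countFilled() const {
        std::size_t filled = 0;
        for (std::size_t i = 0; i < cells_.size(); ++i) {
            if (cells_[i] != 0) {
                ++filled;
            }
        }
        return filled;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<char> cells_;  // one byte per cell, 0 = empty, non-zero = filled
};
```

If each worker thread owns its own Grid instance, nothing is shared between threads, so the grid itself needs no locking. The trade-off is memory: one byte per cell instead of one bit, which is negligible for a Tetris-sized board.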