Since distcc cannot keep state between jobs, all it can do is ship each job's source and headers and have the servers preprocess and compile using only the data just sent, so I think the latest distcc has a scalability problem.
In my local build environment, which has approximately 10,000 C/C++ files to build, I only got a 2x speedup with 20 build servers compared to not using distcc at all (just make -j).
What do you think is the problem?
If anyone has achieved a speedup of more than 10-20x using make -j and distcc, please let me know.
The following product claims that make -j with distcc cannot be scaled beyond a 5x speedup: http://www.electric-cloud.com/products/electricaccelerator.php
I think this can be improved by:
- Letting the distccd server maintain sessions
- Having each session cache its own header directories
- Doing preprocessing on demand on the distccd server
- Fetching header files over the network through an LD_PRELOADed library, libdistcc.so, which intercepts the compiler's stat/open calls (see the sketch after this list) ...
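Roughly what I have in mind for the LD_PRELOAD part, as a minimal sketch. fetch_header_from_client() is hypothetical and stubbed out; the session protocol, the network transport, and the matching hooks for stat()/openat()/open64() are all left out:

```c
/* Sketch of the LD_PRELOAD interception. Build as a shared object and
 * preload it into the compiler processes started by distccd:
 *
 *   gcc -shared -fPIC -o libdistcc.so libdistcc.c -ldl
 *   LD_PRELOAD=./libdistcc.so gcc -c foo.c
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: ask the client for `path` over the session's
 * connection and store it in the local header cache. Stubbed here;
 * the network transport is the part that would have to be written. */
static int fetch_header_from_client(const char *path)
{
    (void)path;
    return -1; /* not implemented in this sketch */
}

static int is_header(const char *path)
{
    const char *dot = strrchr(path, '.');
    return dot != NULL && (strcmp(dot, ".h") == 0 || strcmp(dot, ".hpp") == 0);
}

/* Interpose open(); a real version would also have to cover stat(),
 * fstat(), openat() and open64(), which compilers use as well. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    mode_t mode = 0;

    if (real_open == NULL)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }

    /* If the compiler wants a header that is not in the session cache
     * yet, pull it from the client, then fall through to the real open(). */
    if (is_header(path) && access(path, F_OK) != 0)
        fetch_header_from_client(path);

    return real_open(path, flags, mode);
}
```

The point is that the server never needs the whole header tree up front; it faults headers in lazily the first time the compiler touches them, and the session cache makes subsequent jobs free.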
Has anyone done this kind of thing?