views:

205

answers:

4

While compiling this morning I had a thought.

Given a dedicated Linux machine (running Fedora, for example), users remotely log in and compile their C++ software with gcc. The source is stored on their own machines (on a small LAN) and exposed to the Linux box via symbolic links.

Assume that each user is compiling exactly the same code for now... One user can compile and link his code in 10 minutes.

Will it take 2 users 20 minutes in total if they compile at the same time? What about 3, or 10 users?

Is there an overhead involved that gives diminishing returns as users increase?

As a bonus question - What tips do you have for increasing compiling efficiency in this setup?

A: 

Depending on the size of the projects' source, one saving might be to copy all files to the build machine before compiling. If the compiler has to pull files over the network as it needs them, this introduces overhead, since network access is a lot slower than local disk access.

If you wrote a script or used a tool that copies only modified files to the build machine, the overhead would be reduced significantly. In that case, the build machine keeps a local mirror of the source files: each time you compile, it updates any modified files and then compiles. Obviously, with lots of users and/or large project files, you can run into storage space issues.
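
Something like this rsync-based sketch could do that (host and path names are placeholders, and it assumes a Makefile-based build on the build box):

    # mirror only new/changed files to the build box, then build there
    rsync -az --delete ~/myproject/ buildbox:builds/myproject/
    ssh buildbox 'cd builds/myproject && make -j4'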

xan
A: 

There is always an overhead involved due to:

  • scheduling needs
  • conflicting I/O operations

The latter will be the most important one for you, as network access is much slower than, for example, disk access. Pre-caching (first fetch all files locally, then start compilation) may help here. Builds that have already started will then not be hindered by new concurrent users.

ypnos
So a build that was pre-cached and then started will not be affected by a new build connecting, pre-caching, and then starting concurrently?
Krakkos
It will not be severely affected by the new build's pre-caching. It will, however, be affected by the new build's CPU usage. The idea is that CPU usage doesn't drop because of I/O wait.
ypnos
+2  A: 

I suggest distcc.
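
A minimal invocation could look like this (host names are placeholders, and it assumes a make-based build that respects CXX):

    # distribute compile jobs across the listed machines
    export DISTCC_HOSTS='localhost box1 box2'
    make -j6 CXX='distcc g++'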

Joachim Sauer
ccache may also be useful.
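For example (assuming a make-based build):

    make CXX='ccache g++'          # cache compilations so unchanged files aren't recompiled
    export CCACHE_PREFIX=distcc    # optionally chain ccache in front of distcc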
Lars Wirzenius
I'm using distcc3 for compilation. Until my new dev workstation arrives, I'm working out of a full-screen VM on my Windows box. To speed things up I installed distcc on the two P4 Dell Optiplexes under my desk that otherwise sit idle most of the time, and I'm very happy with the results.
MrEvil
A: 

Compilation is mostly CPU-limited, so assuming you have enough RAM, you can expect total compilation time to be roughly (time per task) * (number of tasks) / (number of CPUs/cores in the system). (Curiously, I ran 'make -j' on a 3-core system on a project of mine and got more than a 3x speed-up, so there may be some kind of blocking issue that prevents sequential make from running at full speed.)
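
As a rough worked example under that formula (assuming purely CPU-bound builds and, say, a 4-core box): if one user's build needs 10 minutes of CPU time, then 10 users building at once represent about 100 core-minutes of work, so roughly 100 / 4 = 25 minutes of wall-clock time before everyone's build finishes, rather than 10 * 10 = 100 minutes run one after another.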

Why don't the users compile their programs on their own computers?

TrayMan
We write multiplatform code, so we build locally on Windows and only compile on the target Linux machine when we each have a build ready to deploy/test.
Krakkos