views: 173
answers: 5

I'm looking for real figures and experiences; please don't take this too subjectively:

While looking for something else, I happened on an interesting statement, which partially reads as follows:

[...] the national average is 9,000 lines of code per year per person. [...]

I write a lot of code, but not full-time. When I look back at my projects of the past year and do a (very) rough count (counting only code lines, no comments or blank lines), I come to about 19,000 lines a year that make it into a project. If I could automate parts of that, I could estimate the savings in time and money.
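For what it's worth, a rough count like the one described can be scripted. This is a minimal sketch (the `count_loc` name and the handling of only single-line `//` comments are my own simplifications; block comments are not handled), counting non-blank, non-comment lines in C# files:

```python
import os

def count_loc(path, exts=(".cs",)):
    """Count non-blank, non-comment lines under path; a deliberately rough metric."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            if not name.endswith(exts):
                continue
            with open(os.path.join(root, name), encoding="utf-8", errors="ignore") as f:
                for line in f:
                    stripped = line.strip()
                    # skip blank lines and single-line comments (block comments ignored here)
                    if stripped and not stripped.startswith("//"):
                        total += 1
    return total
```

Counts from a script like this are only comparable with themselves; a different comment convention or formatting style shifts the numbers.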

For estimating time savings on larger projects, I need averages. How many lines of code does a person write in a year, on average, in C# (or another language of choice)? And, looking at your own situation, would you say your hand-written code could be (partially) automated, and at what gain?

+5  A: 

18000 would average out to about 36 lines of code a day.

With just 36 lines of code a day, what's the problem? The problem is debugging and rewriting your code.

NOTHING you can do to automate coding will speed you up--in fact, anything you can automate probably shouldn't be coded because if you are automating the typing of some pattern in your code, it should be factored out.

Where you can save time is to be more careful about how you code. Get your project through QA a little faster--code in a more explicit, typesafe language and code more clearly.

Also, making your code data-driven and fully factored wherever possible will reduce the LOC you ship, but it will make everyone's life easier and the project ship faster.

Do not EVER automate code input--if you can, you're doing it wrong!

Another way to think about it: every line of code you create has to be debugged and maintained. Why come up with ways to give everyone MORE work when you could just write fully factored code? (The input of fully factored code cannot be automated, pretty much by definition.)

Bill K
(I only spend a few days a month coding, so the daily average is different.) I think you have very interesting points against automation. Using data-mapping auto-generation software saves me writing POCOs and the generic classes (S#arp Architecture). Implementing INotifyPropertyChanged for each property could easily be automated. Andrew Hunt in The Pragmatic Programmer says "automate where you can". Good automated code generation should save time, but should be tested in its own right.
Abel
Btw, I apparently was a bit sleepy yesterday, or you: 19000 / 36 = 527 days? ;-)
Abel
Yeah, I think the math seems really off. When I divide the LOC in a shipping product by the man-hours spent on it I tend to get something in the area of 1-4 lines per day. At any rate, removing boilerplate instead of automating it should be the primary goal--it's the difference between harming your project and helping it. I've almost always been able to find a way to get rid of it--for instance I really dislike access objects and will replace them with data structures or something similar. I just continually look for practices that allow me to write factored code.
Bill K
+3  A: 

This is the type of metric discussed in The Mythical Man-Month. Estimating projects in man-days/months/years, or counting lines of code as a productivity metric, guarantees inaccuracy in reporting.

Mike Miller
Ah, my favorite book! Yes, I still love MMM. Any report, not just this one, has a factor of inaccuracy. Knowing the factor helps in interpreting the report well.
Abel
+3  A: 

First, lines of code written don't correlate well with actual productivity. At least in my opinion, if you want to measure and/or estimate productivity, function points are a more effective measurement. Second, when a metric varies over a wide range, the average generally means very little. In a case like this, a geometric mean generally means more than an arithmetic mean, but without (at least) something about the variance/standard deviation, it still doesn't mean much.
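The point about means can be made concrete. With figures spanning a wide range (the numbers below are invented purely for illustration), the arithmetic mean is dominated by the outlier while the geometric mean stays near the bulk of the data:

```python
import math

# hypothetical LOC/day figures from five developers, spanning a wide range
rates = [2, 5, 10, 40, 400]

arithmetic = sum(rates) / len(rates)  # pulled far up by the single outlier
geometric = math.exp(sum(math.log(r) for r in rates) / len(rates))

print(f"arithmetic mean: {arithmetic:.1f}")  # 91.4
print(f"geometric mean:  {geometric:.1f}")   # 17.4
```

Quoting either mean without the spread (variance or standard deviation) would still, as the answer says, mean very little.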

I'd also note that there are some fairly sophisticated models that have undergone substantial research and have even been measured against real projects, to get at least some idea that their results correlate with reality. For example, the COCOMO II model will generally produce much better results than just using lines of code per unit of time. There's at least one free online implementation (Edit: looking at it, this now allows either LoC- or function-point-based modeling). There are also tools, such as SoftStar and Function Point Modeler, that combine a COCOMO-like model with function points to get what appear (at least to me) to be fairly solid results.

Jerry Coffin
Very interesting and well worded. Thanks. I especially like the pointer to function points, which I needed when PM'ing (part of) a huge project for the Dutch government, where everything was measured in FPs (and it kinda worked: if you added a factor of 2.4 to it, but that's a long story... ;-)
Abel
@Abel: A factor of 2.4 is actually quite a bit better than most methods can even hope for. If you try to base things on lines of code, you'll generally be lucky to get much closer than a factor of 5, and being off by a factor of 10 isn't all that rare.
Jerry Coffin
Accepting your answer, as it's the only one that really goes into the subject of using lines of code as a measurement for project planning or improvement-gain evaluation. I understand that the practice is not necessarily good and can even be harmful. In situations where one has to use it, or where better methods are still to be invented, it's a necessary evil, and knowing its shortcomings is a large benefit.
Abel
A: 

That's a rather BS question, actually. Even SLOC devotees will admit to you that SLOC productivity estimates are only valid within similar environments. Not only does it vary by programming language, but by industry, development environment, application, etc.

In as much as SLOC numbers are worth anything, it is only within the same development team working on similar projects.

T.E.D.
I wouldn't agree with calling a question BS the minute the answer is "don't do it, because...". Good advice is often welcome. The first time I fixed my tire, my dad got many BS questions from me. Gladly, he answered them and I was able to learn. I didn't know about SLOC; I learned something.
Abel
What's worse, productivity is not necessarily directly related to LOC. Everyone has seen a fifty line function that someone wrote because they couldn't see the plain five line solution.
Wayne Conrad
+1  A: 

I believe the LOC rate depends heavily on the technical debt in the project.

I have a project (SQL) which is 27KLOC (plus 4K more for support). Working on this code, over 7 months, I added 3K net new LOC to the project, with about 14KLOC written just for throwaway testing (testing to isolate anomalies, not unit tests).

Depending on how you measure, I write 29KLOC/year ((3K+14K)/7months*12months) but produce only 5KLOC/year (3K/7months*12months).

Looking at code (27KLOC) as debt, we have code that generates 7% (2KLOC) in throwaway code monthly, or 88% (24KLOC) per annum.

Assuming I can continue to turn out a whole 29KLOC/year, and assuming the cost of maintaining code stays at 88%/annum, my personal project limit is 33K lines of code, beyond which I will spend all my time paying interest on my technical debt, writing throwaway code, and producing net zero LOC.

Lucky that the last 3KLOC was a refactoring, which should reduce my interest rate.
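The break-even point in that reasoning can be written out explicitly (figures taken from the answer above; the variable names are mine):

```python
# figures from the answer above
output_kloc_per_year = 29.0  # total lines written per year (net new + throwaway)
maintenance_rate = 0.88      # throwaway/maintenance cost as a fraction of codebase size, per year

# at the limit, all yearly output goes to servicing the existing code:
#   maintenance_rate * limit == output_kloc_per_year
limit_kloc = output_kloc_per_year / maintenance_rate
print(f"project limit: {limit_kloc:.0f} KLOC")  # ~33 KLOC
```

The model is crude (it assumes both rates stay constant as the codebase grows), but it shows where the 33K figure comes from.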

Kyle Lahnakoski
+1 interesting analysis, adds something to the discussion and my thought on the subject, thanks!
Abel