The wrong way to count lines of code

Started by
43 comments, last by Luckless 8 years, 5 months ago
Complexity of a project should be measured by the number of tears I shed upon finding duplicate JSON serialization classes that people have added to the project.
Does anyone actually (non-ironically) use lines-of-code as a metric for comparing projects?

The primary use of lines-of-code metrics is within a single project.

When a 10,000-line code review comes across my desk, in a project with fewer than 50,000 lines of code, I know it means trouble. If one engineer produced 5,000 lines of code last month and another produced only 500, with both adhering to the same coding guidelines, then I know that responsibility is unevenly distributed in the project.

This sort of thing is important to be aware of, not just for the pointy-haired, but also for the engineering leads.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Actually in hindsight, surprised nobody mentioned this yet (myself included):

http://www.osnews.com/story/19266/WTFs_m

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

If one engineer produced 5,000 lines of code last month and another produced only 500, while both adhering to the same coding guidelines, then I know that responsibility is unevenly distributed in the project.


If the guy who wrote 5000 lines was implementing a DLC system with an extremely well-written spec, and the guy writing 500 lines was integrating a third party library while having to deal with a poorly defined spec and was helping a different engineer with questions at the same time, they both might be handling their responsibilities perfectly.

Lines of code per unit time is completely meaningless.

If the guy who wrote 5000 lines was implementing a DLC system with an extremely well-written spec, and the guy writing 500 lines was integrating a third party library while having to deal with a poorly defined spec and was helping a different engineer with questions at the same time, they both might be handling their responsibilities perfectly.

I didn't make any value judgement about their relative performance. If one engineer is being given tasks that are well defined, while the other is slogging through a wasteland, then their responsibilities are unevenly distributed.

This is a management problem, not the engineering witch hunt people so readily assume. And without the right data, you can't fix management problems.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

If you're using LOC as a metric, why shouldn't you include comments?

It's documentation that someone took considerable time to write, and it needs to be factored into the project's value. In fact, I would weight comments as more valuable than the code itself for this reason: a 1,000-line commented program is more useful to me than a 100,000-line program with no comments.
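Whether comments count changes the number a lot, so it helps to report them separately. Below is a minimal sketch of such a counter, assuming C-style `//` and `/* */` comments only; the function name `count_lines` and the `SAMPLE` snippet are made up for illustration, and real tools handle many more cases (comment markers inside string literals, for example).

```python
def count_lines(source: str) -> dict:
    """Classify each line as blank, comment, or code (C-style comments only)."""
    counts = {"blank": 0, "comment": 0, "code": 0}
    in_block = False  # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            # Everything inside a block comment counts as comment, even blank lines.
            counts["comment"] += 1
            if "*/" in line:
                in_block = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith("//"):
            counts["comment"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            # Skip the opening "/*" so that "/*/" is not mistaken for a closed comment.
            in_block = "*/" not in line[2:]
        else:
            # Simplification: a code line with a trailing comment counts as code only.
            counts["code"] += 1
    return counts


# Hypothetical sample input, not taken from any real project.
SAMPLE = """\
// Adds two numbers.
int add(int a, int b) {
    /* nothing clever
       happening here */
    return a + b;  // trailing comment still counts as code

}
"""

if __name__ == "__main__":
    print(count_lines(SAMPLE))
```

For the sample above this reports 1 blank line, 3 comment lines, and 3 code lines, so the "LOC" figure is 7, 6, or 3 depending on which convention you pick.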

If the guy who wrote 5000 lines was implementing a DLC system with an extremely well-written spec, and the guy writing 500 lines was integrating a third party library while having to deal with a poorly defined spec and was helping a different engineer with questions at the same time, they both might be handling their responsibilities perfectly.

I didn't make any value judgement about their relative performance. If one engineer is being given tasks that are well defined, while the other is slogging through a wasteland, then their responsibilities are unevenly distributed.

This is a management problem, not the engineering witch hunt people so readily assume. And without the right data, you can't fix management problems.


Both tasks are real, and I worked on both (so the engineer's skill is out of the equation, like you're saying). You're generally right that it's a management problem. However, I'm arguing that lines of code produced is utterly irrelevant and cannot be used by management.

I also think you need to clarify what you mean by "responsibilities are unevenly distributed". I'm responsible for estimating the cost of tasks, implementing them, discussing specs, helping other engineers with questions, fixing bugs, fixing source control screw-ups that non-coders occasionally make, discussing risks with my leads, and performing code reviews. But so is everyone else. I'm not responsible for anything that's accurately measurable by lines of code. Many implementations are better when they're shorter and simpler (especially in a game).

If you're talking about uneven task load balancing, then yeah, it's always uneven. Nobody can accurately estimate a cost for a system they've never implemented before, and the majority of the time we're implementing new things. Everything that has been done before is handled by the engine, or by integrating a shared library another team wrote - those are easy to cost, and easy to deal with. It's the new things that are SNAFUs.

From a management perspective, you might have a dozen engineers. Some of them are better than the others - just a fact of life. When you get a risky task, you find out who you can trust with that task. But sometimes all of your high-caliber engineers are already busy for the foreseeable future, and you've got another scary task that MUST be implemented - you can't cut this one, and you can't cut anything else either. You have to give it to someone. You can't hire a new high-caliber engineer, you can't get one from another team, so you're forced to either wait until a high-caliber engineer frees up, or give it to someone who's not-as-high-caliber. This is one form of "biting off more than you can chew" as a team, but honestly, when does a team ever NOT bite off more than they can chew?

None of this - NONE of it - is reflected in lines of code. It's reflected in the difference between the estimated cost in time and the actual cost in time. It's reflected in the stressed-out faces the team makes in all-hands meetings. It's reflected in engineers getting frustrated that a few of their peers seem to be a net detriment to the team (because they're slacking, continually making mistakes, over-engineering ostensibly simple code, attempting to optimize things that don't need optimizing, or just don't know how to do their job). It's reflected in people quitting.

You can't even use lines of code to compare an identical task implemented by two different people (as if you'd ever want to do that... but still). One engineer may implement the system incredibly quickly and produce 2000 LOC that performs like a champ but is an unmaintainable mess. Another engineer may take twice as long and produce 1500 LOC that is extremely well documented and maintainable but doesn't perform as well as it needs to. Judging how code behaves, whether it's maintainable, and how well it performs its task requires a VERY thorough code review, not counting lines of code.

If one engineer produced 5,000 lines of code last month and another produced only 500, while both adhering to the same coding guidelines, then I know that responsibility is unevenly distributed in the project.

Or you know that the 5k LOC guy is a junior programmer churning out crap, and the other guy is doing his job properly.

Some programs do measure statements (uh, "semicolons") rather than LOC, but neither is really indicative of code quality.
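To see how the two counts differ, here is a rough sketch that applies both to the same three statements formatted two ways. The helper names and the two snippets are invented for illustration, and the semicolon count is deliberately naive (it would also count semicolons inside strings or `for(;;)` headers), which is an assumption about how simple such a tool might be, not a description of any particular product.

```python
def physical_lines(source: str) -> int:
    """Count non-blank physical lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def semicolons(source: str) -> int:
    """Naive statement count: every ';' counts, wherever it appears."""
    return source.count(";")


# Hypothetical snippets: the same three statements, formatted differently.
COMPACT = "int x = 1; int y = 2; return x + y;"
SPREAD = """\
int x = 1;

int y = 2;
return
    x + y;
"""

if __name__ == "__main__":
    for name, text in (("compact", COMPACT), ("spread", SPREAD)):
        print(name, physical_lines(text), semicolons(text))
```

The compact version is 1 line and the spread version is 4, while both contain 3 semicolons. Counting statements removes the formatting noise, but it still says nothing about whether those statements are any good.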

A productive day is one where I can delete more lines of code than I write.

SourceMonitor tells me I have 108k LOC, of which 57k are actual code statements.

That makes me feel good (yesh, broke 100k! \o/), but it doesn't give me any information relevant to improving the project or to comparing its quality or complexity with other projects.

However, when SourceMonitor tells me that:

  • My average methods per class is 7.94
  • My average statements per method is 4.5
  • My max scope depth is 8, my average depth is 0.98
  • My average "complexity" is 2.08 (??), and my max "complexity" is 86 (!!!)

...that actually begins to tell me important details of my project. Plus it keeps track of it from the last time I scanned it, nearly a year ago, so I can see in what ways the code improved and where it has gotten worse.
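For a feel of what per-method metrics like these involve, here is a small sketch using Python's standard `ast` module on Python source. It is not how SourceMonitor works; the thread doesn't say how that tool defines "complexity", so this assumes a McCabe-style count (1 plus the number of branching constructs). The names `function_metrics`, `max_depth`, and the `SAMPLE` code are all hypothetical.

```python
import ast

# Constructs counted as decision points (an assumed, simplified McCabe-style rule).
BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)
# Constructs that open a nested block. Note: an elif chain parses as nested Ifs,
# so this simple rule reports it as deeper than it looks on screen.
NESTING = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_depth(node: ast.AST, depth: int = 0) -> int:
    """Deepest nesting of block statements below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        step = 1 if isinstance(child, NESTING) else 0
        deepest = max(deepest, max_depth(child, depth + step))
    return deepest


def function_metrics(source: str) -> dict:
    """Return {function name: (statements, complexity, max depth)}."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            inner = list(ast.walk(node))[1:]  # skip the def node itself
            statements = sum(isinstance(n, ast.stmt) for n in inner)
            complexity = 1 + sum(isinstance(n, BRANCHES) for n in inner)
            metrics[node.name] = (statements, complexity, max_depth(node))
    return metrics


# Hypothetical code to analyze.
SAMPLE = """
def clamp(value, low, high):
    if value < low:
        return low
    if value > high:
        return high
    return value

def first_even(items):
    for item in items:
        if item % 2 == 0 and item > 0:
            return item
    return None
"""

if __name__ == "__main__":
    for name, (stmts, cplx, depth) in function_metrics(SAMPLE).items():
        print(f"{name}: statements={stmts} complexity={cplx} depth={depth}")
```

On the sample, `clamp` comes out at 5 statements, complexity 3, depth 1, and `first_even` at 4 statements, complexity 4, depth 2. The shorter function is the more complex one, which is exactly the kind of detail a raw line count hides.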

Add in HeaderHero and other such tools, and you can use the relevant information to improve your project.

That said, it still feels nice to know I broke 100k.

Total number of lines in Sol (which was already released): 41536, across all source and header files. That's a full-blown engine plus the game logic, and yes, it includes even blank lines. You're making me worried now.

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

This topic is closed to new replies.
