OSDev.org

Posted: **Tue Jan 17, 2006 5:51 am**

Hi all, I've been looking for a free and simple source code line counter (LOC) for C and assembler, but I can't find anything that really suits or works. I use Eclipse as my IDE, so an Eclipse plugin would be great, but the only one I can find is for Java code only, so it doesn't suit.

What are people here using, if anything?

Cheers.

Posted: **Tue Jan 17, 2006 6:28 am**

spacedsteve wrote: What are people here using, if anything?

Well... for counting lines of code I usually do:

Code:

for i in `for j in \.c \.h \.cc; do find ./ | grep $j; done`; do cat $i; done | wc

It seems to work and it's cheap. If you want more file types, you can add them to the inner for loop. Not checked for typos, btw.

Posted: **Tue Jan 17, 2006 6:35 am**

;D

Candy, you're counting words, not lines - and a bit complicated, too.

Code:

find . -name "*\.*pp" -type f | xargs cat | wc -l

Adjust the pattern to filter for the file types you are looking for. (The above crude example is not really precise, but good enough for my C++ uses. I loathe regexps.)

PS: The above is checked (and corrected) for typos.
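For what it's worth, here is a sketch of the same idea that copes with several extensions and with file names containing spaces. The function name count_lines and the choice of .c, .h and .S (for C and assembler sources) are my own assumptions, so adjust them to your tree.

```shell
#!/bin/sh
# count_lines DIR
# Prints the total number of physical lines in all .c, .h and .S files
# below DIR. (count_lines is a hypothetical helper, not a standard tool.)
count_lines() {
    # "-exec cat {} +" hands the file names directly to cat, so names
    # containing spaces are passed intact. wc -l then counts the
    # newline characters in the combined stream.
    find "$1" -type f \( -name '*.c' -o -name '*.h' -o -name '*.S' \) \
        -exec cat {} + | wc -l
}
```

Call it as `count_lines .` from the top of the source tree. If nothing matches, cat is never run and the result is 0.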

Posted: **Tue Jan 17, 2006 6:47 am**

Solar wrote: ;D

Candy, you're counting words, not lines - and a bit complicated, too.

The wc command I use counts lines, words and characters by default. I never realised that would be wrong. Still, it probably does work.

Posted: **Tue Jan 17, 2006 7:09 am**

You're right... three numbers... see, it's me who didn't know his tools correctly. I think I never used wc without '-l' before...
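For anyone else who only ever used one form, a quick sketch of what the different invocations report. The sample text and the helper name sample are made up for the demonstration.

```shell
#!/bin/sh
# sample prints three short lines of C-like text (made-up demo input).
sample() { printf 'int a;\nint b;\na = b + 1;\n'; }

sample | wc       # three numbers: lines, words, bytes
sample | wc -l    # lines only
sample | wc -w    # words only
sample | wc -c    # bytes only
```

For this input the line count is 3, the word count is 9 and the byte count is 25, so the first number printed by a bare wc is the one you want for LOC.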

Posted: **Tue Jan 17, 2006 9:05 am**

At work, I use Construx Surveyor (www.construx.com). (If you've ever read "Code Complete" or "Rapid Development", Construx is the author's company.) On Windows, it uses Python to populate an Excel PivotTable (scary sounding, I know). I'm not sure if there is a non-Windows version.

What I like about it is that it standardizes some of the metrics that you tend to care about the most when doing software estimation, and the output makes it easy to break down line counts by directory (which usually means by subsystem). This is a great tool if you're analyzing the size of past projects to try and predict the size of future ones. Proper estimation is the bat with which to keep the pointy-haired manager types at bay.
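If you don't have the tool at hand, a rough per-directory breakdown can be had from the shell. This is only a sketch and has nothing to do with how Surveyor itself works; the name lines_by_dir and the .c/.h filter are my assumptions.

```shell
#!/bin/sh
# lines_by_dir ROOT
# Prints one "count directory" row per immediate subdirectory of ROOT,
# counting the physical lines of the .c and .h files below it.
# (lines_by_dir is a hypothetical helper name.)
lines_by_dir() {
    for dir in "$1"/*/; do
        # If ROOT has no subdirectories the glob stays unexpanded; skip it.
        [ -d "$dir" ] || continue
        count=$(find "$dir" -type f \( -name '*.c' -o -name '*.h' \) \
            -exec cat {} + | wc -l)
        # $((count)) normalises any padding that wc may add.
        printf '%8d %s\n' "$((count))" "$dir"
    done
}
```

Running `lines_by_dir .` at the top of a tree gives one row per subsystem directory, which is usually enough for a first size comparison.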

Posted: **Tue Jan 17, 2006 11:03 am**

Do you want a line counter or a source code analyzer?

The difference:

Code:

a =
    b + c;

Counter: 2 lines;
Analyzer: 1 line;

Code:

a = b + c;

Counter: 1 line;
Analyzer: 1 line;
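A very crude way to see both numbers from the shell, as a sketch only: physical lines are counted with wc -l, and "statements" are approximated by counting semicolons. That approximation is my own simplification and is wrong for things like for (;;) headers or semicolons inside string literals; a real analyzer parses the code. Both function names are hypothetical.

```shell
#!/bin/sh
# physical_lines: number of lines on stdin (what a plain counter reports).
physical_lines() { wc -l; }

# logical_statements: number of ';' characters on stdin, used here as a
# rough stand-in for the statement count an analyzer would report.
logical_statements() { tr -cd ';' | wc -c; }

printf 'a =\n    b + c;\n' | physical_lines       # the split assignment
printf 'a =\n    b + c;\n' | logical_statements
```

For the split assignment this prints 2 and then 1, matching the counter and analyzer figures above.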

Posted: **Tue Jan 17, 2006 11:58 am**

If you start worrying about that kind of difference, you are in deep trouble in my experience. Line count is a very, very poor method of software "estimation". Do a line count for a quick & rough overview ("how much is this stuff?"), not more. If you are in a company that actually cares about your line count, run.

Posted: **Tue Jan 17, 2006 12:14 pm**

Solar wrote: If you are in a company that actually cares about your line count, run.

Or press Enter randomly while typing for increased performance ratings.

Posted: **Tue Jan 17, 2006 12:19 pm**

@Solar:
I agree with you, but it is interesting to see the difference.

@Candy:
Or write comments that make no sense.

Posted: **Tue Jan 17, 2006 12:21 pm**

I would be looking for a physical source lines of code (SLOC) counter, which will not count things like comments and blank lines, or will at least count them separately. I don't have a need for a logical SLOC counter (which counts the number of actual statements).

Posted: **Tue Jan 17, 2006 12:32 pm**

You should try to be creative with the command line sometime soon. It's a lot of fun, and you can build really complex functions in a single command line. The output of my command, if you leave off the wc, is the entire source. Pipe that through something that first filters out comments and then through something that filters out empty lines. Comments exist in two variations in C: the // one and the /* */ one. The first is quite doable with sed:

sed -e 's/\/\/.*$//g'

This means:

sed -e '...' -> run sed on the stream with the command between the 's
s/ -> substitute
\/\/ -> two escaped slashes (read as two slashes)
.* -> anything, except for a newline
$ -> end of line
/ -> with
(empty) -> nothing
/g -> for as many times as necessary in this line

This doesn't take into account comments continued onto the next line with a \, but imo using the \ with comments is really, really bad programming style (and I've already had a nasty bug involving that. Yes, I wrote the error myself).

The second is harder (as is // with \-support) because you need state information between lines. Try learning AWK for that, it's very good at keeping that sort of information.
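To give an idea of what that state keeping looks like, here is a minimal AWK sketch. It remembers in one variable whether the previous line ended inside a /* */ comment. The function name strip_block_comments is made up, and the sketch assumes that /* never appears inside a string literal or a // comment, which it would get wrong.

```shell
#!/bin/sh
# strip_block_comments: copies stdin to stdout with /* ... */ comments
# removed, including comments that span several lines.
# (Hypothetical helper; ignores string literals and // comments.)
strip_block_comments() {
    awk '
    {
        out = ""
        line = $0
        while (length(line) > 0) {
            if (in_comment) {
                # Inside a comment: look for the closing marker.
                end = index(line, "*/")
                if (end == 0) {
                    line = ""                       # comment continues
                } else {
                    line = substr(line, end + 2)    # resume after the marker
                    in_comment = 0
                }
            } else {
                # Outside a comment: keep text up to the next opening marker.
                start = index(line, "/*")
                if (start == 0) {
                    out = out line
                    line = ""
                } else {
                    out = out substr(line, 1, start - 1)
                    line = substr(line, start + 2)
                    in_comment = 1
                }
            }
        }
        print out
    }'
}
```

Lines that consisted only of a comment come out empty, so they disappear in the next step when empty lines are filtered.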

Then, removing empty lines. That's pretty doable once you get the idea:

grep .

This means: print only the lines (grep) that contain at least one character ( . ).
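Put together, the two filters give a rough physical SLOC count. This is a sketch with a made-up function name, sloc. It uses | as the sed delimiter to avoid escaping the slashes, and it swaps `grep .` for a pattern that also drops lines containing only spaces or tabs. It does not handle /* */ comments, and it will wrongly cut a line at a // that sits inside a string literal (a URL, for example).

```shell
#!/bin/sh
# sloc: counts the lines on stdin that still contain something other
# than whitespace after // comments have been removed.
# (Hypothetical helper; block comments are not handled.)
sloc() {
    sed -e 's|//.*$||' | grep -c '[^[:space:]]'
}

printf 'int a; // counter\n\n   \n// only a comment\nint b;\n' | sloc
```

The sample input has five physical lines, of which two survive both filters.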

Enjoy your new days in the shell, figuring out this sort of stuff.

This is exactly the sort of stuff that you cannot do with a graphical user interface. Take that, Lisa/Macintosh/Windows/X!

--edit: ooh! my (2^11)-1'th post! That's the first number that could be a Mersenne prime but isn't!

Posted: **Tue Jan 17, 2006 3:52 pm**

Solar wrote: Line count is a very, very poor method of software "estimation". Do a line count for a quick & rough overview ("how much is this stuff?"), not more. If you are in a company that actually cares about your line count, run.

I disagree. Line count is a very good proxy measure for the complexity of software, in the same way that the weight of individual parts is a good proxy measure for the cost of building an aircraft. The important thing is in how you use the line count (i.e. -- don't use it as a measure of personal productivity -- that's a morale killer, and a bogus source of data anyway). In my experience, developers produce much more accurate line count estimates for a new component than time estimates (i.e. -- "It should be about 500 lines" ends up being more accurate than "It will take me a week" because developers often forget to include important tasks when estimating).

There are estimation tools out there that can produce effort and schedule estimates from size inputs using your own historical data for calibration. This means that if it took your team six months to design and build a product that was written in 50000 lines of C++, and you're trying to estimate something of similar size to be written in the same language by the same team, it will probably take the same amount of time (give or take).
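To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes effort scales linearly with size, which is my simplification for illustration; the real tools calibrate more elaborate models. The function name estimate_months and the 20000-line project are hypothetical.

```shell
#!/bin/sh
# estimate_months HIST_LINES HIST_MONTHS NEW_LINES
# Scales the historical schedule linearly by size:
#   new_months = NEW_LINES * HIST_MONTHS / HIST_LINES
# (Hypothetical helper; linear scaling is an assumption.)
estimate_months() {
    awk -v hl="$1" -v hm="$2" -v nl="$3" \
        'BEGIN { printf "%.1f\n", nl * hm / hl }'
}

estimate_months 50000 6 50000   # same size as the past project
estimate_months 50000 6 20000   # hypothetical smaller project
```

With the figures above, the first call prints 6.0 and the second prints 2.4, both in months for the same team and language.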

I recommend reading the chapter on estimation in "Rapid Development". It's a real eye-opener.

Posted: **Tue Jan 17, 2006 9:53 pm**

I'm fond of 'sloccount'
http://www.dwheeler.com/sloccount/

-Erik

Posted: **Tue Jan 17, 2006 11:57 pm**

Colonel Kernel wrote: I disagree.

You're welcome.

I won't try to convince you, but I felt like elaborating on my POV.

Colonel Kernel wrote: Line count is a very good proxy measure for the complexity of software, in the same way that the weight of individual parts is a good proxy measure for the cost of building an aircraft.

As long as you keep in mind that an F-22 isn't the same thing as a DC-3. And that's where the problem is: line count is not comparable between projects unless it's the same team working in the same general business area on a problem of comparable complexity under comparable conditions.

Colonel Kernel wrote: I recommend reading the chapter on estimation in "Rapid Development". It's a real eye-opener.

I know quite a bit about software estimation, and all I've read so far makes me uneasy. I keep getting the feeling all these "metrics" are basically for people who don't understand the job (i.e., PHB types). The best example is not counting comments. They take time to write, they add to the quality of the software, yet virtually everybody insists on not counting them. As for blank lines, use a source reformatter to get your sources into a known format (you should do that anyhow), and you can even compare blank lines.

I don't really care much what someone wrote about line count. I'm aware that I've been working on F-22 type software and DC-3 software, working in a crack team and working in a team of people who never should have touched a compiler. I've worked in a happy little software house and I've worked in a big corp where the word "outsourcing" was uttered twice a day. I've worked in Visual Studio and I've worked with little more than vim and gcc, worked on C++ source that was "C with classes" and on C++ source that was thick with templates and multiple inheritance.

Line count is one factor. Many, many others come into the equation. As such, I consider anything beyond that "find"-statement above overkill.

C Source Code Counter