
Re: C Header File conventions

Posted: Mon Oct 24, 2022 9:50 am
by nullplan
eekee wrote:Some years ago I read that slow compilation was considered one of C++'s bigger problems, and that re-including header files was part of the reason it was so slow. Guards don't really help with this; the preprocessor still has to parse the entire file every time it's loaded to find the correct #endif. I assume #pragma once is an attempt to deal with this speed issue. That's the only positive thing I have to say about it, anyway.
Except that modern compilers implement the #pragma once optimization even for normal include guards. They simply do not open the file again if the include guard symbol is already defined.
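To illustrate the guard pattern being discussed, here is a minimal sketch. The header name point.h, the macro POINT_H, the Point struct and the sum() helper are all made up for this example. The header's contents are pasted twice into one file to mimic what the preprocessor sees when the header is included twice. Whether a given compiler also avoids reopening the file is an implementation detail that this sketch does not demonstrate.

```cpp
// ---- first '#include "point.h"' (hypothetical header) ----
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

// ---- second '#include "point.h"' ----
// POINT_H is already defined, so everything up to the matching #endif
// is skipped and Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

// Hypothetical helper, only here to show that exactly one Point definition survived.
constexpr int sum(Point p) { return p.x + p.y; }
```

Without the #ifndef/#define pair, the second copy would be a redefinition of Point and the file would not compile.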

That is also not the biggest problem with re-reading header files. In both C and C++, each source file is compiled separately to an object file, with the result that many header files are read at least once for each source file. Even with the above optimization, a typical compilation process of a large and complex project ends up reading the same file hundreds of times.

The only way I have seen other languages not repeat that design mistake is to forgo the object file phase altogether. C# for example simply does not have them. If you have a large project consisting of multiple source files, you always hand all of them to the compiler. That means the compiler can keep the information about what module contains what class with what methods in memory, and does not have to re-read it.

Re: C Header File conventions

Posted: Wed Oct 26, 2022 7:13 am
by Solar
eekee wrote:Some years ago I read that slow compilation was considered one of C++'s bigger problems, and that re-including header files was part of the reason it was so slow. Guards don't really help with this; the preprocessor still has to parse the entire file every time it's loaded to find the correct #endif.
That must have been many years ago. I correct my comment from above: GCC has been optimizing header-guards to not require re-reading the header file since before version 2.95. Actually, in the 2.95.3 manual you will find this (emphasis mine):
There is also an explicit directive to tell the preprocessor that it need not include a file more than once. This is called `#pragma once', and was used in addition to the `#ifndef' conditional around the contents of the header file. `#pragma once' is now obsolete and should not be used at all.
That was 1999. I would not be at all surprised if that statement had been in there even longer. (See the very bottom of this post though, there is a problem specific to C++ here, and that is templates.)
nullplan wrote:That is also not the biggest problem with re-reading header files. In both C and C++, each source file is compiled separately to an object file, with the result that many header files are read at least once for each source file. Even with the above optimization, a typical compilation process of a large and complex project ends up reading the same file hundreds of times.
Yes, but only if you re-build it from scratch. After that, a change to a source file should only require re-translating that one translation unit (reading the headers only once), and re-linking the object files (which is comparatively quick).
nullplan wrote:The only way I have seen other languages not repeat that design mistake...
It wasn't, really. It was quite a clever solution to the problem of limited RAM in older computers. You simply could not hold the compiler executable, all the sources of a project, and all the data created from those sources in memory at once. So you went through the process piecemeal, one translation unit at a time, saving the intermediate object code.

And it is not so much the reading of the files that takes time, but the parsing of them. So one way to speed up the process is to use precompiled headers. For those new to the concept, the idea is to have the compiler pre-parse your headers, and cache the resulting data structures for re-use with the next translation unit.
nullplan wrote:C# for example simply does not have them. If you have a large project consisting of multiple source files, you always hand all of them to the compiler. That means the compiler can keep the information about what module contains what class with what methods in memory, and does not have to re-read it.
What a C# compiler does is work from the assumption that all that data will fit into memory (as it will these days), and do the equivalent of what a precompiled C/C++ header does: keep the information gained by compiling the first translation unit (MS speak: project) for the next.

That takes much longer for the first translation unit because, instead of parsing only the header file of each class used, C# (and Java) has to parse the whole source for that class, e.g. to check function prototypes.

----

All that being said in defence of the C/C++ headers, there is the huge problem of C++ template source, which does have a massively negative effect on compilation times if you're looking at complex code. But that subject would take the thread too far off-topic.

Re: C Header File conventions

Posted: Thu Oct 27, 2022 12:55 pm
by eekee
Solar wrote:
eekee wrote:Some years ago I read that slow compilation was considered one of C++'s bigger problems, and that re-including header files was part of the reason it was so slow. Guards don't really help with this; the preprocessor still has to parse the entire file every time it's loaded to find the correct #endif.
That must have been many years ago. I correct my comment from above: GCC has been optimizing header-guards to not require re-reading the header file since before version 2.95. Actually, in the 2.95.3 manual you will find this (emphasis mine):
There is also an explicit directive to tell the preprocessor that it need not include a file more than once. This is called `#pragma once', and was used in addition to the `#ifndef' conditional around the contents of the header file. `#pragma once' is now obsolete and should not be used at all.
That was 1999. I would not be at all surprised if that statement had been in there even longer. (See the very bottom of this post though, there is a problem specific to C++ here, and that is templates.)
The article I recalled must have been about templates. I evidently misunderstood it.

Re: C Header File conventions

Posted: Fri Oct 28, 2022 1:52 am
by Solar
eekee wrote:The article I recalled must have been about templates. I evidently misunderstood it.
Easy to confuse the two if you're just skimming through an article. C generally parses so easily, and C headers contain so little actual logic, that they parse very quickly. Templates, on the other hand, are in themselves Turing-complete, so there is no limit to how complex processing them can become.
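As a small illustration of templates doing actual computation at compile time, here is the classic factorial sketch. The name Factorial is just chosen for this example. The point is that the compiler has to instantiate one class per step of the recursion before it can produce the value, and real template-heavy code does this on a much larger scale.

```cpp
// Primary template: Factorial<N> needs Factorial<N - 1>, which needs
// Factorial<N - 2>, and so on. Each step is a separate instantiation
// the compiler has to create.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Explicit specialization that ends the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Checked entirely by the compiler; no code runs at program start.
static_assert(Factorial<5>::value == 120, "5! computed at compile time");
```

Every translation unit that includes a header with such templates and uses them repeats this work, which is where the cost comes from.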

This is an issue that the C++ community is well aware of, with modules being the attempted solution. They are still a very new feature (to the point where I myself don't have any actual experience with them), but from what I hear they are coming along quite nicely, and align closely with how things are handled by e.g. C# and Java. With C++23, the whole standard library will be available via a module ("import std;"), which indicates there is some way to efficiently handle templates through modules.

Which might actually put that evil construct that is "#include <bits/stdc++.h>" to rest. 8)