Single source file

Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Single source file

Post by Antti »

I am not very good at creating successful threads. I may follow the same pattern here but I will try and see what happens. I do not claim the idea here is my own or even worth considering as an idea.

The main idea is simple: to get rid of the concept of having source code divided into many files. I will use a program written in C as an example. We have seen many sample programs that fit into one source code file.

Code:

/* program.c */

#include <stdio.h>

int main()
{
	printf("hello, world\n");
	return 0;
}
If we distributed this program in the form of source code, it would be very easy for end-users to have a single file. Building instructions for Unix-like systems:

Code:

cc -o program program.c
No configure scripts. No makefiles. No platform-specific build hacks. No object files. Just a source code file translated to an executable. If we use a compiler like TCC that supports running the source code like a script, we could consider program.c itself to be an executable. On the compiler side, more powerful optimizations could become possible because the compiler sees the whole program.

We all know that a big chunk of code is quite hard to understand and maintain. It is easy to pack a "hello, world" program into one file. What about bigger programs? One solution could be a powerful "IDE-like" management system. The responsibility for presenting a tree-like view of the program (like today's source file hierarchy) would then pass to the IDE. There could be helpers of the same kind as #region in C#. I think the amount of code a developer works on at a time should be something that can be seen without scrolling the screen.

I can almost hear the criticism that follows this idea. What about compilation speed if we have to compile everything after making a small change in the code? Answer: true, we do unnecessary compilation, but is that overhead significant today? With the normal "at least one hundred source files" method, there is also significant overhead from running many separate compilations. I would say the one-file approach is almost always faster in practice. If the program is so big that the build overhead is significant, then it is time to divide the actual program. As a bad example: Microsoft Office is not a single program: it consists of Word, Excel, PowerPoint etc.

What do you think?
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: Single source file

Post by bluemoon »

AFAIK SQLite packs everything into a single file (in fact, one header and one C file) as one of its distribution options, which you can easily integrate into your own project.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Re: Single source file

Post by Love4Boobies »

In part, you misunderstand the purpose of having multiple source files and fail to see where the complexity of build systems actually comes from, which leads you to design a terrible solution. Here are some of the reasons for having multiple source files:
  • It makes it easy to find something in the source tree, even when you're not sure what it's called.
  • When you decide to remove some functionality, you are less likely to leave dead code around.
  • It enables several developers to work on the same code base at the same time.
  • It makes it possible to build different parts of the code base with different compiler and/or linker options.
  • As you've already mentioned, it can help build systems achieve faster build times for projects of significant size.
  • It makes it possible to reuse code.
Because of this misunderstanding, you seem to think having multiple source files is what requires projects to have configure scripts, makefiles, and platform-specific hacks. First of all, configure scripts help avoid platform-specific hacks, by which I believe you mean the kind of hacks one uses to make the build system work with multiple ports of some utility (e.g., GNU Make vs. BSD make). So, right from the start, you need to remove either one or the other from your list. Next, for the important part: it is not multiple source files that require these things; it's the other way around.

As for build systems, they become difficult to maintain due to the fact that they need to keep track of dependencies.

Equipped with this knowledge, we can now see what problem we're trying to solve. We don't have to give up the advantages of using multiple source files; we just need to simplify the build system, as the former does not require the latter. If you are willing to compromise fast build times, all you need is a very small script (personally, I'd use several for modularity's sake) which compiles all of the source files and links them together.

There's an important lesson to be learned from this thread. It is important to think long and hard about the problem you are trying to solve and then not to stop as soon as you've found the first solution that came to you. Otherwise, all your projects will be crap.
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

I have a couple of projects that use multiple source code files, and I am aware of almost all of the advantages you listed. I am not saying this "one file way" is a solution I am going to implement right away. Also, I am not saying this is the best solution. I am trying to find pros and cons. Mostly pros so far.

I think we all agree how easy this is:

Code:

cc -o program program.c
What about the distribution of the program?

Code:

PROGRAM.ZIP
  -> PROGRAM.C
  -> README.TXT                (optional)
How complex must a program be before it is no longer possible to maintain this simplicity? Is there a definition for it? Sharing work among several developers is one thing I do not have an answer for just yet.
Love4Boobies wrote:It makes it possible to build different parts of the code base with different compiler and/or linker options.
For special development, like OSs, all the switches and whistles are needed. What I am looking at is application programming. If we have to bother with linker options, then I think there is something wrong with the whole infrastructure. We could get to the point where programs do not care about such things.
Love4Boobies wrote:we just need to simplify the build system
I agree with you. Do we need a build system?


Warning! I am just trying to challenge current standards. In reality, I am not so enthusiastic about this.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

One other advantage of this whole thing: we would not need so many #includes. However, one other problem with C is the namespace scope. It is quite common to have "static global labels" like this:

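Code:

/* source1.c */

static int i;

/* "do" is a reserved word in C, so the helper is called "step" here. */
static int step(void)
{
	return ++i;
}

static int it(void)
{
	return ++i;
}

int DoSomething(void)
{
	/* The order in which i, step() and it() are evaluated is unspecified. */
	return i + step() + it();
}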

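Code:

/* source2.c */

static int i;

/* "do" is a reserved word in C, so the helper is called "step" here. */
static int step(void)
{
	return ++i;
}

static int it(void)
{
	return ++i;
}

int DoSomethingElse(void)
{
	/* The order in which i, step() and it() are evaluated is unspecified. */
	return i + step() + it();
}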
As we can see, those static labels can use short and reusable names. If we had just one source code file, then we would always need to have globally unique names. This makes it even more important to have a good naming convention. A real problem.

I have seen some long assembly code listings where code is arranged something like this:

Code:

; -------------------------------------------------------------------
; Function: Do Something
;
; This function does nothing useful but we need it.

DoSomething:
	push eax
	pop eax
	ret

; End of Function
; -------------------------------------------------------------------

; -------------------------------------------------------------------
; Function: Do Something Else
;
; This function does nothing useful either but we need it.

DoSomethingElse:
	push eax
	pop eax
	ret

; End of Function
; -------------------------------------------------------------------
When a programmer starts writing code, the editor should only display one unit at a time. Then a programmer can concentrate on one maintainable set. That is just like when using files. However, this does not have a hierarchy. Perhaps that "function header" could contain more meta information about the function. That would then help the IDE to build a reasonable hierarchy tree.
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

I actually find it simpler to type just

make
make install

rather than your compile command (which doesn't even address the matter of installing the compiled program). A build system, rather than a simple compile command, is necessary IMO to allow the program to be compiled on different operating systems with different requirements. And once you accept the need for more than a simple compile command you might as well reap the benefits of multiple source files.

It is a simplification to suppose that a program of any complexity results in a single output file. Have a look at gcc for example. Do you think it would be possible to write that program with a single source file and a single output file? How do you allow for the possibility that one user might want to support just C, whilst another might want an Objective-C compiler? Do we have 2 source files to cover these two possibilities? And what about all the possible variations of gcc? You would have hundreds of slightly different single-source files.

Single-source files may be OK for trivial programs but once you want source that produces a complex program and can be compiled on a variety of systems, with a number of variations, a configure/build system is indispensable.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

It should be the OS that installs programs if any "installation" is needed. For example, if we had a graphical file browser and we then right-click on program.c (or any source code file), it could have two options:

Run
Install

The Run command would make a temporary executable and run it. The Install command would install it to the place the OS considers best for it. That could be more platform-independent. Of course, it is very common for configure-like scripts to determine preprocessor label values that define which features are enabled etc. I agree that current conventions do not fit this one-file-no-scripts model very well.

I already said that large programs could be divided into smaller programs. Two programs sharing a vast amount of identical code really is a problem. What if we had a so-called library program that serves both cases? Then it is important to have good inter-process communication. Then your C compiler executable has C-specific things and your Objective-C compiler executable has Objective-C-specific things. They both would use a gcc-common executable that contains general code.

I really agree that code duplication is a problem and I do not have a "one-size-fits-all" solution for it. It is almost impossible to convert existing projects to this "new" model. Everything must be taken into account from the very beginning.
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

The Install command would install it to the place OS sees is best for it.
So I'm not allowed to determine where I would like a program installed? The nanny OS knows best? No thank you.
The problem with two programs sharing vast amount of identical code really is a problem. What if we had a so-called library program that serves both cases?
Well, isn't that exactly how things work in the real world? (Although often it results in more than one library program.) So when you want to build your program you need to ensure that any libraries that it depends upon are also built. This means dividing the program into separate components and then having some sort of tool to determine dependencies and compile components of the program in the correct order.

Wait a minute - this is starting to look familiar!
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

iansjack wrote:So I'm not allowed to determine where I would like a program installed? The nanny OS knows best? No thank you.
It is not exactly like this. Today the program's own scripts handle the installation, e.g. copying files. There could also be an "Advanced Install" that asks the user where a program should be installed. The main point is that it is the OS's responsibility to do all this. All the dialogs that ask the user what to do are provided by the OS. The default installation would be silent and would effectively do the same as "configure && make && make install".
iansjack wrote:Wait a minute - this is starting to look familiar!
It is still far from source-code-level dependency tracking. The dependency tracking is done the same way as, for example, APT handles it. Possibly integrated in the OS itself. Of course, there should be a good way to define that "this program depends on these programs".
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

This is beginning to get more convoluted with the operating system having to handle things that are currently handled perfectly well by development tools. If you are going to produce cross-platform source code then all operating systems will have to adapt to this model; currently you just have to adapt the specialized tools. It strikes me that it's better to make the tool work with the OS rather than forcing the OS to work with the tool. And all in an attempt to force a single-source approach in which I can see no benefit.

It reminds me rather of classical attempts to explain planetary motion all in terms of perfect circles. You can keep adding extra structure to support a fixed concept, but it still doesn't make an elegant, workable solution.
dozniak
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Re: Single source file

Post by dozniak »

There are some approaches that make it feasible.

For example, the CodeBubbles IDE allows very easy contextual navigation around any source; it just doesn't care what file the code is in. As a programmer, you think in terms of the methods and functions you call to get stuff done, and you navigate accordingly.

The Oberon system keeps additional metadata in the source files, making them self-formatted objects with their own options.

A program can be stored as an AST, with no source code involved at all, while keeping all the necessary semantic information. With an appropriate IDE it becomes very easy to manipulate the program structure, regardless of the number of files it is contained in.

As for dead code elimination, it can be done very efficiently by the compiler and linker, not the user. Currently the GNU tools suck badly at this, but linkers like gold and lld make it a more viable option.

The compiler can also be told to issue warnings on unused code, so I don't see any problem with this.

One interesting question is how to properly manage platform-specific code in such cases; there should be a way to selectively choose a block of code for compilation.

All in all, I think your idea is going in the right direction: look beyond files!
Learn to read.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

Take that complete "hello, world" program as an example. It runs almost everywhere and is easy to handle. Even a non-programmer can see the file (not the content) and understand "this is the source code of this executable". The source code file itself does not care how it is compiled, run, or viewed. I can compile it manually or have an OS service install it the way the OS wants to handle programs. The program (source code file) does not know that.

Now, if we take a bigger program, the same simplicity could be there too. Apart from compile time, it makes no difference to compile a source code file that has 100000 lines of code. That one file is the program. Whether that program depends on other programs is a different matter. Even then, the dependency should be more "loosely coupled".
iansjack wrote:to handle things that are currently handled perfectly well by development tools.
I do not think installing programs is handled perfectly in general. In the Unix-like world (the GNU way), the same messy boilerplate script code is duplicated in every source code package. In Windows, we have had many different ways of installing programs. All this because "development tools handle it perfectly well".
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

dozniak wrote:All in all, I think your idea is going in the right direction, look beyond files!
Thank you for your post. However, I am also a little bit afraid that this could lead to another extreme. I do not want us to be forced to use some IDE to do anything. It should also be reasonably easy to make changes manually. Tools can only make this more convenient.
iansjack
Member
Posts: 4685
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

I am very much against the idea of source files that require a particular IDE - or some other special system - to read and edit them. 10,000 line files may be fine in a folding IDE but they most definitely would not be in a simple text editor. Modularization has served well as a programming paradigm for many years and I can see no good reason to throw it out now. What next - do away with functions, classes, methods and the like and write programs as one huge procedure?
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

iansjack wrote:Modularization has served well as a programming paradigm for many years and I can see no good reason to throw it out now.
Does that modularization rely on having multiple source files? More often than not, a single file is not "stand-alone enough" to be considered a module. Code is usually divided into multiple files because that is just easier for a programmer to edit and understand. I agree that it is possible to have very good modularization with multiple source files. However, big programs often have such long source code files anyway that it does not make much difference to have an "even bigger" file and handle it with the IDE. The advantage of a "one source file, one executable" correspondence is quite significant in the end.
iansjack wrote:What next - do away with functions, classes, methods and the like and write programs as one huge procedure?
Probably not. We will see what improvements, if any, could be done there. But yes, "what next" is a good question. One step at a time.