Yet another build system design

Posted: Sun Aug 23, 2009 8:15 pm
by NickJohnson
I know this path has been trodden before, but for the last few days I've been so frustrated trying to move from a recursive to a non-recursive makefile setup (yes, I read the wiki article :wink: ) that I came up with my own build system. I realized that 90% of every makefile I write is the same, and that writing general rules for a large project is nearly impossible, or at least very messy-looking. So I'm trying to make this system handle large projects with very little, centralized configuration by allowing more "intelligent" general rulesets. As of now, this system is to be called "bake".

The first section of a bake configuration file is a set of type definitions. These definitions are used to classify different targets - information that is then used to choose the proper rule. Definitions are regular expressions (extended style, probably POSIX) that match the name of the file being classified. If multiple expressions match, the one that is most strict is chosen:

source: "\.c$"
object: "\.o$"
header: "\.h$"
binary: ""
Within this set of types, a file named "main.c" would be type source, a file named "main.o" would be type object, a file named "common.h" would be type header, and a file named "a.out" would be type binary. However, note that "libfoo.a" would also be type binary.
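
A rough Python sketch of the classification step described above. Two assumptions that are not from the post: the patterns are anchored with `$` (unanchored, `\.o` would also match "a.out"), and "most strict" is approximated by pattern length, which is only a placeholder for whatever ordering bake would really use.

```python
import re

# Type table from the post, with "$" anchors added (assumption, see above).
TYPES = {
    "source": r"\.c$",
    "object": r"\.o$",
    "header": r"\.h$",
    "binary": r"",
}

def classify(filename, types=TYPES):
    """Return the name of the 'strictest' type whose pattern matches.

    Strictness is approximated by pattern length; this is a stand-in
    heuristic, not something the design specifies.
    """
    matches = [(len(pattern), name)
               for name, pattern in types.items()
               if re.search(pattern, filename)]
    return max(matches)[1] if matches else None
```

With this table, the empty pattern acts as the fallback: it matches every name but loses to any longer pattern that also matches.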

There are also environment variables, which work exactly the same as in make:

CC = gcc
CFLAGS = -Wall -Werror -pedantic
CFLAGS += -Os

SOURCES = main.c
HEADERS = common.h
FILES = $(SOURCES) $(HEADERS)
Rules do not describe how to create a specific target, but how to convert one type of file (or several types) into another type. The syntax is similar to make's, but more general, and delimited by curly braces instead of tabs. The variables $ and $(#) (where # is a number) refer to the produced file and to the argument with that number, respectively:

# produce object from source(s)
object << source {
    $(CC) $(CFLAGS) -c $1 -o $
}

# produce binary from object(s), rebuilding if headers are changed.
binary << object header {
    $(LD) $(LDFLAGS) $1 -o $
}
Using these rules, bake works out how to produce a target from a set of files by chaining rules together. It keeps the work split across as many files as possible at each step: in the last example, each source file would become its own object file and the objects would then be linked together, as opposed to all the source files being compiled into one object file and then linked.
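
One way the rule chaining could work is a breadth-first search over the types. This is only a Python sketch under that assumption: the `RULES` table mirrors the two rules above, and `conversion_chain` is a made-up helper name, not part of any existing tool.

```python
from collections import deque

# Rules from the post: output type -> input types it is built from.
RULES = {
    "object": ["source"],
    "binary": ["object", "header"],
}

def conversion_chain(start, goal, rules=RULES):
    """Breadth-first search for a chain of types leading from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for output, inputs in rules.items():
            # Follow any rule that accepts the current type as an input.
            if path[-1] in inputs and output not in seen:
                seen.add(output)
                queue.append(path + [output])
    return None  # no chain of rules connects the two types
```

Asking for a chain from "source" to "binary" goes through "object", which is the compile-then-link order described above.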

To produce a target from a set of files, use this syntax:

a.out - main.c common.h

# With auto-detection
a.out - "\.c" "\.h"
General rules for creating targets are also possible, implying the set of files needed to produce a target:

binary - "\.c" "\.h"

all {
    > a.out
}
At the end of the bake configuration file, files and targets can specify actions to perform before and after they are made, so that variables can be modified on a file-by-file basis and things can be moved around if needed. Variables modified by these scripts are local only: they do not affect the globally defined variables. Rules can also be overridden from within these blocks. This syntax also provides a way to declare actions like "all", "clean", or "install":

all {
    > a.out # The > means create target
}

main.c {
    CFLAGS += -fomit-frame-pointer -O3
    echo Building main.c
    %% # This splits the "before" and "after" command lists
    echo Built main.c
}

foo.o {
    # Override source -> object rule for foo.c
    object << source {
        touch $
        echo No object for you!
    }
}
The fact that these actions are imperative means that the order of building can be preserved. However, it doesn't have to be. Actions separated by semicolons can be executed in parallel:

all {
    > foo.a; > bar.a
    > foobar
    echo Done
}
This builds foo.a and bar.a simultaneously, then foobar (which presumably depends on them).
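
The ordering could be sketched in Python like this, assuming each line of an action is a stage and the semicolon-separated steps within a stage run concurrently. `run_action` and the `build` callback are hypothetical names; here the callback only records what it was asked to do.

```python
from concurrent.futures import ThreadPoolExecutor

def run_action(body, build):
    """Run the lines of an action in order; ';'-separated steps on one line run concurrently."""
    for line in body.strip().splitlines():
        steps = [s.strip() for s in line.split(";") if s.strip()]
        if not steps:
            continue  # skip blank lines
        with ThreadPoolExecutor(max_workers=len(steps)) as pool:
            # list() waits for every step on this line before moving on.
            list(pool.map(build, steps))

# Usage: record the order in which steps are started.
log = []
run_action("""
    > foo.a; > bar.a
    > foobar
    echo Done
""", log.append)
```

The first two entries of `log` can come out in either order, but both always precede "> foobar".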

From the command line, bake acts like make: the name of the target or action is given as the first argument. bake searches for a configuration file with a standard name, and an environment variable can point to a "standard" fallback configuration file. This way, even if a directory has no config file, the build system can still be used, provided the source directory needs no special rules (just compile and link everything with no flags).
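
A minimal Python sketch of that lookup. The file name `Bakefile` and the variable `BAKE_STANDARD` are invented for illustration; the design does not name either yet.

```python
import os

CONFIG_NAME = "Bakefile"        # hypothetical standard file name
FALLBACK_ENV = "BAKE_STANDARD"  # hypothetical environment variable

def find_config(directory, environ=os.environ):
    """Prefer a config file in the directory; otherwise use the fallback from the environment."""
    local = os.path.join(directory, CONFIG_NAME)
    if os.path.isfile(local):
        return local
    return environ.get(FALLBACK_ENV)  # None if no fallback is set
```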

This is just a rough draft of the design, and I haven't written any code yet. Does anyone think this is practical and worthwhile to write? Any ideas for new/better features?

Re: Yet another build system design

Posted: Mon Aug 24, 2009 4:10 am
by Solar
What's the benefit of using bake over make, that makes it worthwhile for the user to use a less-tested, less-familiar tool?

Re: Yet another build system design

Posted: Mon Aug 24, 2009 6:14 am
by Owen
Secondly, I think there's already a tool called Bake.

...

And this offers me nothing over CMake, which
1) I'm already familiar with
2) Is very well tested (KDE is built with it, for example)
3) Is, from the looks of things, more powerful
4) Is integrated with CTest for automated testing and CDash for collating those test results

Re: Yet another build system design

Posted: Mon Aug 24, 2009 6:44 am
by NickJohnson
Solar wrote:What's the benefit of using bake over make, that makes it worthwhile for the user to use a less-tested, less-familiar tool?
The main thing that it can do better than make is create general rules. In make, you either have to specify how to build each target, or how to build a set of targets based on a list of them (e.g. SOURCES, OBJECTS etc.). There is an ability in make to do this kind of general rule, but it is not expandable to all situations:

.s.o:
    $(AS) $(ASFLAGS) $<
doesn't work for things without extensions, or directories.

To show how concise this makes the configuration file, this:

csource: "\.c$"
ssource: "\.s$"
header: "\.h$"
library: "\.a$"
object: "\.o$"
binary: ""

lib_proj: "^lib.*/$"
bin_proj: "/$"

object << csource { $(CC) $(CFLAGS) -c $1 -o $ -I$&/inc }
object << ssource { $(AS) $(ASFLAGS) $1 -o $ }
binary << object { $(LD) $1 -o $ $(LDFLAGS) }
library << object {
	$(LD) $(1) -r -o $*.o $(LDFLAGS)
	$(AR) $(ARFLAGS) $ $*.o
}
bin_proj << binary header { cp $(1) $(BINDIR) }
lib_proj << library header {
	cp $(1) $(LIBDIR)
	cp $(2) $(INCDIR)
}

bin_proj - $."[^l].*\.[csh]"
lib_proj - $."\.[csh]"

init/ { LDFLAGS += -ldriver -lkernel }

all {
	> lib_proj
	> bin_proj
}
will compile my entire project, which consists of directories that are either library or binary projects (this is autodetected) and have unknown depth, with special cases for two internal directories, and do it all in the proper order with maximum parallelization. All of this comes from one central, non-recursive file that never has to change, no matter how many directories are added. This file also includes definitions that would be covered by the "standard" configuration file, such as compiling C and assembly to object files, or turning objects into libraries. With those moved out, it would at best look like this:

lib_proj: "^lib.*/$"
bin_proj: "/$"

bin_proj << binary header { cp $(1) $(BINDIR) }
lib_proj << library header {
	cp $(1) $(LIBDIR)
	cp $(2) $(INCDIR)
}

bin_proj - $."[^l].*\.[csh]"
lib_proj - $."\.[csh]"

init/ { LDFLAGS += -ldriver -lkernel }

all {
	> lib_proj
	> bin_proj
}
Imagine that building your whole project. All you need is a consistent layout and naming.

Edit: Here's my best abstract description of why I think bake is better than make. In nearly all projects, all you really want to do is take a whole bunch of source files and turn them into one binary file. To a human (who knows how to compile things), such a task is entirely obvious given a directory containing some source. But in a makefile, you have to add things to lists, specify various special rules, etc. What I want is a build system that only takes an input list and output list, and fills in the middle entirely. Not only that, but one that can guess what the input list is from the output. So I only need to ask the system for a file, and it will build it for me.

Re: Yet another build system design

Posted: Mon Aug 24, 2009 12:50 pm
by Owen
Everything you want to do, CMake can do. Whether it's more or less concise will depend on how many directories there are in your project, but overall it won't be onerous. You'll have to specify the file list by hand; you can use its glob function to list all the files in a directory, but that gets run when CMake is invoked, not when the generated [Makefile, KDevelop Project, NMakefile, Visual Studio project, ...] is built, meaning you have to remember to run CMake again (versus listing the files in the CMakeLists.txt, whereby CMake will be run again automatically).

Re: Yet another build system design

Posted: Mon Aug 24, 2009 2:12 pm
by NickJohnson
Owen wrote:Everything you want to do, CMake can do. Whether it's more or less concise will depend on how many directories there are in your project, but overall it won't be onerous. You'll have to specify the file list by hand; you can use its glob function to list all the files in a directory, but that gets run when CMake is invoked, not when the generated [Makefile, KDevelop Project, NMakefile, Visual Studio project, ...] is built, meaning you have to remember to run CMake again (versus listing the files in the CMakeLists.txt, whereby CMake will be run again automatically).
But the whole point is that you can avoid file lists entirely by using regular expressions instead. And you don't have to regenerate scripts every time the directory structure changes. Everything is *possible* with cmake, but possible is not always trivial, although I admit cmake would work fine in this instance. I also want something scalable (for the developer, not time-wise) - something that doesn't become more complex the more directories you add.

Also, if you make a couple of changes to the syntax, you can make things even more compact (although quite cryptic). This file (with some variable definitions) could also build my entire system, and is equivalent to the previous version:

"/$"      < $."/" | $."/\.h$" "\.a$" :
	cp $(1) $(BINDIR)

"lib.*/$" < $."\.a$" $."\.h$" :
	cp $(1) $(LIBDIR)
	cp $(2) $(INCDIR)

"/$"      - $."[^l].*\.[csh]" | "lib/.*a$"
"lib.*/$" - $."\.[csh]"
"init/"   - $."[^l].*\.[csh]" "lib/.*a$" : LDFLAGS += -ldriver -lkernel
"^$"      | "/$"
This will work on any repository that has the same structure as mine as well. :wink:

Re: Yet another build system design

Posted: Mon Aug 24, 2009 3:27 pm
by JohnnyTheDon
SCons also supplies most of the functionality you mentioned (except regular expressions, but Python has those, so it's easy to implement yourself).

The problem with using your tool is that unless you write it in a scripting language so that it can be packaged with your project, you are forcing anyone who wants to build your project to find and build your tool before they can build your project. It is much easier (for your users) to use tools like make, CMake, or SCons that are popular and can be downloaded from a package manager.

I agree that make is a supreme PITA, but CMake and SCons work pretty well. Also, you probably don't want to use regular expressions to find files on every build. This is much slower at runtime than just adding files to your build script when you create them.

Re: Yet another build system design

Posted: Tue Aug 25, 2009 1:12 am
by Solar
I've got the impression that most, if not all of your "pros" for "bake" are from a lack of understanding of "make".
NickJohnson wrote:There is an ability in make to do this kind of general rule, but it is not expandable to all situations:

.s.o:
    $(AS) $(ASFLAGS) $<
doesn't work for things without extensions, or directories.
How does the "bake" system handle files without extensions?

As for directories:

subdir/%.o: subdir/%.c
        $(CC) $(SUBOPTIONS) -o $@ -c $<

%.o: %.c
        $(CC) $(TOPOPTIONS) -o $@ -c $<
Having different rules for different directories works just fine. Or have I misunderstood your point? If yes, please elaborate.

As for your example, I am 95% sure I would be able to translate that 1:1 to a Makefile, but I'm a bit fuzzy about $& and some other details of your syntax ('bin_proj - $."[^l].*\.[csh]"'?). Perhaps you could give me a commented version telling me what each statement actually does?

Re: Yet another build system design

Posted: Tue Aug 25, 2009 8:34 am
by NickJohnson
Solar wrote:I've got the impression that most, if not all of your "pros" for "bake" are from a lack of understanding of "make".
NickJohnson wrote:There is an ability in make to do this kind of general rule, but it is not expandable to all situations:

.s.o:
    $(AS) $(ASFLAGS) $<
doesn't work for things without extensions, or directories.
How does the "bake" system handle files without extensions?
It uses a table of regular expressions to classify a file or directory. Because the strictest expression that matches a file determines its type, a file with no extension falls through to a type whose expression does not require one. For example, if I had the types:

binary: ""
source: "\.c$"
object: "\.o$"

A file called "main.c" would be classified as source, and a file called "noextension" would be classified as binary.
Solar wrote: As for directories:

subdir/%.o: subdir/%.c
        $(CC) $(SUBOPTIONS) -o $@ -c $<

%.o: %.c
        $(CC) $(TOPOPTIONS) -o $@ -c $<
Having different rules for different directories works just fine. Or have I misunderstood your point? If yes, please elaborate.
My point is that you can have rules not only for specific subdirectories, but for subdirectories that match a regular expression in general. If I want, let's say, all the directories in the root of my repository to have a specific rule attached to them, I would classify them with the expression "^/[^/]*/$" (directories end with slashes). I don't have to add anything to the config file when I add a new subdirectory, or even have to enumerate the subdirectories to begin with. The expression also can be more strict, and not match directories you do not want included.
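
To show what that expression does and does not match, here is a small Python check. The sample paths are made up, and it assumes paths are written relative to the repository root with a leading slash.

```python
import re

ROOT_DIR = re.compile(r"^/[^/]*/$")  # the expression quoted above

# Hypothetical paths; directories end with a slash.
paths = ["/init/", "/libc/", "/init/sub/", "/README"]
root_dirs = [p for p in paths if ROOT_DIR.search(p)]
```

Only the top-level directories survive: nested directories contain an extra slash, and plain files lack the trailing one.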
Solar wrote:As for your example, I am 95% sure I would be able to translate that 1:1 to a Makefile, but I'm a bit fuzzy about $& and some other details of your syntax ('bin_proj - $."[^l].*\.[csh]"'?). Perhaps you could give me a commented version telling me what each statement actually does?
I'm in a bit of a hurry right now, so I'll post a fully commented version later, but here's an explanation of that line. It means that when building a target of type bin_proj (a directory), it is implied that everything within the directory (at any depth) with .c, .s, or .h extensions is a dependency, and is used to produce the end result. The $ is a variable that contains the specific target that is being requested, and the . concatenates it with the regular expression to produce the expression that finds the source and headers.
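
A rough Python sketch of that concatenation, assuming the target path is taken literally and the combined expression is matched from the start of each candidate path. `implied_dependencies` and the file list are hypothetical.

```python
import re

def implied_dependencies(target, pattern, files):
    """Return the files matched by the target path concatenated with the pattern."""
    # The target is escaped so it is matched literally; the pattern stays a regex.
    expression = re.compile(re.escape(target) + pattern)
    return [f for f in files if expression.match(f)]

# Hypothetical repository contents.
files = ["init/main.c", "init/sub/util.h", "init/README", "libc/string.c"]
deps = implied_dependencies("init/", r"[^l].*\.[csh]", files)
```

Here `deps` picks up the source and header under init/ at any depth and skips everything else.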