Page 4 of 9

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 4:15 pm
by Kevin
Brendan wrote:If you don't retain the benefits of my design (e.g. by doing a massive amount of work that hides the asynchronous nature under an "legacy emulation layer" that makes everything behave in a synchronous manner), I will do everything I possibly can to ensure your process will never be allowed to execute.
That's your personal decision, and I know and respect it. I wrote "Brendan will disagree" for a reason. But it doesn't necessarily mean that for other non-POSIX OSes it's not an option to create a native port. (But I think if you really put severe restrictions on what code will be allowed to run, developers might not like your OS very much.)
You solve it by removing complexity. Attempting to hide the complexity in a standard library does not remove the complexity, it just means that programmers have to deal with a complex standard library.
Complexity is in the requirements, not in the implementation (though a bad implementation can add unnecessary complexity). You can remove requirements, but users might not be happy about it.
You can do code sharing in many ways. Some are simple (e.g. cut & paste) and some are complex (e.g. templates). Similarly, you can do modularity in many ways (e.g. open source static libraries, closed/proprietary shared libraries, "services", etc), and some are simple and some are not.
Cut & paste is simple in the moment you do it. If you fix a bug in the code, you get to search all the copies of it, which isn't quite as simple any more. Especially if someone already made changes to some copies of the code without considering that there could be additional copies.

All the other options have their advantages and disadvantages as well. You can't call any of them inherently unnecessarily complex. It depends on the use case.
For an example, let's say you happen to use std::list in an application. You profile your application and find that std::list is a performance bottleneck, so you "hover over it and press F2" (or whatever) to open the std::list code in your IDE, then spend a few minutes to understand that code, and then tune it to perform better for your application's specific usage. This is so easy that anyone could do it (even a school kid that has only been programming for 3 weeks) because templates are awesome, right?

Wrong. More likely is that someone who has spent many thousands of $ and several years at university just to prepare themselves for the over-complicated language happens to open the std::list code in their IDE, vomits, then implements their own list handling for their special case because it's a whole lot easier.
I don't really speak C++, but somehow an assumption sneaked in that the standard library consists of badly written code, if and only if it contains generics. I don't see it backed by anything. Also, people who write bad code in the standard library would write the same bad code in an application as well, so I don't really see the improvement that you get from moving the list implementation to each single application. (Oh, by the way: In at least 99% of the cases, the implementation of the list is not a performance bottleneck. Surprise!)
Do you understand that it's extremely easy to have a brain fart and focus on trivial things (e.g. easily fixed mistakes) while ignore the larger picture (thousands of lines of convoluted "factory factory" crap that requires a degree in astrophysics to get away from quickly enough)?
Where did you ge the "factory factory crap" from?
If I have a piece of code that I can see, and you have a piece of code that's 300 times more complex that you can't see, who has more complexity?
I don't agree at all with the notion that a generic list is more complex than one for a specific type of objects. But even if it were so, the 300 times more complex thing is in the standard library, so someone else's problem. I have less complexity to maintain than you have.
I'm not arguing against the reuse of algorithms. I'm arguing for the reuse of algorithms in an open and easily understood, modified and customised form. I'm only arguing against the reuse of algorithms in an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke.
Considering that all of this is in reply to a post where I said that generics are useful, you're saying that generics provide an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke. Did I get that right? Perhaps you'd like to elaborate why you think so.
Your forum signature has a link to the tyndur wiki pages, and the web browser I'm using translates German into English fairly well.
Okay. That's not my personal OS, but a community OS in which I'm participating. Only perhaps a third of it is written by me. Anyway, what I described for my own OS applies to tyndur as well, through both I gained experience.
Kevin wrote:
They're people who can learn a language that's about as complex as C quickly enough to satisfy their curiosity and keep them interested; but they're also people who will decide to forget it if it looks complicated rather than fun.
Okay. Let's take C then and make it a bit easier and less error-prone by adding a few features like generics (perhaps also remove some features), and they'll be happy. In fact, generics are the one feature that I miss most in C, we might not need to add much more. But trying to emulate it with preprocessor magic is ugly and error-prone.
For beginners, most of the pitfalls of C are much more fundamental than that. They're things like confusion about the differences between arrays and pointers, why "x = a * b / c;" is giving them wrong results, accidentally relying on either undefined or implementation defined behaviour, bugs involving misunderstanding operator precedence or promotion rules or order of evaluation, function pointer syntax, all types of "out of bounds" problems, etc.
I didn't pick C as the base because I think it is the greatest language ever, but because you specifically mentioned C as good enough for these beginners. My reasoning is just that when I take a "good enough" language and improve it a little, it must still be good enough.
C++ is definitely a low level language (pointers, no GC, etc). C++ is also definitely a high level language (attempting to replace memory management with object creation and destruction, standard container abstractions, etc). This doesn't make it a "middle level language", it just makes it schizophrenic.
Not sure why you're talking about C++ now. I never mentioned it anywhere. I didn't even talk about templates, I always called the feature generics. As I said, I don't really speak C++, but from what people say I understand that its templates allow insanity that isn't required at all for generic programming. So you shouldn't read generics = C++ templates.

Somehow you seem to think that generics are inherently hard to use correctly and automatically increase complexity by a huge factor. I'm not sure why that is, I never had a lot of trouble with them. I have used the Ada and Java (perhaps Free Pascal as well, don't remember) flavours of them.

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 7:50 pm
by Brendan
Hi,
skeen wrote:
Brendan wrote:For what I'm planning to do (where executables are "portable" and compiled to native code the first time they're executed); there's only 2 cases - either everything has to be compiled (e.g. user hasn't used the executable since the "portable" version was installed), or nothing needs to be compiled (e.g. user has used the executable before so there's already a native executable).
How long would you accept the startup/compilation time to be, and how large of a the runtime impact is acceptable? - Because in terms of supporting 'slow-compiled' languages in this approach, one has either 1. compile everything up-front and accept the delay (which obviously needs to be considerably small), or 2. do lazy compilation whenever something is needed (when a function is called), or when it's likely to be needed soon (i.e. by lambda closure of functions dependencies).
I remember seeing studies about users and web server response times that indicated an 8 second limit between requesting something and seeing it. Basically (for web), if the user can't see it within 8 seconds they get impatient and/or assume something is wrong.

Using that as a rough basis for GUI apps, that gives a maximum of 8 seconds to compile to native, start it running, and then get a window displayed. You could probably split that in halves - up to 4 seconds for compiling, then up to 4 seconds to start the resulting executable and display a window. That's probably not the best way to split the 8 seconds (shouldn't take too long to start a process and display a window). However; I personally think 8 seconds is a large amount of time. It sounds short, but when you're waiting it's not. For an example, after reaching the end of this sentence, sit still (don't touch anything or move a muscle) for an entire 8 seconds.

Now let's consider a reasonably large application, like Bochs. That's a little more than 4 MiB of (compressed) source code and probably takes around 10 minutes to compile on my machine; but my machine is slightly faster than an old 80486. If we assume the user has half the CPU power that I do (e.g. a quad-core i7 rather than a pair of quad-core Xeons) we could assume it might take 20 minutes (or 1200 seconds). I want that cut down to about 4 seconds, or about 600 times faster than GCC. Sounds easy enough!

Now think about how "compiling" normally works. Lots of separate little source files (and all the headers they include) with disk heads bouncing all over the place trying to read them. If that's not bad enough, you're writing individual object files after compiling and then reading them back during linking. It's like it's been designed as a VFS stress test. On top of that there's the overhead of process creation (and termination) to get the compiler running for each source file; then there's a bunch of CPU time spent on parsing plain text into "tokens", doing syntax/sanity checks, building a list of user defined types, getting a symbol table, etc. Mostly; it's all incredibly stupid.

You might be wondering why this is "standard practice". There's 2 very good reasons for this....

The first reason is that it's impossible to find a computer with enough memory; so we have to break "compiling" into several separate steps (pre-process, compile, assembly, then link); and we also have to break the source code into many separate files. That way you only need to fit a small part of the "input", a small part of the "output" and only part of the compiler's code and data in memory at the same time (and you can use disk space to store all the intermediate files). It's the only way we can make a compiler that actually works on a typical computer that only has 20 KiB of RAM. Of course this was back in 1970. We have computers with more RAM now, so maybe this isn't a good reason at all now. ;)

The second reason is that there's no way on earth we'll ever see a CPU that's fast enough to do things like tokenising and maintaining lists of user defined types and symbols while a person is typing. I mean, do you want to press a key and wait 30 seconds? That's not going be anywhere near practical; so we have to push all that processing onto the compiler, and then compile the code as a batch job over the weekend. It's the only way. At least, it was the only way. Hmm, maybe this isn't a good reason either. :roll:

By forgetting about "standard practice" it'd be easy to slash compile times by a factor of 10. Instead of compiling something like Bochs in 1200 seconds that brings it down to 120 seconds. Still far too slow, but a little closer.

Here's a known, provable fact: assembly language is very fast, and C++ is very slow.

I'm not talking about the performance of the resulting code though - I'm talking about the speed of the tools. Most assemblers are IO bound, even with the overhead of parsing plain text. Most C++ compilers spend 3 days of CPU time just trying to figure out what "a * b" actually means; because let's face it, the compiler can't look at a tiny piece of code in isolation and know exactly what it does with no chance of being surprised/misled, and therefore can't look at a group of tiny pieces of code (an expression?) and know what it does with no chance of being surprised/misled, or... 8)

I'm fairly sure you get what I'm saying here - once you fix the "standard practice stupidity" problem, complex languages are the next largest cause of long build times. To get fast compile times you can't use complex languages. By using a very simple language you could easily slash compile times by a factor of 10 (once you've fixed "standard practice"). Instead of compiling something like Bochs in 120 seconds that might bring it down to 12 seconds. Still far too slow, but a little closer.

Sadly, there's "simple" and there's "too simple". To get compile times down that much you're going to have to get rid of a lot of good things, like type checking (in case someone does "unsigned int = x && 1;"), and making sure you don't allow variables to be set to out of range values (in case someone does "char c = EOF;"), and making sure grammar is sane (in case someone forgot a "break;" after a case statement) and all sorts of other things that catch obvious bugs and make programmer's lives easier.

The end user doesn't need all that sanity checking though - it's only really useful to the programmer. What you want is a tool that the programmer uses that does all the sanity checking, and for users to download/install files that have already passed all the sanity checks. Note: This also speeds things up a little for programmers like me, who use the compiler as a sanity checking tool and often compile without executing the result at all.

Of course if the programmers have to run some sort of sanity checking tool anyway; then that tool could also do "pre-compiling" work. Things like stripping comments; and doing constant folding, dead code elimination, inlining functions that are only used once, common subexpression elimination, etc. It could even insert various hints that the end-user's compiler can use, like how likely it is that an "if" statement's condition will be true or whatever.

That way the end-user's compiler doesn't just get a very simple language that it can compile very quickly, it gets a very simple language where half the work is already done. With half the work already done, something like Bochs wouldn't just be compiled in 12 seconds - it'd be down to 6 seconds.

Now let's think about your typical executables - often they're a big blob of code for a single process. For my OS (where the OS decides which computer on the LAN a process should be executed on for load balancing purposes) this isn't ideal. Instead of having one process per application, you want that broken up into several processes per application so that the load can be distributed across several computers. For a GUI application you'd expect at least 2 processes (user interface and "back end"). For something like Bochs I'd be wanting lots of processes (one for each emulated device, and one for the virtual CPU/s and RAM). When something is split into several processes, those processes can be run on different computers, but those executable files can also be compiled on different computers in parallel. That doesn't help when there's only one computer though.

The other thing is that if you're splitting applications into multiple processes, you notice that some can be used by multiple applications. A spell checker is a good example - there's no real reason you can't have one spell checker "service" that's used by many different applications (at that same time). You could have 3 different emulators all sharing the same "virtual PCI device" executables. Mostly anything that does a reasonable amount of processing (enough to justify the IPC costs) can be ripped out and implemented as a service.

You also notice that not all processes are needed when the application starts - you might compile and start a word processor "user interface" process, then (after it started) find out it wants a word processor "back end" process and compile that, then find out it wants a spell checker that's already been compiled because some other application used it before. Something like Bochs might have a "virtual machine manager" that starts first, and while the user is thinking (about whether to drag and drop the "NE2000 ethernet" icon into the virtual machine they're configuring) the OS could be compiling the process that emulates a sound card.

Basically, by ignoring "standard practice", using careful design and doing a few clever tricks; I think I can get just about anything compiled fast enough.
skeen wrote:
Brendan wrote:So you see a line of code that says "a = c * b;" and you decide it'd be a little nicer in alphabetical order and change it to "a = b * c;". It's just multiplication, so obviously this is fine. You commit it to whatever versioning system (commit note is "Minor code clean ups") and forget about it.

4 weeks later a release is made. When you get the bug report from frustrated users in 6 weeks time (which mostly all just say "code doesn't work right" with no useful details) and then spend 4 entire days trying to find a problem before you eventually discover the matrix multiplication bug; then you can pat yourself on the back and say "that wasn't complex at all!".. :)
First of all, I don't see why changing the order of the arguments should break the code. Multiplication has the commutative property, so should it's implementation, and hence changing the order of the arguments, shouldn't have any effect at all on the output.
Matrix multiplication is not commutative (except for a few special cases).
skeen wrote:But assuming that your implementation of the multiplication operator does not do what it should, then I don't see how this is different from; "a = matrix_mul(c,b)" ==> "a = matrix_mul(b,c)", if the implementation of this function doesn't have the commutative property either, then you have the exact same issue.
You don't have the exact same issue. Anyone that knows matrix multiplication isn't commutative wouldn't have created the bug (as you've made it obvious that it's not normal multiplication), and anyone that doesn't know what a matrix is wouldn't have created the bug (as you've made it obvious that it's not normal multiplication). Basically, the bug is still possible, but the chance of the bug actually happening is now a lot lower (as the number of people that now about matrices but also don't know matrix multiplication isn't commutative is very low).
skeen wrote:
Brendan wrote:What you're saying is that templates are good as long as you only use existing templates and nobody ever writes them, maintains them, improves them or debugs them.
My point is that, when you're using existing templates, then you don't have to maintain, improve or debug them, and that's good!
Why are you a programmer at all? Surely it's easier to use software that someone else wrote for you; so you don't have to maintain, improve or debug anything!
skeen wrote:However I do think you're exaggerating, on how advanced it is to write templates; for containers it's mostly replacing your payload data type with T. For template meta-programming, then yes, things may seem somewhat advances, but this is mainly because you leave OOP and procedural programming behind, and venture into the world of functional programming (which is truly a challenge if you never been there before).
If all you do is replace your payload data type with T, then you've probably ended up with a template that works with some data types (e.g. integers) and fails with others (e.g. strings); and while you may think that was easy some other sucker that tries to use it only to find it doesn't work probably won't think it's so easy.
skeen wrote:
Brendan wrote:
skeen wrote:If I want to use a list, then all I want is to be able to store my data (the interface), how the data is stored is up to the implementation, which I expect to do the right thing, i.e. split my data into hot and cold sections, ect.
Ah, I see. Before using "std::list" you send an email to the person that wrote it and let them know which pieces of your data are hot and which pieces are cold; and they optimise the generic implementation for your specific case (and "de-optimise" it for other people's specific cases). That way, you don't have to look at or understand the template, or deal with the complexity of templates, because you're still only using existing templates. Nice!
You do not send an email to the person, who wrote the list, instead you hint the implementation with information, i.e. hint which pieces of the data are cold/hot. One of the things I'm working with currently, is using the profiler's output to generate a header file, which is used to hint my generic libraries of optimization possibilities, in each given usage context.

C++11 provides a powerful way to do this, namely user defined literals, which are pretty much strings, that can be parsed at compile time, and used to create objects. The literals can for instance profiler information.
Given that C (and I assume C++ too) isn't even allowed to do basic things like reorder a structure's members; it must be very simple to use a simple template and a simple user defined literal and make the compiler split a single structure into 2 completely different structures.
skeen wrote:
Brendan wrote:I think (in general) people don't fully understand what I mean by "simple" (or its opposite, "complex").

My idea of simple is that you can look at a tiny piece of code in isolation and know exactly what it does with no chance of being surprised/misled; and therefore you can look at group of tiny pieces (a line of code?) and know exactly what it does with no chance of being surprised/misled; and therefore you can look at a group of groups (a function?) and know exactly what it does with no chance of being surprised/misled; and so on, all the way up to understanding every single piece of an extremely large program with no chance of being surprised/misled.

If you can't look at a tiny piece of code in isolation (e.g. like "a * b") and know exactly what it does with no chance of being surprised/misled; then you can't be sure you understand anything correctly at all.

For example, for C, you can't assume that someone hasn't done this somewhere:

Code: Select all

#define a foo +
int bar;
int *b = &bar;
..and therefore you can't assume that a tiny piece of code like "a * b" isn't actually doing something like "foo + bar".

That's not simple. That's confusion that increases the chance of introducing bugs. It's complex.
I truly agree that one should be able to understand the code by inspecting a few lines at a time. Nobody has the ability to comprehend the entire complexity of large software projects, and hence we need to hide complexity to the best of our knowledge. However I do not think that "a * b", is any worse than "matrix_mul(a, b)", and as for your example, one could also use a macro, to replace "matrix_mul" with something that does addition.
Hiding complexity just makes finding the complexity more complex! You don't want to hide it, you want to shift it to an obvious place so that it's out of the way but not hidden. A single '*' character doesn't tell you anything is different or where the operator was overloaded.

Note: For my language, there's no pre-processor and no macros either. For a tiny piece of code like "a * b" (and without any other information whatsoever) you know it's definitely normal multiplication, and (if it compiles without error) you also know that both 'a' and 'b' use a numerical type (integer or floating point) and neither can be a pointer, boolean, structure, etc.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 8:23 pm
by Owen
Brendan,

Here would be a good opportunity to test how fast you can make your AOT compiler:

Pick a random program. Say, Bochs, because you mentioned it.

Now make your compiler (GCC, Clang, whatever) use link time optimization, and time the link step.

Because that link step isn't doing anything your AOT compiler won't have to (With both GCC and Clang its' taking IR and compiling it into actual binary)

You'll probably find you're still way behind where you want to be (because compilation is full of NP-hard problems). And, hell, I've seen an (admittedly crazy!) library which took 5 minutes to link without link time optimization.

(This was in no small part because it was peppered with symbols thousands of characters long. Lots of them. With lots of common prefixes.. and with loads and loads of hash table collisions)

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 10:48 pm
by Brendan
Hi,
Kevin wrote:
Brendan wrote:If you don't retain the benefits of my design (e.g. by doing a massive amount of work that hides the asynchronous nature under an "legacy emulation layer" that makes everything behave in a synchronous manner), I will do everything I possibly can to ensure your process will never be allowed to execute.
That's your personal decision, and I know and respect it. I wrote "Brendan will disagree" for a reason. But it doesn't necessarily mean that for other non-POSIX OSes it's not an option to create a native port. (But I think if you really put severe restrictions on what code will be allowed to run, developers might not like your OS very much.)
I think the opposite. If you start porting something like GCC, then you're going to have to have binutils, and then someone will beg for nano or emacs or vim, and someone else will port python, and then Apache, and...

The next thing you know people are doing benchmarks that show all the apps are slower than they are on Linux and FreeBSD, and some guy calling himself Richard is going around saying the OS should be referred to as "GNU/Týndur"; and if there is anything special about the OS nobody knows or cares because none of the applications that were ported use any of your OS's unique features and nobody is going to write any native application that makes your OS look good because it's easier to just port something.
Kevin wrote:
You solve it by removing complexity. Attempting to hide the complexity in a standard library does not remove the complexity, it just means that programmers have to deal with a complex standard library.
Complexity is in the requirements, not in the implementation (though a bad implementation can add unnecessary complexity). You can remove requirements, but users might not be happy about it.
For one special case, the complexity is in the requirements for that one special case. For generic code the complexity is in the requirements for every special case that anyone can ever think of.

10 separate pieces of code (for 10 special cases) is simpler than one generic piece of code that's 10 times as complex, especially if you only need to look at one of those separate pieces of code.
Kevin wrote:
You can do code sharing in many ways. Some are simple (e.g. cut & paste) and some are complex (e.g. templates). Similarly, you can do modularity in many ways (e.g. open source static libraries, closed/proprietary shared libraries, "services", etc), and some are simple and some are not.
Cut & paste is simple in the moment you do it. If you fix a bug in the code, you get to search all the copies of it, which isn't quite as simple any more. Especially if someone already made changes to some copies of the code without considering that there could be additional copies.
I've already got provisions for that. The idea is that you set the "origin" of a piece of code (e.g. a URL), and the IDE will check for updated versions and let you know if there's been changes to the original, and assist with updating your project's copy. The nice thing about this is that it requires explicit action from the programmers responsible for the project - dodgy/incompatible "upstream" changes won't automatically break your code (unlike third-party libraries that either can break your code or will result in version checking dependency mess).
Kevin wrote:
For an example, let's say you happen to use std::list in an application. You profile your application and find that std::list is a performance bottleneck, so you "hover over it and press F2" (or whatever) to open the std::list code in your IDE, then spend a few minutes to understand that code, and then tune it to perform better for your application's specific usage. This is so easy that anyone could do it (even a school kid that has only been programming for 3 weeks) because templates are awesome, right?

Wrong. More likely is that someone who has spent many thousands of $ and several years at university just to prepare themselves for the over-complicated language happens to open the std::list code in their IDE, vomits, then implements their own list handling for their special case because it's a whole lot easier.
I don't really speak C++, but somehow an assumption sneaked in that the standard library consists of badly written code, if and only if it contains generics. I don't see it backed by anything. Also, people who write bad code in the standard library would write the same bad code in an application as well, so I don't really see the improvement that you get from moving the list implementation to each single application. (Oh, by the way: In at least 99% of the cases, the implementation of the list is not a performance bottleneck. Surprise!)
I didn't assume that the C++ standard library consists of badly written code, I assumed that the C++ standard library consists of code that correctly handles all possible corner-cases in all possible situations, and is therefore well written and very complex; and "generic" rather than optimised specifically for your special case.

Oh, by the way: in at least 99% of all cases, an example is only an example.
Kevin wrote:
Do you understand that it's extremely easy to have a brain fart and focus on trivial things (e.g. easily fixed mistakes) while ignore the larger picture (thousands of lines of convoluted "factory factory" crap that requires a degree in astrophysics to get away from quickly enough)?
Where did you ge the "factory factory crap" from?
Surely if one level of stupidity makes things simpler, then 2 levels of stupidity should make things twice as simple. I mean; why write code (templates) that produce code at compile time when you could write code (meta-templates) that produce code (templates) that produce code at compile time?

What we really should be doing is using something like Perl to generate meta-templates that generate templates that generate code; so that we can have a "factory factory factory" and make everything 3 times as simple!
Kevin wrote:
If I have a piece of code that I can see, and you have a piece of code that's 300 times more complex that you can't see, who has more complexity?
I don't agree at all with the notion that a generic list is more complex than one for a specific type of objects. But even if it were so, the 300 times more complex thing is in the standard library, so someone else's problem. I have less complexity to maintain than you have.
Look at this topic's title. If you were designing your own language and writing your own compiler from scratch, then you're the one that's going to be writing those templates. It's not someone else's problem.
Kevin wrote:
I'm not arguing against the reuse of algorithms. I'm arguing for the reuse of algorithms in an open and easily understood, modified and customised form. I'm only arguing against the reuse of algorithms in an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke.
Considering that all of this is in reply to a post where I said that generics are useful, you're saying that generics provide an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke. Did I get that right? Perhaps you'd like to elaborate why you think so.
Why don't you think so? If you want to see implementation, here's some implementation for the "3 lines of deleting an entry of a list" that was mentioned previously (from Apache's C++ standard library in their SVN):

Code: Select all

void pop_back () {
        _RWSTD_ASSERT (!empty ());
        erase (--end ());
    }
Except, that's not all of it - it mostly just calls things elsewhere. Here's something that might be the "end()":

Code: Select all

iterator end () {
        return _C_make_iter (_C_node);
    }
Well, more "things that call things" that provide no real clue of anything actually happening, yay! Here's an "erase()":

Code: Select all

list<_TypeT, _Allocator>::erase (iterator __it)
{
    _RWSTD_ASSERT_RANGE (begin (), __it);

    if (__it == end ())
        return end ();

    iterator __tmp =
        _C_make_iter (_C_link_type ((*_ITER_NODE (__it))._C_next));

    (*(_C_link_type ((*_ITER_NODE (__it))._C_prev)))._C_next =
        (*_ITER_NODE (__it))._C_next;
    (*(_C_link_type ((*_ITER_NODE (__it))._C_next)))._C_prev =
        (*_ITER_NODE (__it))._C_prev;

    --_C_length;

    _RWSTD_VALUE_ALLOC (_C_value_alloc_type, *this,
                        destroy (_RWSTD_VALUE_ALLOC (_C_value_alloc_type,
                            *this, address ((*_ITER_NODE (__it))._C_data))));
    _C_put_node (_ITER_NODE (__it));

    return __tmp;
}
Now, this is still a confusing mess that just calls things elsewhere. I either couldn't find "_C_make_iter" or I found 4 of them (I honestly don't know). I'm not too sure what's happening with "_ITER_NODE" (it's a macro that's undefined then redefined in the middle of the file for no obvious reason, that seems to expand to a longer version of itself). The "--_C_length;" part I like the look of (but I suspect they aren't actually reducing the length of the C language there). The whole thing just looks like the aftermath of an explosion in a secret government "bracket and underscore" storage facility to me; and I'm still no closer to understanding what any of it actually does underneath all the macros and calls and spaghetti.

Of course this is just a "simple" doubly linked list. I couldn't find a nice and lean singly linked list, and I wouldn't dare risk a glance at something fancy like a FIFO queue.
Kevin wrote:
They're people who can learn a language that's about as complex as C quickly enough to satisfy their curiosity and keep them interested; but they're also people who will decide to forget it if it looks complicated rather than fun.
Okay. Let's take C then and make it a bit easier and less error-prone by adding a few features like generics (perhaps also remove some features), and they'll be happy. In fact, generics are the one feature that I miss most in C, we might not need to add much more. But trying to emulate it with preprocessor magic is ugly and error-prone.
For beginners, most of the pitfalls of C are much more fundamental than that. They're things like confusion about the differences between arrays and pointers, why "x = a * b / c;" is giving them wrong results, accidentally relying on either undefined or implementation defined behaviour, bugs involving misunderstanding operator precedence or promotion rules or order of evaluation, function pointer syntax, all types of "out of bounds" problems, etc.
I didn't pick C as the base because I think it is the greatest language ever, but because you specifically mentioned C as good enough for these beginners. My reasoning is just that when I take a "good enough" language and improve it a little, it must still be good enough.[/quote]

Your reasoning is that if a language is "good enough for people to learn quickly", you'll make it "better" by placing a huge hurdle in there to slow beginners down.

The thing is, you don't want beginners to deal with writing generics either. You just want shrink wrapped code that they can use but not read or write. In that case, build things like containers into the language itself instead of using some Rube Goldberg contraption to duct-tape them onto the side. It certainly wouldn't be the first language to have something like linked lists as a built in feature.
Somehow you seem to think that generics are inherently hard to use correctly and automatically increase complexity by a huge factor. I'm not sure why that is, I never had a lot of trouble with them. I have used the Ada and Java (perhaps Free Pascal as well, don't remember) flavours of them.
Words like "huge" and "hard" are relative. A drop of water is huge and hard (if you're an ant moving at super-sonic speeds) and you'll find that cars are small and soft if you drop a boulder from plane.

What I do think is that the benefits are not worth the increase in difficulty/complexity; regardless of how large/small the increase is.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 11:06 pm
by Brendan
Hi,
Owen wrote:Pick a random program. Say, Bochs, because you mentioned it.

Now make your compiler (GCC, Clang, whatever) use link time optimization, and time the link step.
Sounds like a good theory, except that for link-time optimisation the linker is still loading many object files, and the compiler has made optimisation harder by discarding useful information (e.g. I doubt the GIMPLE information that is added includes a call graph or anything), and that it's less than 2 years old (unlikely to be as fast as possible), and that it's an afterthought rather than part of the original design of GCC (likely to suck even when it has matured), and that it wouldn't include the difference smaller processes/executables would make, and that it wouldn't include any of the benefits of doing "whole program pre-optimisation" beforehand.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Tue Jan 21, 2014 11:07 pm
by Joshw
Geez... this crowd might be a little too intense for me... :)

Josh

Re: Self hosting OS/Compiler from scratch

Posted: Wed Jan 22, 2014 3:06 am
by Kevin
Brendan wrote:Now let's consider a reasonably large application, like Bochs. That's a little more than 4 MiB of (compressed) source code and probably takes around 10 minutes to compile on my machine; but my machine is slightly faster than an old 80486. If we assume the user has half the CPU power that I do (e.g. a quad-core i7 rather than a pair of quad-core Xeons) we could assume it might take 20 minutes (or 1200 seconds). I want that cut down to about 4 seconds, or about 600 times faster than GCC. Sounds easy enough!
You're making up numbers out of thin air in your whole posting. This one is easy enough to try out:

Code: Select all

$ time make -j4
...
real    0m19.122s
user    1m4.133s
sys     0m4.471s
Your factors of 10 can't be tried out because both your tools and the Bochs source in the modified language are missing. I suspect they aren't much more accurate than this one.
That way the end-user's compiler doesn't just get a very simple language that it can compile very quickly, it gets a very simple language where half the work is already done. With half the work already done, something like Bochs wouldn't just be compiled in 12 seconds - it'd be down to 6 seconds.
True. But now the end user doesn't have source code any more, but an intermediate representation. Again, you haven't got rid of complexity, you just moved it to somewhere else. (Which is fine, compiling source code shouldn't be the job of the end user. There are reasons why all widespread OSes and applications come in binary form and source is only additional, if available at all.)

The only problem is that now you shouldn't be comparing to a real compile process any more, but rather to the startup of a Java application. You can, of course, keep comparing to a full compile, and your solution will look great, but that's self-deceiving.
For something like Bochs I'd be wanting lots of processes (one for each emulated device, and one for the virtual CPU/s and RAM). When something is split into several processes, those processes can be run on different computers, but those executable files can also be compiled on different computers in parallel. That doesn't help when there's only one computer though.
This is the final step that is still missing to slow down Bochs to a level that it runs backwards.
The other thing is that if you're splitting applications into multiple processes, you notice that some can be used by multiple applications. A spell checker is a good example - there's no real reason you can't have one spell checker "service" that's used by many different applications (at that same time). You could have 3 different emulators all sharing the same "virtual PCI device" executables. Mostly anything that does a reasonable amount of processing (enough to justify the IPC costs) can be ripped out and implemented as a service.
That's a great idea. Mainstream OSes call it shared libraries. (You can, of course, use other protocols and use it over the network if you like to slow it down. The OS could even make it transparent, except for the timing.)
Why are you a programmer at all? Surely it's easier to use software that someone else wrote for you; so you don't have to maintain, improve or debug anything!
My guess is so he can't solve yet unsolved problems. Not so much because he loves reinventing the wheel. (Reinventing it once can be fun, but it gets tiresome if you have to do it all the time.)

Re: Self hosting OS/Compiler from scratch

Posted: Wed Jan 22, 2014 7:06 am
by skeen
Brendan wrote: I remember seeing studies about users and web server response times that indicated an 8 second limit between requesting something and seeing it. Basically (for web), if the user can't see it within 8 seconds they get impatient and/or assume something is wrong.

Using that as a rough basis for GUI apps, that gives a maximum of 8 seconds to compile to native, start it running, and then get a window displayed. You could probably split that in halves - up to 4 seconds for compiling, then up to 4 seconds to start the resulting executable and display a window. That's probably not the best way to split the 8 seconds (shouldn't take too long to start a process and display a window). However; I personally think 8 seconds is a large amount of time. It sounds short, but when you're waiting it's not. For an example, after reaching the end of this sentence, sit still (don't touch anything or move a muscle) for an entire 8 seconds.
We can easily agree, that 8 seconds is a looong time, and too long indeed.
Brendan wrote: Now let's consider a reasonably large application, like Bochs. That's a little more than 4 MiB of (compressed) source code and probably takes around 10 minutes to compile on my machine; but my machine is slightly faster than an old 80486. If we assume the user has half the CPU power that I do (e.g. a quad-core i7 rather than a pair of quad-core Xeons) we could assume it might take 20 minutes (or 1200 seconds). I want that cut down to about 4 seconds, or about 600 times faster than GCC. Sounds easy enough!

Now think about how "compiling" normally works. Lots of separate little source files (and all the headers they include) with disk heads bouncing all over the place trying to read them. If that's not bad enough, you're writing individual object files after compiling and then reading them back during linking. It's like it's been designed as a VFS stress test. On top of that there's the overhead of process creation (and termination) to get the compiler running for each source file; then there's a bunch of CPU time spent on parsing plain text into "tokens", doing syntax/sanity checks, building a list of user defined types, getting a symbol table, etc. Mostly; it's all incredibly stupid.

You might be wondering why this is "standard practice". There's 2 very good reasons for this....

The first reason is that it's impossible to find a computer with enough memory; so we have to break "compiling" into several separate steps (pre-process, compile, assembly, then link); and we also have to break the source code into many separate files. That way you only need to fit a small part of the "input", a small part of the "output" and only part of the compiler's code and data in memory at the same time (and you can use disk space to store all the intermediate files). It's the only way we can make a compiler that actually works on a typical computer that only has 20 KiB of RAM. Of course this was back in 1970. We have computers with more RAM now, so maybe this isn't a good reason at all now. ;)

The second reason is that there's no way on earth we'll ever see a CPU that's fast enough to do things like tokenising and maintaining lists of user defined types and symbols while a person is typing. I mean, do you want to press a key and wait 30 seconds? That's not going be anywhere near practical; so we have to push all that processing onto the compiler, and then compile the code as a batch job over the weekend. It's the only way. At least, it was the only way. Hmm, maybe this isn't a good reason either. :roll:

By forgetting about "standard practice" it'd be easy to slash compile times by a factor of 10. Instead of compiling something like Bochs in 1200 seconds that brings it down to 120 seconds. Still far too slow, but a little closer.
As I started before, I don't mind my compile times being long, if the incremental compile time is low (and thereby the debug cycle time). Having a low incremental compile time, can be achieved for instance by compiling small potions of the code at a time. This may be troublesome, whenever you need to recompile the hole program, however this is something you should only do rarely, and whenever you have a break off, ect.

That being said; we can easily agree, that compilers and some languages are indeed designed for how the world was back in 1970. However providing a nice type system, and static analysis is indeed something that was not entirely possible back in 1970, and is one of the main areas to look out for in compilers in the years to come. As this is where bad programmers like me gets help and guidance, I'm merely a human being, and hence I do write buggy code, so whenever there's a tool to tell me I did something wrong; I welcome it.

Clang is currently working on a list of helpful tools, called sanitizers, which do a lot of static and compiler enabled runtime analysis of the code. Their goal is to catch 80% of bugs with static analysis and the rest via runtime analysis.

Chandler Carruth did a presentation about this, at Going Native last year. If you guys haven't checked it out, consider doing so, the talk can be found at;
http://channel9.msdn.com/Events/GoingNa ... -s-Dragons
Brendan wrote: Here's a known, provable fact: assembly language is very fast, and C++ is very slow.

I'm not talking about the performance of the resulting code though - I'm talking about the speed of the tools. Most assemblers are IO bound, even with the overhead of parsing plain text. Most C++ compilers spend 3 days of CPU time just trying to figure out what "a * b" actually means; because let's face it, the compiler can't look at a tiny piece of code in isolation and know exactly what it does with no chance of being surprised/misled, and therefore can't look at a group of tiny pieces of code (an expression?) and know what it does with no chance of being surprised/misled, or... 8)
I get the point that assembly language is indeed very fast to compile, but I actually don't think that's a huuge advantage.

I'd rather have a slow compiled language, which offers me the support, warnings and errors that I need in order to ensure program correctness, rather than I'd have a language which compiles super fast. Because I can compile my slow high level language to some low level language, which is reasonably fast to compile.
Brendan wrote: I'm fairly sure you get what I'm saying here - once you fix the "standard practice stupidity" problem, complex languages are the next largest cause of long build times. To get fast compile times you can't use complex languages. By using a very simple language you could easily slash compile times by a factor of 10 (once you've fixed "standard practice"). Instead of compiling something like Bochs in 120 seconds that might bring it down to 12 seconds. Still far too slow, but a little closer.

Sadly, there's "simple" and there's "too simple". To get compile times down that much you're going to have to get rid of a lot of good things, like type checking (in case someone does "unsigned int = x && 1;"), and making sure you don't allow variables to be set to out of range values (in case someone does "char c = EOF;"), and making sure grammar is sane (in case someone forgot a "break;" after a case statement) and all sorts of other things that catch obvious bugs and make programmer's lives easier.

The end user doesn't need all that sanity checking though - it's only really useful to the programmer. What you want is a tool that the programmer uses that does all the sanity checking, and for users to download/install files that have already passed all the sanity checks. Note: This also speeds things up a little for programmers like me, who use the compiler as a sanity checking tool and often compile without executing the result at all.
Clang provides a tool for checking the source code only, it's called ClangCheck, alternatively there's flag to Clang '-fsyntax-only', which does checking only.
Brendan wrote: Of course if the programmers have to run some sort of sanity checking tool anyway; then that tool could also do "pre-compiling" work. Things like stripping comments; and doing constant folding, dead code elimination, inlining functions that are only used once, common subexpression elimination, etc. It could even insert various hints that the end-user's compiler can use, like how likely it is that an "if" statement's condition will be true or whatever.

That way the end-user's compiler doesn't just get a very simple language that it can compile very quickly, it gets a very simple language where half the work is already done. With half the work already done, something like Bochs wouldn't just be compiled in 12 seconds - it'd be down to 6 seconds.
Most compilers have intermediate representation languages, and from what I'm hearing it sounds like you're almost designing such one. Clang provides a 'portable' low-level-code target language (namely the LLVM language), which can at execution time be compiled to native assembly, or run on a virtual machine. I'm curious as to how your approach is different from this.
Brendan wrote: Now let's think about your typical executables - often they're a big blob of code for a single process. For my OS (where the OS decides which computer on the LAN a process should be executed on for load balancing purposes) this isn't ideal. Instead of having one process per application, you want that broken up into several processes per application so that the load can be distributed across several computers. For a GUI application you'd expect at least 2 processes (user interface and "back end"). For something like Bochs I'd be wanting lots of processes (one for each emulated device, and one for the virtual CPU/s and RAM). When something is split into several processes, those processes can be run on different computers, but those executable files can also be compiled on different computers in parallel. That doesn't help when there's only one computer though.

The other thing is that if you're splitting applications into multiple processes, you notice that some can be used by multiple applications. A spell checker is a good example - there's no real reason you can't have one spell checker "service" that's used by many different applications (at that same time). You could have 3 different emulators all sharing the same "virtual PCI device" executables. Mostly anything that does a reasonable amount of processing (enough to justify the IPC costs) can be ripped out and implemented as a service.

You also notice that not all processes are needed when the application starts - you might compile and start a word processor "user interface" process, then (after it started) find out it wants a word processor "back end" process and compile that, then find out it wants a spell checker that's already been compiled because some other application used it before. Something like Bochs might have a "virtual machine manager" that starts first, and while the user is thinking (about whether to drag and drop the "NE2000 ethernet" icon into the virtual machine they're configuring) the OS could be compiling the process that emulates a sound card.

Basically, by ignoring "standard practice", using careful design and doing a few clever tricks; I think I can get just about anything compiled fast enough.
I love the idea of splitting a program into services or tools, and sharing these between several programs, and I'm indeed doing so with my projects, and just to quote the Doug McIlroy;
"Write programs that do one thing and do it well. Write programs to work together."

As for the lazy compilation of units, I kinda like that, at least if it doesn't have result in unpredictable hang times, which I guess can be solved by doing lambda closure on the callable functions and compiling just-before-time.

Also for your project, are you considering have a compiler service running, or?
Brendan wrote:
skeen wrote:
Brendan wrote:So you see a line of code that says "a = c * b;" and you decide it'd be a little nicer in alphabetical order and change it to "a = b * c;". It's just multiplication, so obviously this is fine. You commit it to whatever versioning system (commit note is "Minor code clean ups") and forget about it.

4 weeks later a release is made. When you get the bug report from frustrated users in 6 weeks time (which mostly all just say "code doesn't work right" with no useful details) and then spend 4 entire days trying to find a problem before you eventually discover the matrix multiplication bug; then you can pat yourself on the back and say "that wasn't complex at all!".. :)
First of all, I don't see why changing the order of the arguments should break the code. Multiplication has the commutative property, so should it's implementation, and hence changing the order of the arguments, shouldn't have any effect at all on the output.
Matrix multiplication is not commutative (except for a few special cases).
You're right, and I should stop answering these forums threads at 2am.

However this just means, that because mathematicians haven't been consistent with the commutativity of multiplication, then you can't assume that multiplication is commutative whenever you see it. And you'll have to lookup what the types are. And I guess that you'll have to do the same thing with 'matrix_mul', assuming you're supporting function overloading.
Brendan wrote:
skeen wrote:But assuming that your implementation of the multiplication operator does not do what it should, then I don't see how this is different from; "a = matrix_mul(c,b)" ==> "a = matrix_mul(b,c)", if the implementation of this function doesn't have the commutative property either, then you have the exact same issue.
You don't have the exact same issue. Anyone that knows matrix multiplication isn't commutative wouldn't have created the bug (as you've made it obvious that it's not normal multiplication), and anyone that doesn't know what a matrix is wouldn't have created the bug (as you've made it obvious that it's not normal multiplication). Basically, the bug is still possible, but the chance of the bug actually happening is now a lot lower (as the number of people that now about matrices but also don't know matrix multiplication isn't commutative is very low).
I guess that I'm one of those people, at least at 2am! - However in C++ what you could do, is to have the sizes of a matrix be given by template arguments, that would; 1. Help the compiler generate faster code (as the matrix is now static in size), and 2. Yield a compilation error at the assignment (because you're assigning matrices of different dimensions).

However I agree the issue is still there, whenever you need dynamic sized matrices, but the issue has more to do with the assumption; that multiplication is commutative rather than with the code itself, at least in my opinion (or with the assumption that multiplication is only defined for integers and floats).
Brendan wrote:
skeen wrote:
Brendan wrote:What you're saying is that templates are good as long as you only use existing templates and nobody ever writes them, maintains them, improves them or debugs them.
My point is that, when you're using existing templates, then you don't have to maintain, improve or debug them, and that's good!
Why are you a programmer at all? Surely it's easier to use software that someone else wrote for you; so you don't have to maintain, improve or debug anything!
I agree, it would be easier to get someone else to write my software for me! - So uhm, will you work for free? - Because I doubt all the software of my liking will be created already.

Also, I like to think of my self as the factory worker a little up the conveyor, namely where the product is being assembled, rather than down the conveyor, where the pieces are being made. That is, if somebody else has already created the piece I need, then I'll use it, rather than spend time recreating it. So in short; I put pieces together, that's my thing.

So why am I building an OS, when I could just piece together something; well because it's a hobby OS.
Brendan wrote:
skeen wrote:However I do think you're exaggerating, on how advanced it is to write templates; for containers it's mostly replacing your payload data type with T. For template meta-programming, then yes, things may seem somewhat advances, but this is mainly because you leave OOP and procedural programming behind, and venture into the world of functional programming (which is truly a challenge if you never been there before).
If all you do is replace your payload data type with T, then you've probably ended up with a template that works with some data types (e.g. integers) and fails with others (e.g. strings); and while you may think that was easy some other sucker that tries to use it only to find it doesn't work probably won't think it's so easy.
Why would it break with strings? - The idea of a container is mainly that you store the data, and is able to retrieve it later. Whether I hand you back an int, or a string doesn't matter a lot.
Brendan wrote:
skeen wrote:
Brendan wrote: Ah, I see. Before using "std::list" you send an email to the person that wrote it and let them know which pieces of your data are hot and which pieces are cold; and they optimise the generic implementation for your specific case (and "de-optimise" it for other people's specific cases). That way, you don't have to look at or understand the template, or deal with the complexity of templates, because you're still only using existing templates. Nice!
You do not send an email to the person, who wrote the list, instead you hint the implementation with information, i.e. hint which pieces of the data are cold/hot. One of the things I'm working with currently, is using the profiler's output to generate a header file, which is used to hint my generic libraries of optimization possibilities, in each given usage context.

C++11 provides a powerful way to do this, namely user defined literals, which are pretty much strings, that can be parsed at compile time, and used to create objects. The literals can for instance profiler information.
Given that C (and I assume C++ too) isn't even allowed to do basic things like reorder a structure's members; it must be very simple to use a simple template and a simple user defined literal and make the compiler split a single structure into 2 completely different structures.
C++ allows you to 'mark' you structure members to be reorder, that is via access specifiers. Example;

Code: Select all

struct reorder_members
{
     public: int a;
     public: int b;
};
The compiler is not allowed to reorder members within a access specifier, but it is allowed to reorder access specifier blocks (section 9.2.12 of the ISO standard, if you care).

As for splitting data into two different structures; It's 'just' a matter of doing compile-time reflection on the given data structure, and generating compile-time data structures for hot and cold section data (Remember that C++ templates are Turing Complete, and hence you can generate code for pretty much everything).

The template for doing this, is not super simple, but with the introduction of constexpr functions (C++11) and the relaxation of these (C++14), it has become a lot easier, and can now be done using the procedural paradigm.

Also if this kind of thing is something that is used frequently, it could be candidate to the standard library (likely by Work Group7; Reflection), whereby you don't have to implement it, just use it, or make use of standard data structures, which use it.
Brendan wrote:
skeen wrote: I truly agree that one should be able to understand the code by inspecting a few lines at a time. Nobody has the ability to comprehend the entire complexity of large software projects, and hence we need to hide complexity to the best of our knowledge. However I do not think that "a * b", is any worse than "matrix_mul(a, b)", and as for your example, one could also use a macro, to replace "matrix_mul" with something that does addition.
Hiding complexity just makes finding the complexity more complex! You don't want to hide it, you want to shift it to an obvious place so that it's out of the way but not hidden. A single '*' character doesn't tell you anything is different or where the operator was overloaded.

Note: For my language, there's no pre-processor and no macros either. For a tiny piece of code like "a * b" (and without any other information whatsoever) you know it's definitely normal multiplication, and (if it compiles without error) you also know that both 'a' and 'b' use a numerical type (integer or floating point) and neither can be a pointer, boolean, structure, etc.
I agree, that having the complexity be visible is good, assuming you can deal with the entire complexity of the system, however in large scale software, you simply cannot deal with the complexity of the entire system, and hence you are forced to hide it. This makes it more complex finding the complexity, true, but why would you want to even deal with it, when you have an interface?

If the interface isn't fulfilled then you file a ticket, easy as that; you don't have to deal with the implementation, just the interface! The implementation is somebody else's problem.

As for the multiplication operation, while it doesn't tell you where the operator was overloaded, neither does a function name, and the lookup (for both) can easily be done, via automated tools, to quote yourself; "We have computers with more RAM now", so there's not really a reason not to do so.

And the argument that maybe the language is just soo complex that you need tools to deal with it; well you need an assembler to deal with assembly ;)

Re: Self hosting OS/Compiler from scratch

Posted: Wed Jan 22, 2014 7:45 am
by skeen
Brendan wrote:
Kevin wrote:
Brendan wrote:Do you understand that it's extremely easy to have a brain fart and focus on trivial things (e.g. easily fixed mistakes) while ignore the larger picture (thousands of lines of convoluted "factory factory" crap that requires a degree in astrophysics to get away from quickly enough)?
Where did you ge the "factory factory crap" from?
Surely if one level of stupidity makes things simpler, then 2 levels of stupidity should make things twice as simple. I mean; why write code (templates) that produce code at compile time when you could write code (meta-templates) that produce code (templates) that produce code at compile time?

What we really should be doing is using something like Perl to generate meta-templates that generate templates that generate code; so that we can have a "factory factory factory" and make everything 3 times as simple!
"Everything Should Be Made as Simple as Possible, But Not Simpler" - Quote; Some smart guy

So what you're suggesting against, is that we add a factories which;
  • * Translates from assembly to machine code (one level of stupidity)
    * Translates from some low level portable code to assembly (two levels of stupidity)
    * Translates from some high level code to some low level portable code (three levels of stupidity)
I honestly believe that these three levels of stupidity has helped me out a lot in dealing with complexity alike. I honestly prefer writing C++ over hex, but that could just be my lack of experience 'programming?' in hex, also I guess that portability is going to be an issue with hex.

I do get what your trying to say, but my point is that, we should settle at the abstraction we find the best. If you're happy dealing with hex, then do so, however if you'd rather deal with C, then do so. I personality dislike reinventing the wheel each time I need a list, and hence I like writing generic code. Heck I enjoy writing meta-template programming; I haven't really done a lot of meta-meta programming though.

But if someone was to do meta-n,...,meta-programming and was comfortable doing that, then I'd not be telling them they were doing something wrong; I'd be amazed!
Brendan wrote:
Kevin wrote:
If I have a piece of code that I can see, and you have a piece of code that's 300 times more complex that you can't see, who has more complexity?
I don't agree at all with the notion that a generic list is more complex than one for a specific type of objects. But even if it were so, the 300 times more complex thing is in the standard library, so someone else's problem. I have less complexity to maintain than you have.
Look at this topic's title. If you were designing your own language and writing your own compiler from scratch, then you're the one that's going to be writing those templates. It's not someone else's problem.
While it's true, that the problem would me mine given these circumstances, then I'd rather have to deal with that task when developing my standard library, then push it onto my users (i.e. the people who are going to be missing a list implementation).
Brendan wrote:
Kevin wrote:
I'm not arguing against the reuse of algorithms. I'm arguing for the reuse of algorithms in an open and easily understood, modified and customised form. I'm only arguing against the reuse of algorithms in an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke.
Considering that all of this is in reply to a post where I said that generics are useful, you're saying that generics provide an overly complicated pain in the neck form that's hidden behind your back in a massive library of puke. Did I get that right? Perhaps you'd like to elaborate why you think so.
** PROOF THAT STANDARD LIBRARIES ARE MOSTLY UNREADABLE HERE **
I truly agree that most standard libraries are *UNREADABLE*, expect to their developers, but here's the good part;
You don't have to read it! - The standard sets forth an interface, and all you have to do, is to deal with that interface, the implementation of the interface, is not your issue!

Remember that C++ is a language developed for library writes, that is it's a language designed to allow library writes to build good interfaces, as all you have to deal with is the interface, not the implementation of it.
Brendan wrote:
Kevin wrote: For beginners, most of the pitfalls of C are much more fundamental than that. They're things like confusion about the differences between arrays and pointers, why "x = a * b / c;" is giving them wrong results, accidentally relying on either undefined or implementation defined behaviour, bugs involving misunderstanding operator precedence or promotion rules or order of evaluation, function pointer syntax, all types of "out of bounds" problems, etc.
I didn't pick C as the base because I think it is the greatest language ever, but because you specifically mentioned C as good enough for these beginners. My reasoning is just that when I take a "good enough" language and improve it a little, it must still be good enough.
Your reasoning is that if a language is "good enough for people to learn quickly", you'll make it "better" by placing a huge hurdle in there to slow beginners down.

The thing is, you don't want beginners to deal with writing generics either. You just want shrink wrapped code that they can use but not read or write. In that case, build things like containers into the language itself instead of using some Rube Goldberg contraption to duct-tape them onto the side. It certainly wouldn't be the first language to have something like linked lists as a built in feature.
If you were to actually include the standard library as a built in feature, you would;
  • * Not be a systems programming language any more (as you'd need runtime support)
    * Not be able to change the implementation of the standard library, without changing compiler
    * Break the unix philosophy, by doing combining separate responsibilities.
    * ... the list goes one
Remember; There's a reason why the separation was made in the first place!

Ask yourself is that why should the compiler deal with things, which can as easily be dealt with by the standard library? - just such that you don't have to write "#include <stdio.h>"? - Well have the compiler put that for you then ;)
Brendan wrote:
Somehow you seem to think that generics are inherently hard to use correctly and automatically increase complexity by a huge factor. I'm not sure why that is, I never had a lot of trouble with them. I have used the Ada and Java (perhaps Free Pascal as well, don't remember) flavours of them.
Words like "huge" and "hard" are relative. A drop of water is huge and hard (if you're an ant moving at super-sonic speeds) and you'll find that cars are small and soft if you drop a boulder from plane.

What I do think is that the benefits are not worth the increase in difficulty/complexity; regardless of how large/small the increase is.
Just because you don't see the benefits of generics, does this mean it's always bad?

Re: Self hosting OS/Compiler from scratch

Posted: Wed Jan 22, 2014 8:07 am
by bwat
skeen wrote: But if someone was to do meta-n,...,meta-programming and was comfortable doing that, then I'd not be telling them they were doing something wrong; I'd be amazed!
Some people are doing this quite often. The following is from Tracing Requirements for Multi-layered Meta-programming, Bowles and Wilk, which I found in Meta-Programming in Logic Programming, MIT Press
This paper discusses the requirements for a programming methodology often used in Artificial Intelligence (AI) which we call Multi-layered Meta-programming. In this methodology, rather than implementing a solution in some base language, layers of high level languages are rapidly designed and implemented through experimental AI programming.
...
A programming methodology often used in AI, as well as in other fields, is stratified or layered design. When faced with a problem, rather than implementing a solution directly in some base language, a new, more appropriate, higher level language is designed in which to express a solution, and this is implemented in the base language. This methodology can have several iterations yielding as a final solution a hierarchy of different languages each more appropriate to expressing the solution than the language below.
They also go on to note that this is not a new thing.

I do it when I want new control structures: new data types = new library, new control structure = new language. I should add that I use languages that make this sort of thing easy. I wouldn't do it much in C though others have: lex, yacc, etc.

Re: Self hosting OS/Compiler from scratch

Posted: Wed Jan 22, 2014 11:07 am
by Brendan
Hi,
Kevin wrote:
Brendan wrote:Now let's consider a reasonably large application, like Bochs. That's a little more than 4 MiB of (compressed) source code and probably takes around 10 minutes to compile on my machine; but my machine is slightly faster than an old 80486. If we assume the user has half the CPU power that I do (e.g. a quad-core i7 rather than a pair of quad-core Xeons) we could assume it might take 20 minutes (or 1200 seconds). I want that cut down to about 4 seconds, or about 600 times faster than GCC. Sounds easy enough!
You're making up numbers out of thin air in your whole posting.
Yes, but who cares? I'm only describing an idea. I shouldn't need accurate numbers for that.
Kevin wrote:This one is easy enough to try out:

Code: Select all

$ time make -j4
...
real    0m19.122s
user    1m4.133s
sys     0m4.471s
Your factors of 10 can't be tried out because both your tools and the Bochs source in the modified language are missing. I suspect they aren't much more accurate than this one.
I'm not too sure what you're trying to say here. If you're suggesting that many separate translation units compiled in parallel is faster (on multi-CPU) then you'd be wrong - you can use any number of ways to exploit parallelism, and split to work in different ways at different stages (e.g. do "functions" in parallel in early stages, then do "output file sections" in parallel in late stages).
Kevin wrote:
That way the end-user's compiler doesn't just get a very simple language that it can compile very quickly, it gets a very simple language where half the work is already done. With half the work already done, something like Bochs wouldn't just be compiled in 12 seconds - it'd be down to 6 seconds.
True. But now the end user doesn't have source code any more, but an intermediate representation. Again, you haven't got rid of complexity, you just moved it to somewhere else. (Which is fine, compiling source code shouldn't be the job of the end user. There are reasons why all widespread OSes and applications come in binary form and source is only additional, if available at all.)
Yes, you're right. Often the best solution to a problem is to change the problem. ;)
Kevin wrote:The only problem is that now you shouldn't be comparing to a real compile process any more, but rather to the startup of a Java application. You can, of course, keep comparing to a full compile, and your solution will look great, but that's self-deceiving.
Java uses JIT, which is quite different to AOT.

Self-deception is for amateurs - if you want to be a good OS developer you need to aim higher. You need to deceive applications (Cool, my process has 3 GiB to play with!) and deceive users (Cool, a single CPU can run multiple programs at the same time!). In this case the intent is to deceive users and make them think they've downloaded/installed an executable (when they haven't) and also make them think that the same executable they download can run "as is" on very different types of CPUs. Of course we can also deceive programmers (e.g. make them think they're typing text when they're actually modifying what a traditional compiler's initial pass/es would have produced).
Kevin wrote:
For something like Bochs I'd be wanting lots of processes (one for each emulated device, and one for the virtual CPU/s and RAM). When something is split into several processes, those processes can be run on different computers, but those executable files can also be compiled on different computers in parallel. That doesn't help when there's only one computer though.
This is the final step that is still missing to slow down Bochs to a level that it runs backwards.
If you implemented it, the overhead of IPC would probably be greater than the gains from exploiting more parallelism, and you'd probably right (it'd make things slower). Of course that only applies if you implemented it. If I implemented it (e.g. a micro-kernel designed specifically for asynchronous message passing, where "asynchronous messages between processes" is used to emulate the hardware's "asynchronous messages across a PCI bus") the performance would improve. 8)
Kevin wrote:
The other thing is that if you're splitting applications into multiple processes, you notice that some can be used by multiple applications. A spell checker is a good example - there's no real reason you can't have one spell checker "service" that's used by many different applications (at that same time). You could have 3 different emulators all sharing the same "virtual PCI device" executables. Mostly anything that does a reasonable amount of processing (enough to justify the IPC costs) can be ripped out and implemented as a service.
That's a great idea. Mainstream OSes call it shared libraries. (You can, of course, use other protocols and use it over the network if you like to slow it down. The OS could even make it transparent, except for the timing.)
A car is exactly the same as a boat (if you ignore significant differences and slap after-market floatation devices onto the car, in a desperate attempt to belittle the advantages that boats have when used for the purpose they're designed for) except that boats are really bad at travelling across land (which is something boats were never intended to be used for in the first place).


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Thu Jan 23, 2014 3:46 am
by brunexgeek
Joshw wrote:For a long time I've always wanted to achieve the lofty goal of creating a completely self-hosting operating system and compiler from scratch. No external libraries would be used or ported.
Brendan wrote:I'm mostly attempting to do the same - design my language, build initial tools, write an OS in my language, then write native tools for that OS.
I have a similar project idea. In fact, the initial idea was to create a translator from a subset of Java to C. Thus, it would be possible to generate C code from Java programs/libraries and compile them with GCC or CLang (mature compilers with a lot of optimizations I don't need to implement myself). That C code is just a "intermediate language" (the programmer does not need know about it). Another "feature" I though is include some keyword in the Java subset to export static methods as C functions, allowing programs in other languages to use the "Java" library too (using the "C data types").

Thus I could implement a whole operating system in Java and still allow programmers to create programs in other languages.

Re: Self hosting OS/Compiler from scratch

Posted: Thu Jan 23, 2014 4:28 am
by HoTT
Thus I could implement a whole operating system in Java and still allow programmers to create programs in other languages.
What do you expect to gain from this? Keep in mind: Java has no value types except for primitives, it's semantic relies on a garbage collector, so no way around it, it heap allocates everything, so you'll need a sophisticated optimizer _before_ generating c-code, last time I looked the layout of java classes was not specified etc ..

Re: Self hosting OS/Compiler from scratch

Posted: Thu Jan 23, 2014 5:49 am
by Kevin
brunexgeek wrote:I have a similar project idea. In fact, the initial idea was to create a translator from a subset of Java to C. Thus, it would be possible to generate C code from Java programs/libraries and compile them with GCC or CLang (mature compilers with a lot of optimizations I don't need to implement myself). That C code is just a "intermediate language" (the programmer does not need know about it).
Did you consider using gcj? Then you don't have the unnecessary step with C in the middle, but simply another GCC backend.
Another "feature" I though is include some keyword in the Java subset to export static methods as C functions, allowing programs in other languages to use the "Java" library too (using the "C data types").
Interfacing between C and Java is not completely straightforward with gcj because Java can't use C types, but you can declare things as extern "Java" and use them in C(++) code.

Re: Self hosting OS/Compiler from scratch

Posted: Thu Jan 23, 2014 11:40 am
by Brendan
Hi,
skeen wrote:
Brendan wrote:By forgetting about "standard practice" it'd be easy to slash compile times by a factor of 10. Instead of compiling something like Bochs in 1200 seconds that brings it down to 120 seconds. Still far too slow, but a little closer.
As I started before, I don't mind my compile times being long, if the incremental compile time is low (and thereby the debug cycle time). Having a low incremental compile time, can be achieved for instance by compiling small potions of the code at a time. This may be troublesome, whenever you need to recompile the hole program, however this is something you should only do rarely, and whenever you have a break off, ect.
As an end-user that uses Gentoo (where everything is compiled from source from scratch when installed or updated), I do mind long compile times. What I like is knowing that everything has been optimised specifically for my CPUs (and not just optimised for "generic 64-bit 80x86"), but for the way its been done the "compile times" disadvantage completely and utterly obliterates any "optimised native code" advantage.
skeen wrote:That being said; we can easily agree, that compilers and some languages are indeed designed for how the world was back in 1970. However providing a nice type system, and static analysis is indeed something that was not entirely possible back in 1970, and is one of the main areas to look out for in compilers in the years to come. As this is where bad programmers like me gets help and guidance, I'm merely a human being, and hence I do write buggy code, so whenever there's a tool to tell me I did something wrong; I welcome it.

Clang is currently working on a list of helpful tools, called sanitizers, which do a lot of static and compiler enabled runtime analysis of the code. Their goal is to catch 80% of bugs with static analysis and the rest via runtime analysis.

Chandler Carruth did a presentation about this, at Going Native last year. If you guys haven't checked it out, consider doing so, the talk can be found at;
http://channel9.msdn.com/Events/GoingNa ... -s-Dragons
When I watch videos like this, it's like I have a set of checkboxes in my brain. For each of the problems they mention I figure out if it's something that could effect people writing software for my OS using my tools and my language.

For the syntax reformatting, my IDE works on tokens. You modify the source code and it modifies the tokens as soon as your cursor leaves that line of "text", then the IDE updates the screen by looking at the tokens and deciding how to format/display it. There is no whitespace or anything in the tokens. If a line is too long and needs to be split then the line is split (but the line numbers don't change - line number 123 can be displayed as 4 lines of text on the screen and the next line on the screen is still line 124). If you resize the IDE's window, everything you see is being reformated to suit the size of the window while the window is being resized (the "80 column rule" is for suckers). Of course I don't need to worry about tools to update source code from an old version of the language to a new version of the language either.

I've written code reformatting software before, for both C and assembly (specifically, trying to convert "unformatted" source code into perfectly formatted HTML). I know it's hard. There's a lot of annoying corner-cases that you don't really think about until you try it. My language has (almost?) none of those annoying corner-cases - there's no macros, there's no '\' line continuation, there's no escape codes in string literals, there's no multi-line comments, etc.

For static analysis, there's almost everything in that video is either not possible in my language or not allowed by my language. Most of the undefined or implementation defined behaviour simply doesn't exist (e.g. there are no uninitialised local variables, the behaviour of "signed shift right" is defined, etc). Then there's things that the compiler must check - overflows aren't permitted, precision loss isn't permitted (unless a variable is explicitly marked with "approx" to indicate it's an approximation), it's almost impossible to get operator precedence wrong (e.g. "1 & 2 | 3" causes a compile time error about inadequate parenthese), you can't mix boolean and numerical expressions (e.g. a typo like "(1 && 2) | 3" is a compile time error), there's special "rigid structures" (with strict guarantees about the layout in memory) to make serialisation/deserialisation (which is something lots of people get wrong) very simple, etc.

For thread safety, for applications and device drivers, I'm mostly using a "shared nothing" system of communicating threads. Because it's "shared nothing" there's no need for locks or anything, and there's nothing to get wrong in the first place. This also means that programmers must get used to doing everything with an event driven programming model (but it'll be worth it!).

That leaves dodgy memory accesses. A (small) number of these are "array bounds" problems and are prevented by the same logic that prevents overflows. Essentially, an "array" (actually called a bounded pointer for my language) is an integer that has a specific range of valid values/addresses, and (just like normal variables) isn't allowed to be out of range. Basically, arrays are 100% safe (note: there are a few other restrictions involved to ensure arrays are safe, and these other restrictions effect the way people do things like memory management, but it works out fine in the end). Note: Strictly speaking, I have no arrays and only have "bounded pointers" and "unbounded pointers". This is to help avoid a common cause of confusion that beginners have in languages like C.

That only really leaves 2 major problem areas: (unbounded) pointers, and "malloc-style" memory management. I haven't been able to find a way to avoid these problems (at least, none that I'd be willing to tolerate for things like kernel and device driver development). The only options I can think of are to use dynamic analysis (e.g. a tool like valgrind) or to have some sort of safe language (possibly something very similar; but with pointers and assembly language removed, and "references" and garbage collection added). In practice, both of these solutions are "I can ignore it for now and worry about it later".

skeen wrote:Most compilers have intermediate representation languages, and from what I'm hearing it sounds like you're almost designing such one. Clang provides a 'portable' low-level-code target language (namely the LLVM language), which can at execution time be compiled to native assembly, or run on a virtual machine. I'm curious as to how your approach is different from this.
The initial plan was for the IDE to create "source code in tokenised form", and for the first compiler to produce a type of three address code as an intermediate representation. That idea died after an initial prototype.

What I found was that "source code in tokenised form" had better density (smaller downloads for end users); and that by reusing a subset of the same set of tokens I was using for source code as an intermediate language:
  • a significant amount of tokens can pass through the first compiler unmodified
  • it made debugging the compiler easier (the same "dump tokens as text to a file" code can be used to inspect things between any passes)
  • tokens seems just as easy for doing sanity checking and high level optimisations
  • it's relatively easy/fast for the second compiler to convert it to any form it likes (e.g. SSA) for its low level optimisations anyway
Of course as far as I'm concerned this is all still research - it might turn out bad and it might turn out awesome (but I just discard/redesign things that turn out bad - it's a brute force approach that guarantees everything will be awesome eventually ;) ).
skeen wrote:I love the idea of splitting a program into services or tools, and sharing these between several programs, and I'm indeed doing so with my projects, and just to quote the Doug McIlroy;
"Write programs that do one thing and do it well. Write programs to work together."

As for the lazy compilation of units, I kinda like that, at least if it doesn't have result in unpredictable hang times, which I guess can be solved by doing lambda closure on the callable functions and compiling just-before-time.
For a distributed system you have to take into consideration things like networking failures. This means that a process that uses a service has to be able to deal with the service failing at any time while its running. From the perspective of a process using a service; there's little practical difference between a service that fails immediately after it starts, a service that compiles but fails to start, and a service that fails to compile.
skeen wrote:Also for your project, are you considering have a compiler service running, or?
Both compilers (source->IR and IR->native) would be implemented as file format converters; where a file format converter is type of service that is used by the virtual file system layer when a file of one type is opened as a file of a different type.

Of course the VFS would cache the results of file format conversions in the file system, and be able to create chains of multiple file format converters. This means there's a search pattern involved - e.g. try to find the minimum number of converters that can be used to convert the original file's format (or an existing cached result of a previous file format conversion) into the requested/target file format. For example, if the OS's process loader asks VFS to open "myProject.src" as a native executable, then VFS might find a cached version and do no conversion, or might find a cached IR version and do "IR -> native", or might find nothing cached and try to find a way to do "src -> ? -> native" (where '?' could be any file format) and end up doing "src -> IR -> native".

skeen wrote:
Brendan wrote:Matrix multiplication is not commutative (except for a few special cases).
You're right, and I should stop answering these forums threads at 2am.

However this just means, that because mathematicians haven't been consistent with the commutativity of multiplication, then you can't assume that multiplication is commutative whenever you see it. And you'll have to lookup what the types are. And I guess that you'll have to do the same thing with 'matrix_mul', assuming you're supporting function overloading.
Yes, you can't assume multiplication is commutative whenever you see it (for any language stupid enough to allow operator overloading) - you can't even assume that '*' has anything to do with multiplication of any kind (in the same way that you can't assume that "<<" is a left shift and not potential file IO).

Function overloading is identical to operator overloading for all practical purposes - the only real difference is syntax (prefix notation vs. infix notation). For example, the function "foo(1,0)" is the same as the operator "(1 foo 0)".

skeen wrote:
Brendan wrote:Why are you a programmer at all? Surely it's easier to use software that someone else wrote for you; so you don't have to maintain, improve or debug anything!
I agree, it would be easier to get someone else to write my software for me! - So uhm, will you work for free? - Because I doubt all the software of my liking will be created already.

Also, I like to think of my self as the factory worker a little up the conveyor, namely where the product is being assembled, rather than down the conveyor, where the pieces are being made. That is, if somebody else has already created the piece I need, then I'll use it, rather than spend time recreating it. So in short; I put pieces together, that's my thing.
I like to see myself as being responsible for the quality of the final product, which requires being able to move to higher or lower levels quickly. I don't want to drop down into an over-engineered mess and spend hours struggling to get out.
skeen wrote:
Brendan wrote:If all you do is replace your payload data type with T, then you've probably ended up with a template that works with some data types (e.g. integers) and fails with others (e.g. strings); and while you may think that was easy some other sucker that tries to use it only to find it doesn't work probably won't think it's so easy.
Why would it break with strings? - The idea of a container is mainly that you store the data, and is able to retrieve it later. Whether I hand you back an int, or a string doesn't matter a lot.
Why wouldn't it break with strings? For integers, something like "if(400 < 100)" makes sense. For strings, something like "if( "400 BC" < 100)" has several different (incompatible) possibilities for what is/isn't considered correct behaviour (is the string less than 100 characters, is the year 400 BC before the year 100 AD, does the pointer point to a virtual address that's lower than 100, etc).
skeen wrote:
Brendan wrote:Hiding complexity just makes finding the complexity more complex! You don't want to hide it, you want to shift it to an obvious place so that it's out of the way but not hidden. A single '*' character doesn't tell you anything is different or where the operator was overloaded.

Note: For my language, there's no pre-processor and no macros either. For a tiny piece of code like "a * b" (and without any other information whatsoever) you know it's definitely normal multiplication, and (if it compiles without error) you also know that both 'a' and 'b' use a numerical type (integer or floating point) and neither can be a pointer, boolean, structure, etc.
I agree, that having the complexity be visible is good, assuming you can deal with the entire complexity of the system, however in large scale software, you simply cannot deal with the complexity of the entire system, and hence you are forced to hide it. This makes it more complex finding the complexity, true, but why would you want to even deal with it, when you have an interface?
For large scale software, you simply cannot deal with the complexity of the entire system because people keep trying to hide things (instead of merely shifting them where they're easily found if you need to find them) and because the things they use to hide complexity increase complexity further. Why would you want to add the hassle of interfaces on top of an already complex mess?
skeen wrote:If the interface isn't fulfilled then you file a ticket, easy as that; you don't have to deal with the implementation, just the interface! The implementation is somebody else's problem.
Sure, everything that makes anything more complex is always someone else's problem...
skeen wrote:As for the multiplication operation, while it doesn't tell you where the operator was overloaded, neither does a function name, and the lookup (for both) can easily be done, via automated tools, to quote yourself; "We have computers with more RAM now", so there's not really a reason not to do so.
For C, I always have a "where" prefix on everything that can be used by code in another file. For example, I might have a file called "foo.c" with a function "FOO_bar()". It's not ideal, but it's effective.

For my language there's "units" instead of files, and to access anything in another unit you need to use the unit name. For example, I might have a unit called "foo" that contains the function "bar()", and to use that function from another unit you'd be forced to write "foo.bar()" (and if you only use "bar()" then you get a compile-time error, unless there happens to be a function in the same unit as you that's also called "bar()").
skeen wrote:And the argument that maybe the language is just soo complex that you need tools to deal with it; well you need an assembler to deal with assembly ;)
For understanding assembly, syntax highlighting is nice (but entirely optional). There is nothing else that I'm aware of (e.g. no complex IDE with all sorts of features) because nobody ever really needed anything else. How many people would work on a large project written in C++ with nothing more than syntax highlighting?


Cheers,

Brendan