Page 6 of 9

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 5:47 am
by Brendan
Hi,

Kevin decided to throw an "alien" language at me, to see how easy/hard it is for someone that's never seen the language before to try to read/understand. I was thinking of showing an "alien" language of my own to see how easily people can read that, but got lazy and didn't. Now Combuster thinks I only ever use assembly and I'm uncomfortable with anything else.

So, here's an example of an "alien" language; that does (what I think) Kevin's sorting example does. Please note that there are no generics and no pre-processing. It's 3 separate pieces of code - you can think of them as 3 separate files if you like.

Integers:

Code: Select all

function sortIntegers (u32 @array, range 1 to (ums.max/u32.size) items) (void) {
    auto src
    auto temp

    while(items > 0) {
        src = 1;
        while(src < items) {
           if( @array > @(array + src) ) {
               temp = @array
               @array = @(array + src)
               @(array + src) = temp
           }
           src = src + 1
        }
        array = array + 1
        item = item - 1
    }
}
Strings:

Code: Select all

functiondef stringComparator = (u32 @first, u32 @second) (range -1 to 1)

function sortStrings (u32 @@array, range 1 to (ums.max/(u32 @).size) items, stringComparator comparator) (void) {
    auto src
    auto temp

    while(items > 0) {
        src = 1;
        while(src < items) {
           if( comparator(@array, @(array + src))  < 0 ) {
               temp = @array
               @array = @(array + src)
               @(array + src) = temp
           }
           src = src + 1
        }
        array = array + 1
        item = item - 1
    }
}
Structures:

Code: Select all

typedef TestStruct = {
    u8 a
    u8 b
}

function sortStructures (TestStruct @array, range 1 to (ums.max/TestStruct.size) items) (void) {
    auto src
    TestStruct temp

    while(items > 0) {
        src = 1;
        while(src < items) {
           if( @array.a + @array.b > @(array + src).a + @(array + src).b ) {
               temp.a = @array.a
               temp.b = @array.b
               @array.a = @(array + src).a
               @array.b = @(array + src).b
               @(array + src).a = temp.a
               @(array + src).b = temp.b
           }
           src = src + 1
        }
        array = array + 1
        item = item - 1
    }
}
The nice thing here is that each piece is easy to understand (not splattered everywhere, or buried under excess boilerplate, or complicated by the need to handle many different data types). More importantly, it's easy to modify. For example, you could look at the integer sorting code and decide to do something like this to it:

Code: Select all

multitarget sortIntegers (u32 @array, range 1 to (ums.max/u32.size) items) (void) {

    function sortIntegers (u32 @array, range 1 to (ums.max/u32.size) items) (void) {
        auto src
        auto temp

        while(items > 0) {
            src = 1;
            while(src < items) {
               if( @array > @(array + src) ) {
                   temp = @array
                   @array = @(array + src)
                   @(array + src) = temp
               }
               src = src + 1
            }
            array = array + 1
            item = item - 1
        }
    }

    asmfunction x86_32 sortIntegers (u32 @array in edi, range 1 to (ums.max/u32.size) items in edx) (void) {
    l1:
        mov ecx,edx-1
        mov esi,edi+4
    l2:
        mov eax,[esi]
        cmp eax,[edi]
        jae l3
        mov ebx,[edi]
        mov [edi],eax
        mov [esi],ebx
    l3:
       add esi,4
       loop .l2

       add edi,4
       sub edx,1
       jne l1
    }

}
If you think you can do something like this with generics, you're wrong. If you think assembly isn't that important, then perhaps I should point out that (even for my "simple" language) I'm expecting it to take 20 years before my native compiler is able to match the code quality produced by LLVM and GCC (partly because I need to write a disposable/initial compiler before I can write an OS, before I can write the native compiler, before I can write the IDE that the language is designed for, before I can begin writing code optimisers for the native compiler).

So, generics. They suck, I don't want them, I've never wanted them, and I don't want anyone using my language or writing software for my OS to use them either; and if I really loved generics I probably couldn't afford the development time they'd cost to support anyway.

As an alternative, people can use my language and compiler to write utilities that generate source code. In this case; beginners don't need to learn anything extra; people can easy check the "software generated source code" for debugging, etc; I won't need to spend a few extra years trying to support complex puke; and it's far more powerful than generics in any language ever will be.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 6:32 am
by Combuster
Now Combuster thinks I only ever use assembly and I'm uncomfortable with anything else.
I would think your discomfort has been demonstrated enough by now ;)
Brendan wrote:Integers Strings Structures
So in the end, the real problem is that you prefer fixing bugs in all copies of an algorithm (one for each type) rather than just one function that works for each type. Also, the only thing that makes these functions differ - and where the type intelligence is - is hidden within the bowels of each function. It may look easy for textbook sort, but you can have fun finding it when you don't know the code.

For instance, your sort implementation is horribly slow (those pesky gnomes!). I'm going to be much faster to fix that issue with my generics version compared to the separate version. :wink:

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 6:47 am
by Kevin
Brendan wrote:So, here's an example of an "alien" language; that does (what I think) Kevin's sorting example does. Please note that there are no generics and no pre-processing. It's 3 separate pieces of code - you can think of them as 3 separate files if you like.
Okay. So I guess you want me to tell you what my reactions to this code are.

So the first thing I did was obviously looking at the syntax of the first example and trying to figure it out. The first line took me a minute or two to understand because at first I thought that this "range" thing would specify the array type instead of being the type for the next parameter. I don't know what exactly "ums" is, but it must be some integer type. The rest of the syntax and standard types is easy enough to understand, even though I won't claim that I can write correct code in it.

The second thing I did more or less automatically in order to understand the code was comparing the first function to the second one, and then the second one to the third one. I had to do this because the differences aren't explicit, they are hidden in the copy of the algorithm and so I need to read three copies of the code to be sure that there are no surprises in one of them. If I wanted to switch to quicksort, I would have to implement quick sort three times, with three times as many chances to introduce errors (if only by a typo) and three times as much effort to verify correctness. In my Ada code I had one implementation of the algorithm that could be verified independently of the data types involved, and then only the differing parts are declared separately.

You made me do additional work, and that is the complexity that is in code duplication.
If you think you can do something like this with generics, you're wrong.
Why? Sure, the assembly version won't be generic. Using assembly means giving up almost all abstractions for some additional performance, at the cost of readability. So that part is expected. I can still get rid of the other half of the code if I extend your syntax for generics:

Code: Select all

with generic (type T) functiondef sort (T @array, range 1 to (ums.max/T.size)) (void)

multitarget sortIntegers (u32 @array, range 1 to (ums.max/u32.size) items) (void) {

    function sortIntegers (u32 @array, range 1 to (ums.max/u32.size) items) (void)
        = specialise sort(u32);

    asmfunction x86_32 sortIntegers (u32 @array in edi, range 1 to (ums.max/u32.size) items in edx) (void) {
    l1:
        mov ecx,edx-1
        mov esi,edi+4
    l2:
        mov eax,[esi]
        cmp eax,[edi]
        jae l3
        mov ebx,[edi]
        mov [edi],eax
        mov [esi],ebx
    l3:
       add esi,4
       loop .l2

       add edi,4
       sub edx,1
       jne l1
    }

}

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 7:06 am
by Owen
Funny thing is, I could implement an assembly optimized version in C++ too (Modulo the lack of a standard inline assembly construct, so we will deal with GCC's for now):

Code: Select all

template<>
void sort<int*>(int *first, int *end)
{
    __asm(
        "# Implement introsort here"
        : [first] "+r" (first), [end] "+r" (end));
}
Of course, you probably wouldn't want such a specialized form of std::sort (the compiler will do the job better, and that particular specialization won't work with, say, std::vector iterators, though you could make it do so with a little work)

My sorting code is all centralized in one location (My implementation of std::sort), and my default comparison code is all centralized in appropriate locations (e.g. the definition of std::string)

Of course, C++ lets me do useful things like this as well:

Code: Select all

struct DistComparator 
{
    glm::ivec2 m_cameraPos;

    DistComparator(const glm::ivec2& cameraPos) : m_cameraPos(pos) {}

    inline bool operator()(const Cell *lhs, const Cell *rhs) 
    {
        glm::ivec2 l = (lhs->position() - m_cameraPos);
        glm::ivec2 r = (rhs->position() - m_cameraPos);

        // Square both (* = component wise multiply)
        l *= l; r *= r;

        // Compare the squared lengths

        return l.x + l.y < r.x + r.y;
    }
};

// Slightly later in the code...
std::sort(std::begin(cells), std::end(cells), DistanceComparator(origin));
(That's an actual recreation of something I wrote for a 3D engine. Yes, recomputing the squares each time made performance sense. And the compiler even managed to vectorize it! No, reimplementing intro sort my self, or copying the code, didn't make sense)

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 7:12 am
by HoTT
Given those three sorting routines in both generic and non-generic versions: Now sort in reverse order.

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 7:20 am
by Owen
HoTT wrote:Given those three sorting routines in both generic and non-generic versions: Now sort in reverse order.
Too easy for the C++ template case!

Code: Select all

template<typename Iterator, typename Comparator>
void reverse_sort(const Iterator &begin, const Iterator &end, const Comparator& cmp)
{
    typedef typename Iterator::value_type val;
    std::sort(begin, end, [&](const val& lhs, const val& rhs) -> bool { return !cmp(lhs, rhs); });
}

template<typename Iterator>
void reverse_sort(const Iterator& begin, const Iterator& end)
{
    return reverse_sort(begin, end, std::less<typename Iterator::value_type>());
}

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 7:43 am
by Brendan
Hi,
Kevin wrote:
Brendan wrote:So, here's an example of an "alien" language; that does (what I think) Kevin's sorting example does. Please note that there are no generics and no pre-processing. It's 3 separate pieces of code - you can think of them as 3 separate files if you like.
Okay. So I guess you want me to tell you what my reactions to this code are.

So the first thing I did was obviously looking at the syntax of the first example and trying to figure it out. The first line took me a minute or two to understand because at first I thought that this "range" thing would specify the array type instead of being the type for the next parameter. I don't know what exactly "ums" is, but it must be some integer type. The rest of the syntax and standard types is easy enough to understand, even though I won't claim that I can write correct code in it.
The "ums" (short for "unsigned memory size") is a data type guaranteed to be able to store all usable addresses (e.g. ranging from 0 to about 3 GiB on 32-bit systems). The "ums.max" gets the data type's maximum value, and dividing by the size of some other data type gets you the maximum number of items you could possibly ever have; so "ums.max/u32.size" is the maximum number of unsigned 32-bit integers you could have. Something like "range 1 to (ums.max/u32.size) items" means that the variable ("items") can have a range from 1 to the maximum possible number of 32-bit integers; and if someone tries to call the function with zero items or 4 billion items they get a compile-time error.
Kevin wrote:The second thing I did more or less automatically in order to understand the code was comparing the first function to the second one, and then the second one to the third one. I had to do this because the differences aren't explicit, they are hidden in the copy of the algorithm and so I need to read three copies of the code to be sure that there are no surprises in one of them.
You shouldn't have needed to compare them at all.
Kevin wrote:If I wanted to switch to quicksort, I would have to implement quick sort three times, with three times as many chances to introduce errors (if only by a typo) and three times as much effort to verify correctness. In my Ada code I had one implementation of the algorithm that could be verified independently of the data types involved, and then only the differing parts are declared separately.
For the strings version, almost all the overhead is going to be in the code that compares unicode strings, so you wouldn't bother optimising that at all. Maybe for the structures you figure out it's smarter to do an insertion sort when the array is created (so you don't need to sort it afterwards at all) or use some sort of tree instead of using arrays. For the integers, maybe you've got some huge arrays and need threads that nothing else needed plus some small arrays where the existing code is fine, and you end up with 2 completely different functions for different cases.

Basically; if you're actually optimising at all, then only a fool would use the exact same thing for very different cases, the chance of ending up needing something almost the same for different data types is fairly small, and in the few cases that are left nobody sane cares (it's easier to not bother figuring out which cases are similar enough to worry about and just do what is right for each special case).
Kevin wrote:You made me do additional work, and that is the complexity that is in code duplication.
Normally a programmer might (e.g.) be working on something involving strings and only care about the string sorting. Then 4 days later they might be doing something with integers and only care about integer sorting. A normal programmer doesn't suddenly need to look at all of them at the same time.

Also, you're still failing to understand the difference between "more lines of simple code" and "fewer lines of more complex code". None of it is complex, it's just more lines of simple code (e.g. the only part that was a little tricky for you to understand was the "ums" stuff, which is only because you've never seen the language before).
If you think you can do something like this with generics, you're wrong.
Why? Sure, the assembly version won't be generic. Using assembly means giving up almost all abstractions for some additional performance, at the cost of readability. So that part is expected. I can still get rid of the other half of the code if I extend your syntax for generics:[/quote]

Great. Now a programmer that only cares about integer sorting has to wade through several different files of puke just to figure out where the code they care about went; and they can't easily compare the high level version with their assembly version.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 8:04 am
by Combuster
Brendan wrote:
Kevin wrote:You made me do additional work, and that is the complexity that is in code duplication.
Normally a programmer might (e.g.) be working on something involving strings and only care about the string sorting. Then 4 days later they might be doing something with integers and only care about integer sorting. A normal programmer doesn't suddenly need to look at all of them at the same time.
If there's any problem with the sorting, the obvious solution is to look at the others and see if they have been fixed or suffer from the same problem.

Strictly following your approach description, structure sorting wouldn't have gotten either fix, and consequently would give you bugs later.

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 8:12 am
by Brendan
Hi,
Owen wrote:Funny thing is, I could implement an assembly optimized version in C++ too (Modulo the lack of a standard inline assembly construct, so we will deal with GCC's for now):
Can you have 6 different versions in assembly language (e.g. one for 32-bit 80x86 without SSE, one for 32-bit 80x86 with SSE, a few for 64-bit 80x86, some for ARM, etc); and have all of them grouped next with the high level code so that it's obvious what the intended functionality is? Will the compiler be able to compile the high level version and one or more assembly versions and make sure they both function the same?
Owen wrote:Of course, you probably wouldn't want such a specialized form of std::sort (the compiler will do the job better, and that particular specialization won't work with, say, std::vector iterators, though you could make it do so with a little work)
So now the template that was supposed to avoid code duplication ends up being 3 or more specialisations? Well, that certainly was useful...
Owen wrote:My sorting code is all centralized in one location (My implementation of std::sort), and my default comparison code is all centralized in appropriate locations (e.g. the definition of std::string)
Your integer handling code is spread everywhere (several templates for searching, sorting, etc, all in different places).

If you decide that for strings you're doing a lot of lookups/searching and want to switch to a hash or something instead of an array (and end up wanting a merge sort for the rare cases that you do sort) then guess what, it's all spread everywhere. It's completely idiotic.
Owen wrote:Of course, C++ lets me do useful things like this as well:
Owen wrote:(That's an actual recreation of something I wrote for a 3D engine. Yes, recomputing the squares each time made performance sense. And the compiler even managed to vectorize it! No, reimplementing intro sort my self, or copying the code, didn't make sense)
And how long did it take you to learn C++, write any/all libraries you've used, write your C++ compiler twice from scratch, and implement the optimiser that did the vectorisation?

Did your compiler tell you there's a potential bug on every single line of that code? Hint: See if you can write a list of all the cases that overflows can/will cause incorrect results.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 8:15 am
by Kevin
Brendan wrote:You shouldn't have needed to compare them at all.
Yes, I could have reviewed each of them separately from scratch. It's not going to make your code look better, it will rather take more time.

I review a lot of code, and when you have similar pieces of code, my experience is that it's usually easier to compare against an already reviewed piece of code instead of doing a full review. So I tend to just do this automatically.
For the strings version, almost all the overhead is going to be in the code that compares unicode strings, so you wouldn't bother optimising that at all.
I'm pretty sure that even (or actually, even more so) for a slow string comparison, there is a significant difference between O(n²) and O(n log n) comparisons in the average case.
Maybe for the structures you figure out it's smarter to do an insertion sort when the array is created (so you don't need to sort it afterwards at all) or use some sort of tree instead of using arrays. For the integers, maybe you've got some huge arrays and need threads that nothing else needed plus some small arrays where the existing code is fine, and you end up with 2 completely different functions for different cases.
Yes, for different cases you'll use different data structures, an array here, a list there, a tree or a hash table in another case. But in the end, you hardly ever use something groundbreaking new. This is why a good standard library provides you with types for different data structures, so that you just need to pick the right one for your requirements, based on your high-level understanding of their properties, and don't have to care about the implementation details.

And I'm pretty sure there is more than one program that sorts an array, or that iterates a list, or that needs a red-black tree, each probably with different element types. If you can't make use of the same generic function twice in your Hello World application, that doesn't mean that no other program has a use for it.
Normally a programmer might (e.g.) be working on something involving strings and only care about the string sorting. Then 4 days later they might be doing something with integers and only care about integer sorting. A normal programmer doesn't suddenly need to look at all of them at the same time.
Right, and those four days later I remember that some sort() function already exists and I want to reuse it instead of remembering the details and implementing it a second time.
None of it is complex, it's just more lines of simple code (e.g. the only part that was a little tricky for you to understand was the "ums" stuff, which is only because you've never seen the language before).
What gets complex isn't the individual function. It can hardly be said that there is any difference in the difficulty to read a function that has u32 instead of T. The complex part is managing and maintaining the mess you have created. (Which of course requires that you actually do maintain code instead of throwing it away and starting from scratch. Perhaps that's the reason why you haven't made that experience yet.)
Great. Now a programmer that only cares about integer sorting has to wade through several different files of puke just to figure out where the code they care about went; and they can't easily compare the high level version with their assembly version.
Well, that's not much different from the programmer that only cares about search algorithms and has to wade through several different files of puke just to figure out where all the copies of the bubble sort code have gone that they want to replace with quicksort.

By the way, I'm still waiting for your example on this one:
Care to give an example? Assume we have another type BrendanStruct, how are you going to accidentally pass it to the wrong function or generate a new function, where a compile time error should have occured? Feel free to use non-Ada pseudo code as long as you keep the fundamental mode of operation intact.

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 8:18 am
by Brendan
Hi,
Combuster wrote:
Brendan wrote:
Kevin wrote:You made me do additional work, and that is the complexity that is in code duplication.
Normally a programmer might (e.g.) be working on something involving strings and only care about the string sorting. Then 4 days later they might be doing something with integers and only care about integer sorting. A normal programmer doesn't suddenly need to look at all of them at the same time.
If there's any problem with the sorting, the obvious solution is to look at the others and see if they have been fixed or suffer from the same problem.

Strictly following your approach description, structure sorting wouldn't have gotten either fix, and consequently would give you bugs later.
If there's problems with the sorting, then your unit tests suck. Fix that first.

Of course I assumed the unit tests don't suck, the code does work, and you're either looking to see if it's what you want or optimising it for a certain case.


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 8:39 am
by Owen
Brendan wrote:
Owen wrote:Funny thing is, I could implement an assembly optimized version in C++ too (Modulo the lack of a standard inline assembly construct, so we will deal with GCC's for now):
Can you have 6 different versions in assembly language (e.g. one for 32-bit 80x86 without SSE, one for 32-bit 80x86 with SSE, a few for 64-bit 80x86, some for ARM, etc); and have all of them grouped next with the high level code so that it's obvious what the intended functionality is? Will the compiler be able to compile the high level version and one or more assembly versions and make sure they both function the same?
Yes, you can have a version for each (its' somewhat convoluted, but that's what happens when you're past the edge of the language standard. Of course, somebody could submit a proposal to the C++ standards committee to support this; you might even unify the syntax with that of the C++AMP extentsion which enables running a subset of C++ on the GPU. Even better, somebody could teach LLVM how to compile multiple versions automatically.)

How is the compiler supposed to understand the behavior of "std::sort" without omniscience? If your answer to that is "you write some tests"... well, you can write your tests in C++ as well. And then you could explicitly test those specializations.

The compiler can't automatically test all specializations because specializations are used for additional reasons on top of optimizing for a specific case
Brendan wrote:
Owen wrote:Of course, you probably wouldn't want such a specialized form of std::sort (the compiler will do the job better, and that particular specialization won't work with, say, std::vector iterators, though you could make it do so with a little work)
So now the template that was supposed to avoid code duplication ends up being 3 or more specialisations? Well, that certainly was useful...
Of course, the generic code can be used on an IOStream iterator to sort in place in a file (Maybe you're sorting more data than you have RAM? I agree the performance is liable to be suboptimal, but perhaps you're only ever going to run this code once, so don't really care all that much)
Brendan wrote:
Owen wrote:My sorting code is all centralized in one location (My implementation of std::sort), and my default comparison code is all centralized in appropriate locations (e.g. the definition of std::string)
Your integer handling code is spread everywhere (several templates for searching, sorting, etc, all in different places).
Sorting and Searching are higher order concepts than integers.

Therefore, sorting and searching code should be bunched together, not integer code (Fundamentally all code is integer code, so by your logic everything should be in one file named "integer")
Brendan wrote:If you decide that for strings you're doing a lot of lookups/searching and want to switch to a hash or something instead of an array (and end up wanting a merge sort for the rare cases that you do sort) then guess what, it's all spread everywhere. It's completely idiotic.
What? Now we are sorting hash tables?
Brendan wrote:
Owen wrote:Of course, C++ lets me do useful things like this as well:
Owen wrote:(That's an actual recreation of something I wrote for a 3D engine. Yes, recomputing the squares each time made performance sense. And the compiler even managed to vectorize it! No, reimplementing intro sort my self, or copying the code, didn't make sense)
And how long did it take you to learn C++, write any/all libraries you've used, write your C++ compiler twice from scratch, and implement the optimiser that did the vectorisation?

Did your compiler tell you there's a potential bug on every single line of that code? Hint: See if you can write a list of all the cases that overflows can/will cause incorrect results.
Bogus argument detected! "Implementing this is hard so its worthless to me?"

Actually, your overflow checking would have told me a bunch of lies (And hopefully you'd have a way to switch it off, so I can retain performance).

Cell coordinates fit in the range of a signed 16-bit integer. Subtracting the camera position will reduce that range further (it will be near to the cells being sorted). Multiplying two (effectively) 15-bit integers produces a 30-bit result. Adding those two integers produces a 31-bit result.

Of course, keeping those positions in 32-bit units helps by letting the compiler use padddq, pmuldq and phadd SSE instructions to perform this operation

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 9:38 am
by Brendan
Hi,
Kevin wrote:
For the strings version, almost all the overhead is going to be in the code that compares unicode strings, so you wouldn't bother optimising that at all.
I'm pretty sure that even (or actually, even more so) for a slow string comparison, there is a significant difference between O(n²) and O(n log n) comparisons in the average case.
Except the main cost is normalising the strings, which only happens once per string regardless of how often you compare.
Kevin wrote:
Maybe for the structures you figure out it's smarter to do an insertion sort when the array is created (so you don't need to sort it afterwards at all) or use some sort of tree instead of using arrays. For the integers, maybe you've got some huge arrays and need threads that nothing else needed plus some small arrays where the existing code is fine, and you end up with 2 completely different functions for different cases.
Yes, for different cases you'll use different data structures, an array here, a list there, a tree or a hash table in another case. But in the end, you hardly ever use something groundbreaking new. This is why a good standard library provides you with types for different data structures, so that you just need to pick the right one for your requirements, based on your high-level understanding of their properties, and don't have to care about the implementation details.
Maybe, if you think "programming" means gluing together general purpose slop.
Kevin wrote:And I'm pretty sure there is more than one program that sorts an array, or that iterates a list, or that needs a red-black tree, each probably with different element types. If you can't make use of the same generic function twice in your Hello World application, that doesn't mean that no other program has a use for it.
Who cares? Cut, paste, then customise to suit your specific case just because you can.
Kevin wrote:
Normally a programmer might (e.g.) be working on something involving strings and only care about the string sorting. Then 4 days later they might be doing something with integers and only care about integer sorting. A normal programmer doesn't suddenly need to look at all of them at the same time.
Right, and those four days later I remember that some sort() function already exists and I want to reuse it instead of remembering the details and implementing it a second time.
Who cares? Cut, paste, then customise to suit your specific case just because you can.
Kevin wrote:
None of it is complex, it's just more lines of simple code (e.g. the only part that was a little tricky for you to understand was the "ums" stuff, which is only because you've never seen the language before).
What gets complex isn't the individual function. It can hardly be said that there is any difference in the difficulty to read a function that has u32 instead of T.
You can go and help Owen find all the potential overflow bugs in his code then.
Kevin wrote:The complex part is managing and maintaining the mess you have created. (Which of course requires that you actually do maintain code instead of throwing it away and starting from scratch. Perhaps that's the reason why you haven't made that experience yet.)
I've never had a problem maintaining my code. I only ever really start again from scratch due to fundamental changes in requirements (unless it's a research thing intended to be thrown away from the beginning).
Kevin wrote:
Great. Now a programmer that only cares about integer sorting has to wade through several different files of puke just to figure out where the code they care about went; and they can't easily compare the high level version with their assembly version.
Well, that's not much different from the programmer that only cares about search algorithms and has to wade through several different files of puke just to figure out where all the copies of the bubble sort code have gone that they want to replace with quicksort.
Does that ever happen for any application programmer (or any programmer, excluding people writing libraries)? Typically you're working on a specific area, like adding a "search for keyword" feature, or parsing data from a file, or figuring out some problem with some networking code; and never care about all code for searching all data types (or all code for all lists, or ...) at the same time.
Kevin wrote:By the way, I'm still waiting for your example on this one:
Care to give an example? Assume we have another type BrendanStruct, how are you going to accidentally pass it to the wrong function or generate a new function, where a compile time error should have occured? Feel free to use non-Ada pseudo code as long as you keep the fundamental mode of operation intact.
How is this not obvious?

If you've got a function that accepts "TestStruct" you get a compile-time error if you accidentally try to pass a "BrendanStruct". If you've got a generic/template (or macro) and you pass a "BrendanStruct" you get no compile-time error and the compiler generates a function for that type instead (assuming that the generic/template thing actually works for more than one data type).


Cheers,

Brendan

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 11:08 am
by Jezze
I might be stupid but how are you suppose to check that each parameter to each function is within the desired value range? Each function makes a compare for each parameter before continuing with its body?

Re: Self hosting OS/Compiler from scratch

Posted: Mon Jan 27, 2014 3:29 pm
by Kevin
Brendan wrote:Except the main cost is normalising the strings, which only happens once per string regardless of how often you compare.
Even if you assume that the string aren't normalised yet, I highly doubt that normalisation dominates when you have O(n²) string comparisons.
Maybe, if you think "programming" means gluing together general purpose slop.
There will always be standard components that match the requirements, and other parts where no existing module is available. You're working efficiently when you don't waste your time reinventing the wheel, but use it for innovating and adding new stuff that doesn't yet exist.
Kevin wrote:Well, that's not much different from the programmer that only cares about search algorithms and has to wade through several different files of puke just to figure out where all the copies of the bubble sort code have gone that they want to replace with quicksort.
Does that ever happen for any application programmer (or any programmer, excluding people writing libraries)?
Why don't people working on libraries count? That's real code as well.
Typically you're working on a specific area, like adding a "search for keyword" feature, or parsing data from a file, or figuring out some problem with some networking code; and never care about all code for searching all data types (or all code for all lists, or ...) at the same time.
In practice, when I work on a feature, and I find out that the existing infrastructure isn't quite good enough, I don't add a specific hack for my specific feature, but I try to do it right, so that people working on other areas (perhaps myself a few weeks later) can take advantage of it, too.
If you've got a function that accepts "TestStruct" you get a compile-time error if you accidentally try to pass a "BrendanStruct". If you've got a generic/template (or macro) and you pass a "BrendanStruct" you get no compile-time error and the compiler generates a function for that type instead (assuming that the generic/template thing actually works for more than one data type).
Oh, really? I'm not sure to which of the functions I should pass the BrendanStruct array because it should be obvious that it works neither with the IntSort, nor the StringSort, nor the StructSort, and as you can probably infer from the existence of the other instantiations of the function, not with the generic sort either. I did it anyway for all of them (tell me if you have more creative ways of doing it that I should try out):

Code: Select all

    foo: BrendanArray := ((x => 6), (x => 2), (x => 7), (x => 6));
...
    IntegerSort(foo);
    StringSort(foo);
    StructSort(foo);
    sort(foo);

Code: Select all

test.adb:90:17: expected type "IntArray" defined at line 31
test.adb:90:17: found type "BrendanArray" defined at line 66
test.adb:91:16: expected type "StrArray" defined at line 38
test.adb:91:16: found type "BrendanArray" defined at line 66
test.adb:92:16: expected type "StructArray" defined at line 54
test.adb:92:16: found type "BrendanArray" defined at line 66
test.adb:93:05: must instantiate generic procedure "sort" before call