Page 8 of 8

Re: Can you run your apps on each other's operating systems?

Posted: Tue Nov 16, 2010 11:24 pm
by DavidCooper
NickJohnson wrote:Ack! You continue to speak vaguely and naively. I suggest you do some experimenting and find how well you are grounded in reality.
How's it vague? Programmers think in "thought" (not quite language, but very similar), and design their programs in that form. They then translate their programs, often bit by bit as they think them up, into some other form (such as C++, or machine code in my case). Likewise, A.I. isn't going to think in any programming language or machine code, but rather in "thought". Programs are designed in thought, then translated into some form associated with computers, but there's no reason to imagine that an intermediate step between thought and machine code is going to be the best way to do things. It quite simply isn't. The only compiler A.I. is going to need is its own intelligence and knowledge, which it will use to convert programs from thought directly into machine code, and wherever it lacks knowledge, it will do tests until it has the required knowledge. I will try to do the same, but can't be sure that I will succeed on my own in the time available, given that most of my effort has to be focused on developing A.I. Code optimisation is a side issue which I simply don't need to worry about in the long run, though I will certainly do some work on it along the way for the fun of it and because it's interesting, and maybe it'll surprise everyone by being as easy as I hope it might be. Your programming languages and compilers are a nightmare of unnecessary complexity, and I have absolutely no desire to follow you down that route.

Re: Can you run your apps on each other's operating systems?

Posted: Tue Nov 16, 2010 11:35 pm
by pcmattman
I'd like to know how much profiling you've done before you jump into "optimising" your "code".

For reference, the (general, and generalised) standard routine for optimisation is as follows. This applies whether you're writing in an interpreted language, a compiled language, or pure machine code.
  1. Compile with your choice of optimisation flags.
  2. Profile to determine slow areas of the program.
  3. Perform optimisation on these areas. This could be anything from switching to hand-written assembly to rethinking the algorithm itself - eg, wrong sort algorithm, or switching to a binary tree instead of a linear list.
  4. Repeat until profiling shows adequate performance in your most-traversed code paths.
We got someone to handwrite our memcpy() function in (mostly) assembly after profiling showed it was a severe bottleneck. Other functions haven't been given the same treatment because GCC's -O3 generated code is easily fast enough for our purposes and they have not yet proven to be a bottleneck: the average -O3 output from GCC is fast enough for most code*.

That phrase "fast enough" is the key point of my post. One of the hardest things to do is to look at an algorithm or piece of code and say that it's fast enough. "It could be faster," we say, "but it's fast enough for now. We have bigger problems to solve first."

This is because for us it's not about squeezing every little bit of performance out of the system (once you do any form of I/O or perhaps get hit by an IRQ a lot of your work is trashed anyway) but about being able to maintain several hundred thousand lines of code without going insane.

Without profiling or even a point of comparison for your code (in your case, an equivalent algorithm, output from a compiler that has been told to optimise heavily), you will always be in the dark when attempting to optimise. How are you supposed to know what needs work and what is in fact "fast enough" to run on the average CPU?

Also,
Programmers think in "thought" (not quite language, but very similar), and design their programs in that form.
Correct. However, at this point in time I only have around 15-30 files worth of program in my head to quickly refer to when writing code. Anything out of the other two thousand or so source and header files really needs to come from a quick lookup of the source code. Your idea works great for small projects, but think about what happens when the project gets big.

One of our output binaries is 750 KB - you can't possibly expect to maintain 750,000 bytes of machine code instructions (realistically).

*I should note, for those around who love technicalities, that GCC's -O3 generation isn't always sensible. I don't want to sound like I'm promoting GCC as the "be all and end all" solution. It's not always readable (not that that matters), for some input it can underperform compared to hand-written assembly (note "some input" - there are some constructs programmers can write that will result in GCC generating awful output), and every now and then GCC emits buggy code - I haven't seen this behaviour for a very long time though, and usually the buggy emission comes from buggy input anyway.

Re: Can you run your apps on each other's operating systems?

Posted: Wed Nov 17, 2010 12:28 am
by gerryg400
DavidCooper wrote:
NickJohnson wrote:Ack! You continue to speak vaguely and naively. I suggest you do some experimenting and find how well you are grounded in reality.
How's it vague? Programmers think in "thought" (not quite language, but very similar), and design their programs in that form. They then translate their programs, often bit by bit as they think them up, into some other form (such as C++, or machine code in my case). Likewise, A.I. isn't going to think in any programming language or machine code, but rather in "thought". Programs are designed in thought, then translated into some form associated with computers, but there's no reason to imagine that an intermediate step between thought and machine code is going to be the best way to do things. It quite simply isn't. The only compiler A.I. is going to need is its own intelligence and knowledge, which it will use to convert programs from thought directly into machine code, and wherever it lacks knowledge, it will do tests until it has the required knowledge. I will try to do the same, but can't be sure that I will succeed on my own in the time available, given that most of my effort has to be focused on developing A.I. Code optimisation is a side issue which I simply don't need to worry about in the long run, though I will certainly do some work on it along the way for the fun of it and because it's interesting, and maybe it'll surprise everyone by being as easy as I hope it might be. Your programming languages and compilers are a nightmare of unnecessary complexity, and I have absolutely no desire to follow you down that route.
Not true. Programs are not designed in thought. They are designed on white-boards and on paper using well-developed design techniques. These are different depending on the task. Paper designs are very important because the programmers use these to review each other's designs and to record their designs for others to see and maintain.

Furthermore, programs are not developed bit-by-bit. They are designed from requirements and developed in top-down fashion. The (well, one of the) intermediate step (HLL) between thought and machine code that you say is unimportant is actually crucial to development. Code must be reviewed. To be reviewed it must exist in a human-readable form. It also must be unit tested. I wonder whether you have ever designed a set of unit tests for a program written in machine code. It's difficult enough to unit test assembly language.

Your programming system may be suitable for programming small systems but I wonder whether you really understand the complexity of designing and building a large system.

What about some evidence, any evidence at all, that any of the things you tell us is true or even theoretically feasible?

Re: Can you run your apps on each other's operating systems?

Posted: Wed Nov 17, 2010 2:13 am
by Combuster
gerryg400 wrote:What about some evidence, any evidence at all, that any of the things you tell us is true or even theoretically feasible?
DavidCooper wrote:I've made no secret of the fact that I've only just started learning to program in C++ and that I've no experience in programming in assembler or any other serious programming language
I think that answers your question.

@DavidCooper:
We all like to dream, and some people have big fantasies, but if you want to realize things you will at some point need to know some facts and insights. You won't be getting them from a 9-page discussion, but through countless hours of practice and research.

I think this is a good moment to stop this discussion as the people here can keep it going for ages, but for as long as you don't get past the basics we will keep misunderstanding each other. A lot to learn, you have. But if you don't practice you won't be getting anywhere.

Re: Can you run your apps on each other's operating systems?

Posted: Wed Nov 17, 2010 5:21 pm
by DavidCooper
Hi pcmattman,
pcmattman wrote:I'd like to know how much profiling you've done before you jump into "optimising" your "code".
I plan every program in detail before writing a byte of code, making sure that the algorithms used in it are the fastest I can find and that the data formats are optimised for rapid access within the constraints of any need to compress the content. Before I started on my linguistics module I spent over a year working on data formats alone because that was crucial to the design - the machine will fill up with data very quickly when it starts reading through the Web, so the more that can fit in memory for rapid reading and the more that can fit on the hard disk(s), the better it will function and the fewer machines will be required to handle the volume.
For reference, the (general, and generalised) standard routine for optimisation is as follows. This applies whether you're writing in an interpreted language, a compiled language, or pure machine code.
  1. Compile with your choice of optimisation flags.
How do you do that with pure machine code? There aren't any optimisation flags. The rest are correct though.
... the average -O3 output from GCC is fast enough for most code*.

That phrase "fast enough" is the key point of my post.
I agree that the output of any good compiler is going to be fast enough most of the time for whatever it has to do. The argument is about a tiny performance edge which could be gained for free by using an alternative programming method which borrows from the automated code-optimisation of compilers while programming directly in machine code, and in the long run that is exactly how A.I. will do its programming (although for Web use it will doubtless also program directly in JavaScript, Flash and whatever else may be of value in that environment). It will also be capable of programming in C++ or with any other compiler, but the best performance will come from the programs it writes directly in machine code (or rather translates directly into machine code from thought), and that's inevitable given that the compiler puts restrictions on what can be done in its machine code output.
This is because for us it's not about squeezing every little bit of performance out of the system (once you do any form of I/O or perhaps get hit by an IRQ a lot of your work is trashed anyway) but about being able to maintain several hundred thousand lines of code without going insane.
The A.I. won't go insane or even be troubled by programs of any size. In any case, if a large system is designed well, it's just a set of many virtually independent programs which interact reliably in simple ways with each other and which a human should be able to keep on top of.
Without profiling or even a point of comparison for your code (in your case, an equivalent algorithm, output from a compiler that has been told to optimise heavily), you will always be in the dark when attempting to optimise. How are you supposed to know what needs work and what is in fact "fast enough" to run on the average CPU?
When I'm not sure about the best algorithm for something, I often write that part of the program in more than one way just so that I can compare them for speed. I don't think there's any shortcut to doing that. I can also look at the speed of a piece of code and unroll lots of loops if necessary to speed it up at the cost of longer code. All the stuff I've written could be regarded as "fast enough" on any processor, but the faster it runs, the better other programs can run in the background. It's a compromise, but whether it's a compromise or you're going for out-and-out speed at the cost of all else, the same original algorithm will run either as fast or faster if it's written in machine code and speed optimised than if it's compiled and speed optimised, provided that the speed optimisation is equally good at its job for both systems, which in principle it realistically could be. The speed difference may be tiny in most cases and therefore of very little importance, but there will be some occasions where a tiny speed difference is crucial for something.
Programmers think in "thought" (not quite language, but very similar), and design their programs in that form.
Correct. However, at this point in time I only have around 15-30 files worth of program in my head to quickly refer to when writing code. Anything out of the other two thousand or so source and header files really needs to come from a quick lookup of the source code. Your idea works great for small projects, but think about what happens when the project gets big.
I write everything up in colour-coded text files so that I can easily check anything I've written in the past. The real program is the algorithm described in normal language, but along with it is the implementation of it in machine code. A.I. will be able to look up anything it's done in the past too, though it should also be able to work out the best solution from scratch every time if it has to - and maybe it should, as it isn't necessarily going to take any longer.
One of our output binaries is 750 KB - you can't possibly expect to maintain 750,000 bytes of machine code instructions (realistically).
Why not? Is that program a mess or is it well structured? All my code in total doesn't come to half of that, but it's still a lot and it all ties in together as a single system which is really easy to work with. If you indexed your machine code and used a machine editor I think you'd be surprised at just how easy it is to handle.




Hi Gerry,
gerryg400 wrote:Not true. Programs are not designed in thought. They are designed on white-boards and on paper using well-developed design techniques. These are different depending on the task. Paper designs are very important because the programmers use these to review each other's designs and to record their designs for others to see and maintain.
Are you telling me that the programs are written by the white-boards and paper rather than by programmers doing it in their heads? Writing stuff out helps you to organise your thoughts and to share them clearly with other people, and although I don't work with anyone else, I do write a lot of stuff out and draw diagrams to help work out how to handle the complexity. I compartmentalise things into individual tasks and try to write them in such a way as to make them universal DLL routines so that they can be used by future programs as well, if it's sensible to do so. This means that once something has been written to run on my OS, any other program can tap directly into that same code (which has to be re-entrant), the idea being to eliminate duplicate functionality on the machine and potentially to allow you to keep all your programs loaded in memory and ready for immediate action all the time.
Furthermore, programs are not developed bit-by-bit. They are designed from requirements and developed in top down fashion. The (well, one of the) intermediate step (HLL) between thought and machine code that you say is unimportant is actually crucial to development. Code must be reviewed. To be reviewed it must exist in a human readable form. It also must be unit tested. I wonder whether you have ever designed a set of unit tests for a program written in machine code. It's difficult enough to unit test assembly language.
When I said bit-by-bit, I intended you to understand that as little-by-little rather than 1/8 of a byte by 1/8 of a byte. Yes, programs are designed top down, starting in thought and then being translated into code of some kind. If A.I. is doing the work, there should be no errors to test for and nothing needing to be reviewed. When I do the job, I can review my program in the version written in normal language and redesign parts of it as necessary, but the normal language version is always so detailed that it looks very close to being a programming language anyway: you might call it pseudocode, though it does go into details related to low-level methods. I'm not clear as to what you mean by a unit, but I borrowed a lot of ideas from things I read about modular and object-oriented programming, and I apply them to my own programs in such a way that almost everything is a stand-alone program which can be tested to make sure it works as intended (though I don't usually bother to test them as they normally work fine anyway, and it would be a waste of effort to write a test program every time). When they're all stuck together and have to work as a compound program, as often as not they work straight out of the box. When they don't, it can sometimes take a bit of hunting to find the bug, but I can stick calls to my monitor program into any part of any piece of code to switch it to be interpreted from that point on, switching back to direct running again as soon as it hits a 144, 145, 145, 144. I can usually find the cause of the bug in five to ten minutes, though very rarely it may take an hour or more. The monitor program allows me to check just the points where different parts of the program interact to see if there's a fault there, and that never fails to show up which component contains the bug.
Your programming system may be suitable for programming small systems but I wonder whether you really understand the complexity of designing and building a large system.
If a large system is designed properly, it is nothing more than a set of small systems tied together by another small system. The complexity for the programmer should not keep going up if you stick to that and avoid creating a knotted mess.
What about some evidence, any evidence at all, that any of the things you tell us is true or even theoretically feasible?
If you're prepared to risk running my OS on one of your machines and if you can find a way to get it onto a floppy disk via the Web, it should be possible, though there are a number of untidy bits which I'd like to sort out first. At the moment it only boots from an internal floppy drive because it switches to 32-bit mode in the boot sector and then runs an FDC device driver from the boot sector to load the first module of the OS. It also needs to be tuned to the speed of the processor so that it doesn't damage the FDC chip by polling it too aggressively - I burned out a number of ports on one machine before I realised this was possible. Once the interrupt routines are set up, it's fine on any machine, but I need to do a redesign of the loading system using the BIOS for the first module.




Hi Combuster,
Combuster wrote:@DavidCooper:
We all like to dream, and some people have big fantasies, but if you want to realize things you will at some point need to know some facts and insights. You won't be getting them from a 9-page discussion, but through countless hours of practice and research.
I have sufficient expertise to do what I've told you I'm going to do - I don't need to be an expert in your ways of doing things as well, though there is indeed a lot there that I can learn and borrow from.
I think this is a good moment to stop this discussion as the people here can keep it going for ages, but for as long as you don't get past the basics we will keep misunderstanding each other. A lot to learn, you have. But if you don't practice you won't be getting anywhere.
I've noticed that it's gone on rather a long time, and it's reaching the point of becoming a little obsessive. I think we should all get out of this and concentrate on our work. I can assure you that I am getting somewhere, and I will keep driving in that direction because it is clear to me that it will become the mainstream path in the future. In the meantime, compilers have their place and have undoubted advantages over my system as it stands.




Hi Berkus,
berkus wrote:I personally gave up at the point of 5 bit chars, there goes your speed optimization, down the drain.
That particular example speeded something up which depended on using compressed data. If you were to write code to handle compressed sound or video files, would you just give up on speed optimisation on the basis that the compression and decompression processes slow things down so much that you shouldn't bother? I don't think so: the speed becomes even more important when you're working with compressed data.
Please, David, try it out, keep us updated on your progress and we'll see how well it goes and where...
Don't worry - I'll let you know as soon as each step's ready to be seen. My priority (from the point of view of making it available to you) will be to get my OS safe to run on other people's expensive equipment, because at the moment I only run it on old machines in case it breaks something, and most of the time I've been working on 486s, though I recently acquired an old Pentium laptop with an internal floppy drive so that I can play with the FPU. My menu system is also a mess which needs to be rewritten as it stops all multitasking on the machine whenever it displays a message, but I've designed a replacement universal menu program which calls the OS on every lap (to enable a task switch - I use co-operative multitasking for the main threads of programs) and which is re-entrant such that it can call itself and be called by any number of processors and programs at the same time. I'd like to get that in place before I make my OS available. There's a lot else that needs a rewrite in addition, but my focus is on the linguistics work, so creating a perfect OS up front has never been the priority, though I'm having to think a long way ahead in order to make sure that future additions and changes won't conflict with existing parts of the design. I'm redesigning a lot of things to be able to work in a multiprocessing environment, though I haven't actually got to the point of writing code to fire up a second processor - there's no point at the moment as I don't have a machine with a second processor to try it out in. What I do have, however, is a system which makes machine code programming of large programs easy and fun, while at the same time giving you a great view of the inside of the machine (you can watch stacks growing and shrinking as a program works away concurrently with the machine editor which allows you to watch it, and if it's working with a lot of data for an extended length of time, you can watch the data change as the app runs). 
You can look through the memory of the machine while it's actively doing things and see all the indexing of all the code and variables. You can modify code while it's running, including the interrupt routines if you're careful. It's simply a hacker's paradise (and I assume that everyone here understands the real meaning of the H word). It also has the potential to do everything I have claimed of it in this thread, and it's just a matter of time before all those claims are realised.





The original point of this thread was to discuss a common app format which might have been in place already or which could be created by those who like the idea of app compatibility for a group of hobby OSes - I wanted to discuss this before making any changes to my OS which might have got in the way of that. Perhaps now it should return to that purpose and go into hibernation until someone new arrives with the same idea who might want to wake it up again. As for the off-topic arguments, no one has won or lost here, and a lot of what's been said is simply at cross purposes. If anyone wants to have the last word on anything in the meantime, feel free, but please try to do so in a way that doesn't push for a response from me.

Re: Can you run your apps on each other's operating systems?

Posted: Wed Nov 17, 2010 11:29 pm
by Solar
DavidCooper wrote:If anyone wants to have the last word on anything in the meantime, feel free, but please try to do so in a way that doesn't push for a response from me.
Translated: anybody can try to have the last word, but only if I don't feel like having it myself.

Can we close this?

Re: Can you run your apps on each other's operating systems?

Posted: Thu Nov 18, 2010 12:26 pm
by quok
This thread has played out long enough, and it's painfully obvious to everyone where it's going. I've locked it so we can finally move on to something else.