a programming language "kindof" designed for OS De

Brendan · Post by **Brendan** » Wed Aug 09, 2006 6:18 am

Hi,

Pype.Clicker wrote:that'd do the trick, no ?

That would give the illusion of doing the trick, without actually doing the trick...

I'm looking for a way to return multiple values directly from a function, without wasting time stuffing them in memory (or on the stack) and then retrieving them from memory again later.

In assembly this is simple to do but I've not seen any high level language syntax that allows for it....

Cheers,

Brendan

Solar · Post by **Solar** » Wed Aug 09, 2006 7:00 am

Hmmm...

I didn't follow the whole discussion. But at the top, the talk is about "being like C", and at the bottom we're at PHP-like operator overloading...

Last time I looked, when you had to return from a function more than one value, the solution was to define a structure carrying all the return values, no?

Pype.Clicker · Post by **Pype.Clicker** » Wed Aug 09, 2006 8:30 am


Brendan wrote: I'm looking for a way to return multiple values directly from a function, without wasting time stuffing them in memory (or on the stack) and then retrieving them from memory again later.

In assembly this is simple to do but I've not seen any high level language syntax that allows for it....

well, not in languages that don't have a built-in list type, indeed (Perl does it quite nicely, but okay, this is OT). IIRC, there was at least one C/C++ compiler (Watcom) that allowed parameters to be passed through registers rather than through the stack ... i can't remember whether it would be possible to have an 'output' parameter through a register, though.

I'd be curious to see if gcc can still optimize

Code: Select all

   typedef unsigned int uint;

   static void square(uint x, uint *y) {
       *y = x * x;
   }

   register uint x;
   uint y;            /* C rejects &y if y is declared 'register' */
   square(x, &y);

to have y always kept in a register despite the "address of" operator (which is obviously here to allow "square" to return the value).

And if it cannot, i'd be curious whether C++'s references (rather than pointers) ease that kind of optimization.
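A minimal, compilable sketch of that experiment follows. The caller `square_via_out_param` and the self-check `example_usage` are hypothetical names added for illustration. Whether `y` actually stays in a register depends on the compiler version and optimization level, so the generated assembly (for example from `gcc -O2 -S`) is the only real answer.

```c
#include <assert.h>

/* 'static' keeps the definition visible in this translation unit only,
   which lets the compiler inline the call at its call sites. */
static void square(unsigned x, unsigned *y)
{
    *y = x * x;
}

/* Hypothetical caller, not from the post. */
unsigned square_via_out_param(unsigned x)
{
    unsigned y;        /* plain local: C rejects &y on a 'register' variable */
    square(x, &y);     /* once inlined, y need not have a real address */
    return y;
}

/* Usage sketch. */
void example_usage(void)
{
    assert(square_via_out_param(7) == 49);
    assert(square_via_out_param(0) == 0);
}
```

The out-parameter form and a plain `return x * x;` are equivalent at the source level; whether they are equivalent in the generated code is exactly the open question here.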

Of course, we're limited to "static" functions, defined in the same code chunk as their caller ...

Code: Select all

struct pos {
   unsigned x;
   unsigned y;
};

extern unsigned x,y; 

static struct pos tellwhere(void)
{
   return (struct pos) { x,y };
}

static struct pos further(struct pos where)
{
   return (struct pos) { where.x + 42, where.y -42};
}

compiles into

Code: Select all

  d8:   8b 0d 00 00 00 00       mov    0x0,%ecx
                        da: R_386_32    x
  de:   8b 15 00 00 00 00       mov    0x0,%edx
                        e0: R_386_32    y
   printf("%i",pos.x,pos.y,pos2.x,pos2.y);
  e4:   83 c4 14                add    $0x14,%esp
  e7:   8d 42 d6                lea    0xffffffd6(%edx),%eax
  ea:   50                      push   %eax
  eb:   8d 41 2a                lea    0x2a(%ecx),%eax
  ee:   50                      push   %eax
  ef:   52                      push   %edx
  f0:   51                      push   %ecx
  f1:   68 7c 00 00 00          push   $0x7c
                        f2: R_386_32    .rodata.str1.1
  f6:   e8 fc ff ff ff          call   f7 <main+0x67>

so i wouldn't worry too much about performance penalties, personally ...

Solar · Post by **Solar** » Wed Aug 09, 2006 9:09 am

Brendan wrote: I'm looking for a way to return multiple values directly from a function, without wasting time stuffing them in memory (or on the stack) and then retrieving them from memory again later.

Ah, there was the quote I was missing in this.

For one, I daresay those nifty syntaxes of Perl / PHP do just that, and just hide it behind a "nicer" expression.

As for C / C++ / Java, the trick here is called "return value optimization". Consider:

Code: Select all

struct foo
{
    int foo1;
    int foo2;
};

struct foo bar()
{
    struct foo bar_rc;
    bar_rc.foo1 = 42;
    bar_rc.foo2 = 23;
    return bar_rc;
}

int main()
{
    struct foo rc;
    rc = bar();
    return 0;
}

A sufficiently smart compiler will recognize a pattern like this, and will not create the object bar_rc at all. Instead, the rc from the main() function is used directly. This is actually a very common optimization in today's compilers, I have been told.

Crazed123 · Post by **Crazed123** » Wed Aug 09, 2006 6:49 pm

And those compilers have to perform that optimization, because their languages don't support true multiple return-values.

To segue straight into my own thoughts, why not make an OSdeving language of a Lispier nature? Restrict the set of basic operators usable at runtime (ie: no lambdas or runtime defuns), and use Lisp's lack of syntax. Take out lexical closures, and add in strong typing so that memory allocation can be done C/C++/Object Pascal style. However, there should be an explicit variant type used to gain access to traditional dynamic typing behavior.

All of this would work with only a small memory-management runtime, and I imagine writing a compiler wouldn't be too hard, either.

proxy · Post by **proxy** » Thu Aug 10, 2006 8:56 am

Crazed123 wrote: And those compilers have to perform that optimization, because their languages don't support true multiple return-values.

I'm gonna have to disagree with this one

It's not a matter of "have to"; it's just that, as in almost ALL languages, the literal conversion of lines in the language to machine code is not the most efficient way to do the job. Any language that "truly" supports multiple return values is probably just putting pretty syntax on what C/C++ does anyway; after all, odds are that the compiler for your language was _written_ in C/C++. So here's the thing.

First of all, I'm not quite sure what people are expecting with multiple return values. The way I see it, if the data to be returned is larger than a register, there are 3 options (there are perhaps more, but this is all that comes to mind at the moment):

(note intel specific examples, but concept applies to all cpus generally)

1. Use several registers, not just EAX, for the return value. In the example of a struct with 2 int elements, EDX:EAX would be sufficient; in fact Intel itself does this with a few opcodes that produce 64-bit results (one-operand IMUL, for example). The downside is that it only works for structures that are this small, you are potentially adding more registers to clean up (push/pop them to preserve the original values), and there is a finite number of registers, so this technique only works up to about 24/28 bytes (ESP clearly can't be used; EBP maybe, if it is not used as a frame pointer).

2. The obvious one: return a pointer. OK, well if the struct is bigger than 2 or so registers, odds are it's gonna be in RAM anyway till the end of its scope, so why not just pass a pointer to it? It's already in RAM, a pointer fits in a register, and you'd need to read it from memory anyway, so no huge loss, plus this works for all sizes. There is potentially a little memory cleanup with restoring ESP to its original value, but odds are we had to do that anyway.

3. Return direct to object. This is a nice trick: basically the compiler says, well, I was gonna make this temp object, but I see you are just gonna copy the data and then this temp is gonna go away anyway, so why don't we eliminate the middle man. Once again, this works in the general case.
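The three options can be sketched at the C source level as follows. All type and function names here (`quot_rem`, `divide`, `big`, `fill_big`, `make_big`, `example_usage`) are made up for illustration. Which mechanism the compiler actually picks for each one depends on the target ABI and on compiler flags, so treat the comments as the likely outcome, not a guarantee.

```c
#include <assert.h>

/* Option 1 candidate: two ints are small enough for a register
   (or a register pair) on ABIs that return small structs that way. */
struct quot_rem {
    unsigned quot;
    unsigned rem;
};

/* d must be non-zero. */
struct quot_rem divide(unsigned n, unsigned d)
{
    struct quot_rem r = { n / d, n % d };
    return r;
}

/* Option 2: the caller owns the storage and passes only a pointer. */
struct big {
    unsigned words[16];
};

void fill_big(struct big *out, unsigned seed)
{
    for (unsigned i = 0; i < 16; i++)
        out->words[i] = seed + i;
}

/* Option 3 candidate: written as return-by-value; a compiler may build
   the result directly in the caller's object instead of copying b. */
struct big make_big(unsigned seed)
{
    struct big b;
    fill_big(&b, seed);
    return b;
}

/* Usage sketch. */
void example_usage(void)
{
    struct quot_rem qr = divide(17, 5);
    struct big b = make_big(10);
    assert(qr.quot == 3 && qr.rem == 2);
    assert(b.words[0] == 10 && b.words[15] == 25);
}
```

All three read the same to the caller apart from the explicit pointer in option 2, which supports the point that the syntax says little about what happens at the machine level.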

So my question is this:

In the languages that do "truly support" multiple return values, how do you intend to implement this feature on an assembly level? I mean, odds are you'll have to resort to one of the techniques mentioned above, in which case you are just putting a prettier syntax on top of C/C++'s approach. As mentioned above, #1 is OK for this, but not general purpose, so it's only usable in some cases; would it be your solution anyway? If so, what would you want if the data is bigger than the registers can provide storage for?

proxy

Pype.Clicker · Post by **Pype.Clicker** » Thu Aug 10, 2006 9:26 am

Crazed123 wrote: why not make an OSdeving language of a Lispier nature? Restrict the set of basic operators usable at runtime (ie: no lambdas or runtime defuns), and use Lisp's lack of syntax. Take out lexical closures, and add in strong typing so that memory allocation can be done C/C++/Object Pascal style. However, there should be an explicit variant type used to gain access to traditional dynamic typing behavior.

Sure you could do it, and i guess it has been done already (at least in CS courses by venusian teachers that love both LISP and UNIX). Now the _real_ question is what makes you choose a language when doing OS deving...

Syntax suiting your preferences ? (for how long? for how many people?)
Performance of generated code ?
Code readability and maintainability ?
Compiler reliability on producing correct code ?

It's commonly accepted that you need to master a language/compiler/toolchain before you can actually use it in OS development. There will be no safety belt to catch your faults, after all. It's frustrating enough to count the hours spent in kernel bug hunts just because you misunderstood the command-line option for this or that tool (linker script or partcopy, anyone?). I don't think it'd be comfortable to use a freshly developed language/compiler for OS deving ...

Let's say you take the gamble anyway, write the perfect language for kernel programming and the perfect kernel in that language ... so what.
Who's going to learn your language ? what else will it be good for ?
If people prefer writing their applications in C or in ASM or whatever, is your perfect kernel still so perfect ?

Now, i have to admit i once planned to design my own language for Clicker and finally decided that it wasn't worth the effort as soon as i could have a tool to do "dirty job" and write C code for me when i find C code more boring to write than a PERL script that generates it. That's how my "D" (before i learnt of Digital Mars compiler) language turned into CodeMonkey.

Candy · Post by **Candy** » Thu Aug 10, 2006 2:23 pm

For the multiple-value-pass mechanism I'd either recommend using the parameter passing in the inverse method, or adding an invisible return-parameter-object as last item (with c calling conventions for you pascal/windows people that can't conform). That way it won't have to be copied and if the calling function needs the object locally it can just use the stack-based one as local.

For AMD64-ish ABIs this would fold to two different options, since you could either put it in registers for small function signatures or force it to be on the stack in all cases. Both have a motivation and I think I'd choose the second.
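As a rough illustration of the invisible return-parameter idea, here is Solar's `bar()` next to a hand-written version of what such a lowering could look like. The name `bar_lowered`, the parameter `ret`, and the self-check `example_usage` are hypothetical; a real compiler does this internally and its exact calling convention may differ.

```c
#include <assert.h>

struct foo {
    int foo1;
    int foo2;
};

/* What the programmer writes: return by value. */
struct foo bar(void)
{
    struct foo bar_rc;
    bar_rc.foo1 = 42;
    bar_rc.foo2 = 23;
    return bar_rc;
}

/* Hand-written stand-in for the lowered form: 'ret' plays the role of
   the invisible return object, passed as the last (here: only) argument. */
void bar_lowered(struct foo *ret)
{
    ret->foo1 = 42;
    ret->foo2 = 23;
}

/* Usage sketch: both forms leave the same values in the caller's object. */
void example_usage(void)
{
    struct foo a = bar();
    struct foo b;
    bar_lowered(&b);
    assert(a.foo1 == b.foo1 && a.foo2 == b.foo2);
}
```

In the lowered form the callee writes straight into the caller's stack object, so nothing has to be copied afterwards.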

Kemp · Post by **Kemp** » Thu Aug 10, 2006 2:56 pm

Brendan wrote: but I've not seen any high level language syntax that allows for it....

No-one ever thinks of Python :-[ Returning/assigning multiple values at once is a normal everyday thing in Python.

Crazed123 · Post by **Crazed123** » Thu Aug 10, 2006 10:16 pm

Kemp wrote:
but I've not seen any high level language syntax that allows for it....
No-one ever thinks of Python :-[ Returning/assigning multiple values at once is a normal everyday thing in Python.

Ditto in Common Lisp. CL supports returning multiple values in the sense that a special form must be used to capture all but the first value. Otherwise, the rest get automatically discarded. I, for one, happen to like it that way.

Pype.Clicker wrote: Let's say you take the gamble anyway, write the perfect language for kernel programming and the perfect kernel in that language ... so what.
Who's going to learn your language ? what else will it be good for ?
If people prefer writing their applications in C or in ASM or whatever, is your perfect kernel still so perfect ?

Yes, since any language low-level enough to use well in OSdeving must, by necessity, have some ability to interface to code in other languages, especially C.

Solar · Post by **Solar** » Fri Aug 11, 2006 1:20 am

No offense intended, but neither of you two actually addressed the points proxy and Pype made:

1. Is Python or Common Lisp, on the machine code level, really more efficient at returning multiple values than a C/C++ compiler doing return value optimization?
2. Will your self-made compiler for your self-made language actually beat the average GCC for efficiency of generated code, correctness, and ease of availability (not only the binaries, but documentation, tutorials, help from others)?
3. Will the improvements brought about by your language actually offset the learning time required to understand your language and toolchain?

I know, some of this could be countered with the truism, "if we all thought like that nothing new would ever be made". But I believe in something that, in a company, would be called a "business case": Is the goal really worth the effort, or are you bound to end up with yet another toy language, supported by an unfinished toolchain rotting on a homepage that sees about three hits a month?

A great many people are pretty content with C, binutils/NASM, and make. If you want them to consider your language worthwhile, it must be much better suited to the task for them to go through an all-new language they could use nowhere else. (That last point is what would keep me away from it. I see my hobby projects as much as a chance to improve my business skill portfolio as a source of fun.)

Kemp · Post by **Kemp** » Fri Aug 11, 2006 4:56 am

Actually I wasn't trying to answer that, I was just slow answering the earlier point that there apparently weren't any high level languages that supported returning multiple values. Unless I read the original post wrong of course, and was accidentally trying to answer those points...

Solar · Post by **Solar** » Fri Aug 11, 2006 5:55 am

Ah, OK. But still, proxy nailed it: It doesn't matter much for efficiency what your language makes it look like, it matters what your compiler makes of it, and you can't beat return value optimization for efficiency.

Crazed123 · Post by **Crazed123** » Fri Aug 11, 2006 7:08 pm

Solar wrote: No offense intended, but neither of you two actually addressed the points proxy and Pype made:

1. Is Python or Common Lisp, on the machine code level, really more efficient at returning multiple values than a C/C++ compiler doing return value optimization?
2. Will your self-made compiler for your self-made language actually beat the average GCC for efficiency of generated code, correctness, and ease of availability (not only the binaries, but documentation, tutorials, help from others)?
3. Will the improvements brought about by your language actually offset the learning time required to understand your language and toolchain?

I know, some of this could be countered with the truism, "if we all thought like that nothing new would ever be made". But I believe in something that, in a company, would be called a "business case": Is the goal really worth the effort, or are you bound to end up with yet another toy language, supported by an unfinished toolchain rotting on a homepage that sees about three hits a month?

A great many people are pretty content with C, binutils/NASM, and make. If you want them to consider your language worthwhile, it must be much better suited to the task for them to go through an all-new language they could use nowhere else. (That last point is what would keep me away from it. I see my hobby projects as much as a chance to improve my business skill portfolio as a source of fun.)

1. A Common Lisp compiler can always know, just by looking at the form receiving the return values, whether or not those values are captured. Hence, a good Common Lisp compiler always performs what you call return value optimization: it always detects where those values are going and passes them directly there rather than actually writing out assembler code for a "multiple-value-list" form.
2. Probably not, which is one of the reasons I would write my compiler to compile down to C or Pascal rather than ASM.
3. Have you actually tried Lisp?

If you only want to use languages supported by the business community, go right on ahead using C. I'll run Object Pascal, and we'll see how things work out.

Cheery · Post by **Cheery** » Sat Aug 12, 2006 12:58 am

Actually, from the experience I've had with Lisp-to-C compilers, it is not the best solution at all. It makes the wrong trade-off between performance and usability.

With a Lisp-to-C compiler, errors are a lot harder to handle, because the stupid C compiler and linker are involved, and they are separate from your Lisp stuff.

I prefer the Lisp-straight-to-runnable-code solution.

OSDev.org
