An incorrect information in the Wiki

All about the OSDev Wiki. Discussions about the organization and general structure of articles and how to use the wiki. Request changes here if you don't know how to use the wiki.
Clover5411
Member
Posts: 25
Joined: Sat Jul 27, 2019 9:41 am

An incorrect information in the Wiki

Post by Clover5411 »

Well, I'm not sure if this is the right place for this, but basically there is incorrect information in the wiki. It's on the page "Tail Recursion and Tail Call Optimization" (https://wiki.osdev.org/Tail_Recursion_a ... timization)

The page says that a tail-call is a call that occurs at the end of a function. This is incorrect: a tail-call is when, in assembly, a CALL is immediately followed by RET.

Code: Select all

 unsigned long long Factorial(unsigned start) {
    if (start > 1) {
       return Factorial(start - 1) * start;
    } // if
    return 1;
 } // Factorial(start)
Now, according to the page, there is no tail-call here (which is correct, but for a very different reason; I'll get back to this).

Now the page also writes:

Code: Select all

unsigned long long Factorial(unsigned start) {
    if (start <= 1) {
       return 1;
    } // if
    return start * Factorial(start - 1);
 } // Factorial(start)
The page claims there is a tail-call here, which is incorrect; there is none. As I said, a tail-call is when a CALL is immediately followed by RET. I'll just use pseudo-assembly for simplicity (since CALL, RET and JMP work more or less the same way on most architectures, it doesn't matter much):

Code: Select all

CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
As you can see there is a MUL between CALL and RET. As a result, there is no tail-call. The same logic also applies to the above code, as that code will also result in a MUL between CALL and RET.
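
The same thing can be seen without leaving C. Here is a minimal sketch that writes out the hidden temporary; the name FactorialExplicit and the assert are my own additions for illustration, not something from the wiki page:

```c
#include <assert.h>

/* Same computation as the wiki's Factorial, with the temporary spelled out.
 * FactorialExplicit is a hypothetical name used only for this sketch. */
unsigned long long FactorialExplicit(unsigned start) {
    assert(start <= 20);   /* 21! no longer fits in 64 bits */
    if (start <= 1) {
        return 1;
    }
    /* The CALL happens here... */
    unsigned long long rest = FactorialExplicit(start - 1);
    /* ...but this multiplication (the MUL) still has to run afterwards,
     * so the current frame must stay alive: not a tail call. */
    return rest * start;
}
```

Written this way, it is hard to miss that the recursive call is not the last thing the function does.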

Now an example of a tail-call would include:

Code: Select all

function foo(data) {
    a(data);
    return b(data);
}
In assembly, this is:

Code: Select all

PUSH data
CALL a
PUSH data #assuming this is needed
CALL b
RET
Now this can be tail-call optimized, by combining the CALL and RET to simply JMP.

Code: Select all

PUSH data
CALL a
PUSH data #assuming this is needed
JMP b
The new code behaves in exactly the same way, with one minor and desirable difference: it won't push a return address (IP/PC) onto the stack, so b returns directly to foo's caller. That is a great thing if the tail-call is also a recursive call (though it isn't in this case).
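
To show why this matters when the tail-call is also recursive, here is a small sketch in C. The functions gcd_recursive, gcd_loop and gcd_selfcheck are hypothetical examples of mine, and gcd_loop is only roughly what the optimized code amounts to. C compilers are not required to perform this optimization, so whether you actually get it depends on the compiler and its settings.

```c
#include <assert.h>

/* The recursive call is the very last action, so it is both a tail call
 * and a recursive call. */
unsigned gcd_recursive(unsigned a, unsigned b) {
    if (b == 0) {
        return a;
    }
    return gcd_recursive(b, a % b);   /* CALL immediately followed by RET */
}

/* Roughly what the function amounts to once CALL+RET becomes JMP:
 * the arguments are overwritten in place and control goes back to the top. */
unsigned gcd_loop(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned next_b = a % b;
        a = b;
        b = next_b;
    }
    return a;
}

/* Small self-check that both forms agree on one input. */
void gcd_selfcheck(void) {
    assert(gcd_recursive(48, 18) == 6);
    assert(gcd_loop(48, 18) == 6);
}
```

Without the optimization, every level of gcd_recursive pushes another return address; with it, the stack does not grow at all.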

Contrary to what the said page claims, a tail call doesn't need to be at the tail of the code:

Code: Select all

function bar(data) {
    if ( a(data) ) {
        return b(data);
    }
    return c(data);
}
In assembly:

Code: Select all

PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
CALL b
RET
_endofblock:
PUSH data
CALL c
RET
Here, both b and c are tail-calls even though only c is at the tail, because both result in a case where CALL and RET follow each other, and can be optimized to:

Code: Select all

PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
JMP b
_endofblock:
PUSH data
JMP c
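
The same shape shows up in ordinary recursive code. A minimal sketch in C, with a hypothetical function collatz_steps that I made up for illustration (call it with steps set to 0):

```c
#include <assert.h>

/* Counts Collatz steps until n reaches 1.
 * `steps` is an accumulator carrying the count so far. */
unsigned collatz_steps(unsigned long long n, unsigned steps) {
    assert(n >= 1);
    if (n == 1) {
        return steps;
    }
    if (n % 2 == 0) {
        return collatz_steps(n / 2, steps + 1);   /* tail call, not at the tail */
    }
    return collatz_steps(3 * n + 1, steps + 1);   /* tail call, at the tail */
}
```

Both recursive calls are tail-calls, because in each branch nothing is left to do once the call returns.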
Another example, where the call isn't at the tail, but nonetheless is a tail-call:

Code: Select all

function foo()
{
       int myInteger = bar();
       return myInteger;
}
In assembly:

Code: Select all

CALL bar
MOV myInteger, return_value
MOV return_value, myInteger
RET
We can first optimize the unnecessary MOVs, getting rid of myInteger:

Code: Select all

CALL bar
RET
And then simply apply tail-call optimization:

Code: Select all

JMP bar
Conversely, even if the statement is at the tail, it may not be a tail-call, and the code on that page is a great example of this:

Code: Select all

unsigned long long Factorial(unsigned start) {
    if (start <= 1) {
       return 1;
    } // if
    return start * Factorial(start - 1);
 } // Factorial(start)

Code: Select all

CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
As you can see, this code results in a MUL between CALL and RET, therefore there is no tail-call, even though the call is actually on the tail of the function.
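
For comparison, here is a sketch of how the factorial could be rewritten so that the recursive call really is a tail-call. The names FactorialAcc and FactorialTail and the extra accumulator parameter are assumptions of mine, not something taken from the wiki page:

```c
#include <assert.h>

/* Hypothetical helper: `acc` carries the product computed so far. */
static unsigned long long FactorialAcc(unsigned start, unsigned long long acc) {
    if (start <= 1) {
        return acc;
    }
    /* The multiplication happens BEFORE the call, while building the
     * argument, so nothing is left to do after the call returns. */
    return FactorialAcc(start - 1, acc * start);
}

unsigned long long FactorialTail(unsigned start) {
    assert(start <= 20);   /* 21! overflows 64 bits */
    return FactorialAcc(start, 1);
}
```

Here the MUL comes before the CALL instead of after it, so CALL and RET end up next to each other and can be combined into a JMP.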

Now of course, this also depends on the exact architecture, but as I said, the way the CALL, RET and JMP instructions work is usually the same, so most of the above pseudo-assembly examples would still apply to most real architectures.

If you, the person reading this, are a moderator or anyone else with the privilege to modify the wiki, I ask of you to correct this page, please. Thanks in advance.

Source: https://en.wikipedia.org/wiki/Tail_call
Last edited by Clover5411 on Sat Jul 27, 2019 11:22 am, edited 1 time in total.
xenos
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia

Re: An incorrect information in the Wiki

Post by xenos »

Actually, everybody can edit the wiki, so you could do it too if you want to. To get edit rights, you just need to go to the User Control Panel here in the forum, then to the tab User Groups, and join the wiki group.

Of course, you're right - nicely spotted!
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS
Clover5411
Member
Posts: 25
Joined: Sat Jul 27, 2019 9:41 am

Re: An incorrect information in the Wiki

Post by Clover5411 »

Oh, didn't know everyone could do it. Thanks. :D
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: An incorrect information in the Wiki

Post by Schol-R-LEA »

I wasn't even aware that there was a page on this here. According to the history, it was posted in January of this year, and I am guessing no one except the OP (Johnburger) had noticed it until now.

FlandreScarlet: You're right, the example as given is not a tail call at all. The OP seems to think that the position of the call on the line of source code is what is relevant, which is not at all the case - it has to do with it being the last action in the generated code before the function exits and returns. The real relevant factor is whether the activation record (or stack frame, or local environment - take your pick, those terms all amount to almost the same thing) on the call stack can be reused without losing any necessary information, and in this case, the answer is no.

There is a related optimization, referred to as 'tail recursion modulo cons', which could be applied here, but it is significantly more complex for compiler writers to implement. While a discussion of it might be appropriate, as it stands the example is entirely incorrect.
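
The name comes from list-building code, where the only thing left to do after the recursive call is a cons. As a rough hand-written illustration of the idea in C (not of how any particular compiler implements it), here is a list copy before and after the transformation; Node, copy_naive and copy_into are hypothetical names chosen for this sketch:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Not a tail call: the result still has to be stored into n->next
 * after the recursive call returns. */
Node *copy_naive(const Node *src) {
    if (src == NULL) {
        return NULL;
    }
    Node *n = malloc(sizeof *n);
    assert(n != NULL);                /* sketch only: no real error handling */
    n->value = src->value;
    n->next = copy_naive(src->next);
    return n;
}

/* Destination-passing form: the caller hands over the slot to fill,
 * so the recursive call becomes the last action. */
void copy_into(const Node *src, Node **dest) {
    if (src == NULL) {
        *dest = NULL;
        return;
    }
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = src->value;
    *dest = n;                        /* link the new node in first */
    copy_into(src->next, &n->next);   /* tail call */
}
```

The new node is linked into place before the recursive call rather than after it, which is what frees the call to be a tail-call.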

The funny thing is, in Scheme textbooks (which I am guessing is where Johnburger got this, as it seems like a garbled version of a common example), the reverse is usually given, replacing this linear recursion:

Code: Select all

; yes, I am aware that factorial is only defined for positive integers, 
; but I wanted to keep it simple
(define (factorial-1 n)
  (if (<= n 1)
      1
      (* n (factorial-1 (- n 1)))))
with this 'linear iteration' (i.e., a recursion suitable for TCO, which in Scheme terms is considered iteration because it gets optimized into one):

Code: Select all

; note that the 'named let' basically creates a special-purpose internal function;
; it's that function 'fact' which is recursing in this case.
(define (factorial-2 n)
  (let fact ((product 1)
             (counter 1))
    (if (> counter n)
        product
        (fact (* product counter) (+ 1 counter)))))
However, this approach to iteration isn't necessary, or even particularly applicable, in languages which have built-in iterative operators such as while() and for() (or even a standard iteration macro, such as Common Lisp's (loop)). In fact, even Scheme has one, (do), though it's rather odd:

Code: Select all

(define (factorial-3 n)
  (do ((counter 1 (+ 1 counter))
       (product 1 (* product counter)))
      ((> counter n) product)
  ;; note that there is no loop body in this case,
  ;; as the iteration clauses do all the heavy lifting
  )) 
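
For readers more at home in C, the (do) form above corresponds roughly to a plain for loop. This is only a sketch; factorial_loop is a hypothetical name, and the assert is my addition to keep the result within 64 bits:

```c
#include <assert.h>

/* Rough C counterpart of factorial-3: a counter and a running product. */
unsigned long long factorial_loop(unsigned n) {
    assert(n <= 20);   /* 21! overflows 64 bits */
    unsigned long long product = 1;
    for (unsigned counter = 1; counter <= n; ++counter) {
        product *= counter;
    }
    return product;
}
```
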
In any case, actual idiomatic Scheme would really do this (she said, tongue firmly in cheek) with an accumulator applied to a stream (a lazy list). Assuming one got that far in learning the language, that is. Note that standardization of SRFI library names between implementations is... problematic.

Code: Select all

#!r6rs
; at this point I have probably lost everyone anyway, so a detailed explanation is probably fruitless.
; I will mention that I've tested this in Guile, so at least one implementation can use it...
(import (srfi :41))

(define (factorial-4 n)
  (if (<= n 0)
      1
      (let ((range (stream-range 1 (+ 1 n))))
        (stream-fold
         (lambda (x y)
           (* x y))
         (stream-car range) (stream-cdr range)))))
Enough of that, I think.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Clover5411
Member
Posts: 25
Joined: Sat Jul 27, 2019 9:41 am

Re: An incorrect information in the Wiki

Post by Clover5411 »

Interesting. Thank you for putting this information here.

Although factorial isn't strictly defined only for positive integers.

Usually, it is also defined for zero where:

0! = 1

There is also a more generalized definition using the gamma function, which can also be applied to non-integers.

So it depends on the exact definition for factorial you are using.

Not really an important matter, but nerd instincts kicked in. :mrgreen:
linguofreak
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: An incorrect information in the Wiki

Post by linguofreak »

FlandreScarlet wrote: Interesting. Thank you for putting this information here.

Although factorial isn't strictly defined only for positive integers.

Usually, it is also defined for zero where:

0! = 1

There is also a more generalized definition using the gamma function, which can also be applied to non-integers.

So it depends on the exact definition for factorial you are using.

Not really an important matter, but nerd instincts kicked in. :mrgreen:
The really interesting thing is that if you use the definition x! = gamma(x+1), the only complex numbers for which factorial is not defined are real integers (specifically, the negative integers).

So strictly speaking, over any domain that is a proper superset of the integers, factorial is defined for numbers other than the naturals, but over the integers themselves it is defined only for the naturals.
Clover5411
Member
Posts: 25
Joined: Sat Jul 27, 2019 9:41 am

Re: An incorrect information in the Wiki

Post by Clover5411 »

Okay, so this took me way longer than it should have. The reason being a mixture of laziness and fearing my own ineptitude. That being said, I finally had both the motivation and confidence to put up a rudimentary explanation of tail call optimization. Unfortunately, I couldn't get into details about prologues and epilogues, and how they would affect optimization. I do have some idea of how it could work, but I don't want to add misinformation.

I suppose I should also ask, can I even delete stuff from the main page? I don't know how to do that, so I took what felt like the most reasonable action at the time; but in retrospect, it was a bit childish. (EDIT: Never mind, it was surprisingly simple... Truth be told, the temptation to delete that page and this topic is strong...)
SeaLiteral
Posts: 19
Joined: Wed Sep 27, 2017 1:44 pm

Re: An incorrect information in the Wiki

Post by SeaLiteral »

KineticManiac wrote: Okay, so this took me way longer than it should have. The reason being a mixture of laziness and fearing my own ineptitude.
I haven't had the confidence to change much on the wiki myself either. There's even this "if someone writes something and I make an edit to it, how will the person that initially wrote the text feel?" feeling that I probably shouldn't have that much of but I do. I feel like I know of several mistakes in the interrupts tutorial, but even though I see them as mistakes (or in one case, just doing a thing without properly explaining why you're doing it that way) I don't really feel like I have the confidence to change what someone else wrote.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: An incorrect information in the Wiki

Post by Solar »

Do, do, do edit! That's the lifeblood of a Wiki.

Over the years, many people who have written a lot in the Wiki drop out of the hobby, partially or completely. If you hold whatever they wrote sacred, sooner or later the Wiki will wither and die.

Those who wrote into the Wiki before you might have been before you, but that doesn't make them Elders. A Wiki is a cooperation thing.

In the beginning, it was one static HTML page with a couple of good hints. It grew to what it is today only because we got people to contribute. Please, continue in that vein.

By posting to the Wiki, every author has agreed to have his / her work modified later on. The Wiki holds a history, and if your edit is considered faulty, there are discussion pages as well as the ability to revert changes, wholly or partially. You cannot "destroy" information in the Wiki.

So... edit. Please.
Every good solution is obvious once you've found it.
BigBuda
Member
Posts: 104
Joined: Fri Sep 03, 2021 5:20 pm

Re: An incorrect information in the Wiki

Post by BigBuda »

Solar wrote: By posting to the Wiki, every author has agreed to have his / her work modified later on. The Wiki holds a history, and if your edit is considered faulty, there are discussion pages as well as the ability to revert changes, wholly or partially. You cannot "destroy" information in the Wiki.
This is also a very good reason to use citations profusely, so that information can be cross-referenced and validated, and if any source is incorrect, the corrections can be escalated upstream in order to reduce misinformation.

But yes, wikis should be ever evolving with new information, new knowledge, corrections, adaptations to more contemporary situations and so forth.
Writing a bootloader in under 15 minutes: https://www.youtube.com/watch?v=0E0FKjvTA0M