Page 1 of 1

An incorrect information in the Wiki

Posted: Sat Jul 27, 2019 10:29 am
by Clover5411
Well, I'm not sure if this is the right place for this, but basically there is an incorrect information in the wiki. It's in the page "Tail Recursion and Tail Call Optimization" (https://wiki.osdev.org/Tail_Recursion_a ... timization)

In this page, it's written that a tail-call is when a call occurs in the end of a function. This is incorrect, a tail-call is when, in assembly, a CALL occurs before RET.

Code: Select all

 unsigned long long Factorial(unsigned start) {
    if (start > 1) {
       return Factorial(start - 1) * start;
    } // if
    return 1;
 } // Factorial(start)
Now, according to the page, there is no tail-call here (which is correct, but for a very different reason. I'll get back to this)

Now the page also writes:

Code: Select all

unsigned long long Factorial(unsigned start) {
    if (start <= 1) {
       return 1;
    } // if
    return start * Factorial(start - 1);
 } // Factorial(start)
The page claims there is a tail-call here, which is incorrect, there is none. As I said a tail-call is when a CALL occurs before RET. I'll just use pseudo-assembly for simplicity (since CALL, RET and JMP are more or less the same in most architectures it doesn't matter much):

Code: Select all

CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
As you can see there is a MUL between CALL and RET. As a result, there is no tail-call. The same logic also applies to the above code, as that code will also result in a MUL between CALL and RET.

Now an example of a tail-call would include:

Code: Select all

function foo(data) {
    a(data);
    return b(data);
}
In assembly, this is:

Code: Select all

PUSH data
CALL a
PUSH data #assuming this is needed
CALL b
RET
Now this can be tail-call optimized, by combining the CALL and RET to simply JMP.

Code: Select all

PUSH data
CALL a
PUSH data #assuming this is needed
JMP b
The new code will behave in the exact same way, with a minor and desirable difference, it won't push IP/PC to the stack, which is a great thing if the said tail-call is also a recursive call (though it isn't in this case)

Contrary to what the said page claims, a tail call doesn't need to be at the tail of the code:

Code: Select all

function bar(data) {
    if ( a(data) ) {
        return b(data);
    }
    return c(data);
}
In assembly:

Code: Select all

PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
CALL b
RET
_endofblock:
PUSH data
CALL c
RET
Here, both b and c are tail-calls even though only c is at the tail, because both result in a case where CALL and RET follow each other, and can be optimized to:

Code: Select all

PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
JMP b
_endofblock:
PUSH data
JMP c
Another example, where the call isn't at the tail, but nonetheless is a tail-call:

Code: Select all

function foo()
{
       int myInteger = bar();
       return myInteger;
}
In assembly:

Code: Select all

CALL bar
MOV myInteger, return_value
MOV return_value, myInteger
RET
We can first optimize the unnecessary MOVs, getting rid of myInteger:

Code: Select all

CALL bar
RET
And then simply apply tail-call optimization:

Code: Select all

JMP bar
On the contrary, even if the statement is at the tail, it may not be a tail-call, and the code in the said page is a great example for that:

Code: Select all

unsigned long long Factorial(unsigned start) {
    if (start <= 1) {
       return 1;
    } // if
    return start * Factorial(start - 1);
 } // Factorial(start)

Code: Select all

CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
As you can see, this code results in a MUL between CALL and RET, therefore there is no tail-call, even though the call is actually on the tail of the function.

Now of course, this also depends on the exact architecture, but as I said, the way the instructions CALL, RET and JMP work are usually the same, so most of the above examples with pseudo-assembly would still work on most real architectures.

If you, the person reading this, are a moderator or anyone else with the privilege to modify the wiki, I ask of you to correct this page, please. Thanks in advance.

Source: https://en.wikipedia.org/wiki/Tail_call

Re: An incorrect information in the Wiki

Posted: Sat Jul 27, 2019 11:12 am
by xenos
Actually everybody can edit the wiki, so also you could do it if you want to. To get edit rights, you just need to go to the User Control Panel here in the forum, then to the tab User Groups, and join the wiki group.

Of course, you're right - nicely spotted!

Re: An incorrect information in the Wiki

Posted: Sat Jul 27, 2019 11:18 am
by Clover5411
Oh, didn't know everyone could do it. Thanks. :D

Re: An incorrect information in the Wiki

Posted: Sun Jul 28, 2019 2:44 pm
by Schol-R-LEA
I wasn't even aware that there was a page on this here. According to the history, it was posted in January of this year, and I am guessing no one except the OP (Johnburger) had noticed it until now.

FlandreScarlet: You're right, the example as given is not a tail call at all. The OP seems to think that the position of the call on the line of source code is what is relevant, which is not at all the case - it has to do with it being the last action in the generated code before the function exits and returns. The real relevant factor is whether the activation record (or stack frame, or local environment - take your pick, those terms all amount to almost the same thing) on the call stack can be reused without losing any necessary information, and in this case, the answer is no.

There is a related optimization which is referred to as a 'tail recursion modulo cons' which could be applied here, but it is significantly more complex to for the compiler writers to implement. While a discussion of it might be appropriate, as it is the example is entirely incorrect.

The funny thing is, in Scheme textbooks (which I am guessing where Johnburger got this, as it seems like a garbled version of a common example), the reverse is usually given, replacing this linear recursion:

Code: Select all

; yes, I am aware that factorial is only defined for positive integers, 
; but I wanted to keep it simple
(define (factorial-1 n)
  (if (<= n 1)
    1
     (* n (factorial-1 (- n 1)))))
with this 'linear iteration' (i.e., a recursion suitable for TCO, which in Scheme terms is considered iteration because it gets optimized into one):

Code: Select all

; note that the 'named let' basically creates a special-purpose internal function;
; it's that function 'fact' which is recursing in this case.
(define (factorial-2 n)
  (let fact ((product 1)
               (counter 1))
    (if (> counter n)
        product
        (fact (* product counter) (+ 1 counter))))) 
However, this approach to iteration isn't necessary, or even particularly applicable, in languages which have built-in iterative operators such as while() and for() (or even a standard iteration macro, such as Common Lisp's (loop)). In fact, even Scheme has one, (do), though it's rather odd:

Code: Select all

(define (factorial-3 n)
  (do ((counter 1 (+ 1 counter))
       (product 1 (* product counter)))
      ((> counter n) product)
  ;; note that there is no loop body in this case,
  ;; as the iteration clauses do all the heavy lifting
  )) 
In any case, actual idiomatic Scheme would really be do this (she said, tongue firmly in cheek) with an accumulator applied to a stream (a lazy list). Assuming one got that far in learning the language, that is. Note that standardization of SRFI library names between implementations is... problematic.

Code: Select all

#!r6rs
; at this point I have probably lost everyone anyway, so a detailed explanation is probably fruitless.
; I will mention that I've tested this in Guile, so at least one implementation can use it...
(import (srfi :41))

(define (factorial-4 n)
  (if (<= n 0)
      1
      (let ((range (stream-range 1 (+ 1 n))))
        (stream-fold
         (lambda (x y)
           (* x y))
         (stream-car range) (stream-cdr range)))))
Enough of that, I think.

Re: An incorrect information in the Wiki

Posted: Sun Jul 28, 2019 8:05 pm
by Clover5411
Interesting. Thank you for putting this information here.

Although factorial isn't strictly defined only for positive integers.

Usually, it is also defined for zero where:

0! = 1

There is also more generalized definition with the gamma function, which can also be applied to non-integers.

So it depends on the exact definition for factorial you are using.

Not really an important matter, but nerd instincts kicked in. :mrgreen:

Re: An incorrect information in the Wiki

Posted: Wed Aug 28, 2019 10:37 pm
by linguofreak
FlandreScarlet wrote:Interesting. Thank you for putting this information here.

Although factorial isn't strictly defined only for positive integers.

Usually, it is also defined for zero where:

0! = 1

There is also more generalized definition with the gamma function, which can also be applied to non-integers.

So it depends on the exact definition for factorial you are using.

Not really an important matter, but nerd instincts kicked in. :mrgreen:
The really interesting thing is that if you use the definition factorial=gamma(x+1), the only complex numbers for which factorial is not defined are real integers (specifically, the negative integers).

So strictly speaking, factorial is defined for other numbers than the natural numbers over any domain that is a superset of the real integers, but over the real integers is defined only for the naturals.

Re: An incorrect information in the Wiki

Posted: Tue Mar 28, 2023 12:08 am
by Clover5411
Okay, so this took me way longer than it should have. The reason being a mixture of laziness and fearing my own ineptitude. That being said I finally had both the motivation and confidence to put up a rudimentary explanation of tail call optimization. Unfortunately, I couldn't get into details about prologues and epilogues, and how this would effect optimization. I do have some idea on how it could work, but I don't want to add misinformation.

I suppose I should also ask, can I even delete stuff from the main page? I don't know how to do that, so I took what felt like the most reasonable action at the time; but in retrospect, it was a bit childish. (EDIT: Nevermind, it was surprisingly simple... Truth be told, temptation to delete that page and this topic is strong...)

Re: An incorrect information in the Wiki

Posted: Thu Mar 30, 2023 2:12 pm
by SeaLiteral
KineticManiac wrote:Okay, so this took me way longer than it should have. The reason being a mixture of laziness and fearing my own ineptitude.
I haven't had the confidence to change much on the wiki myself either. There's even this "if someone writes something and I make an edit to it, how will the person that initially wrote the text feel?" feeling that I probably shouldn't have that much of but I do. I feel like I know of several mistakes in the interrupts tutorial, but even though I see them as mistakes (or in one case, just doing a thing without properly explaining why you're doing it that way) I don't really feel like I have the confidence to change what someone else wrote.

Re: An incorrect information in the Wiki

Posted: Tue May 30, 2023 4:38 am
by Solar
Do, do, do edit! That's the lifeblood of a Wiki.

Over the years, many people who have written a lot in the Wiki drop out of the hobby, partially or completely. If you hold whatever they wrote sacred, sooner or later the Wiki will wither and die.

Those who wrote into the Wiki before you might have been before you, but that doesn't make them Elders. A Wiki is a cooperation thing.

In the beginning, it was one static HTML page with a couple of good hints. It grew to what it is today only because we got people to contribute. Please, continue in that vein.

By posting to the Wiki, every author has agreed to have his / her work modified later on. The Wiki holds a history, and if your edit is considered faulty, there are discussion pages as well as the ability to revert changes, wholly or partially. You cannot "destroy" information in the Wiki.

So... edit. Please.

Re: An incorrect information in the Wiki

Posted: Tue May 30, 2023 6:28 am
by BigBuda
Solar wrote: By posting to the Wiki, every author has agreed to have his / her work modified later on. The Wiki holds a history, and if your edit is considered faulty, there are discussion pages as well as the ability to revert changes, wholly or partially. You cannot "destroy" information in the Wiki.
This is also a very good reason to use citations profusely, so that information can be cross-referenced and validated, and if any source is incorrect, the corrections can be escalated upstream in order to reduce misinformation.

But yes, wikis should be ever evolving with new information, new knowledge, corrections, adaptations to more contemporary situations and so forth.