An incorrect information in the Wiki
Posted: Sat Jul 27, 2019 10:29 am
Well, I'm not sure if this is the right place for this, but basically there is an incorrect information in the wiki. It's in the page "Tail Recursion and Tail Call Optimization" (https://wiki.osdev.org/Tail_Recursion_a ... timization)
In this page, it's written that a tail-call is when a call occurs in the end of a function. This is incorrect, a tail-call is when, in assembly, a CALL occurs before RET.
Now, according to the page, there is no tail-call here (which is correct, but for a very different reason. I'll get back to this)
Now the page also writes:
The page claims there is a tail-call here, which is incorrect, there is none. As I said a tail-call is when a CALL occurs before RET. I'll just use pseudo-assembly for simplicity (since CALL, RET and JMP are more or less the same in most architectures it doesn't matter much):
As you can see there is a MUL between CALL and RET. As a result, there is no tail-call. The same logic also applies to the above code, as that code will also result in a MUL between CALL and RET.
Now an example of a tail-call would include:
In assembly, this is:
Now this can be tail-call optimized, by combining the CALL and RET to simply JMP.
The new code will behave in the exact same way, with a minor and desirable difference, it won't push IP/PC to the stack, which is a great thing if the said tail-call is also a recursive call (though it isn't in this case)
Contrary to what the said page claims, a tail call doesn't need to be at the tail of the code:
In assembly:
Here, both b and c are tail-calls even though only c is at the tail, because both result in a case where CALL and RET follow each other, and can be optimized to:
Another example, where the call isn't at the tail, but nonetheless is a tail-call:
In assembly:
We can first optimize the unnecessary MOVs, getting rid of myInteger:
And then simply apply tail-call optimization:
On the contrary, even if the statement is at the tail, it may not be a tail-call, and the code in the said page is a great example for that:
As you can see, this code results in a MUL between CALL and RET, therefore there is no tail-call, even though the call is actually on the tail of the function.
Now of course, this also depends on the exact architecture, but as I said, the way the instructions CALL, RET and JMP work are usually the same, so most of the above examples with pseudo-assembly would still work on most real architectures.
If you, the person reading this, are a moderator or anyone else with the privilege to modify the wiki, I ask of you to correct this page, please. Thanks in advance.
Source: https://en.wikipedia.org/wiki/Tail_call
In this page, it's written that a tail-call is when a call occurs in the end of a function. This is incorrect, a tail-call is when, in assembly, a CALL occurs before RET.
Code: Select all
unsigned long long Factorial(unsigned start) {
if (start > 1) {
return Factorial(start - 1) * start;
} // if
return 1;
} // Factorial(start)
Now the page also writes:
Code: Select all
unsigned long long Factorial(unsigned start) {
if (start <= 1) {
return 1;
} // if
return start * Factorial(start - 1);
} // Factorial(start)
Code: Select all
CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
Now an example of a tail-call would include:
Code: Select all
function foo(data) {
a(data);
return b(data);
}
Code: Select all
PUSH data
CALL a
PUSH data #assuming this is needed
CALL b
RET
Code: Select all
PUSH data
CALL a
PUSH data #assuming this is needed
JMP b
Contrary to what the said page claims, a tail call doesn't need to be at the tail of the code:
Code: Select all
function bar(data) {
if ( a(data) ) {
return b(data);
}
return c(data);
}
Code: Select all
PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
CALL b
RET
_endofblock:
PUSH data
CALL c
RET
Code: Select all
PUSH data
CALL a
TEST return_value
JMP.FALSE _endofblock
PUSH data
JMP b
_endofblock:
PUSH data
JMP c
Code: Select all
function foo()
{
int myInteger = bar();
return myInteger;
}
Code: Select all
CALL bar
MOV myInteger, return_value
MOV return_value, myInteger
RET
Code: Select all
CALL bar
RET
Code: Select all
JMP bar
Code: Select all
unsigned long long Factorial(unsigned start) {
if (start <= 1) {
return 1;
} // if
return start * Factorial(start - 1);
} // Factorial(start)
Code: Select all
CMP start,1
JMP.HI _endofblock
MOV return_value, 1
RET
_endofblock:
SUB tmp, start, 1
PUSH tmp
CALL Factorial
MUL return_value, return_value, start
RET
Now of course, this also depends on the exact architecture, but as I said, the way the instructions CALL, RET and JMP work are usually the same, so most of the above examples with pseudo-assembly would still work on most real architectures.
If you, the person reading this, are a moderator or anyone else with the privilege to modify the wiki, I ask of you to correct this page, please. Thanks in advance.
Source: https://en.wikipedia.org/wiki/Tail_call