Extrange function behaviour (optimizer, i guess)
-
- Posts: 11
- Joined: Mon Aug 08, 2016 7:56 am
Extrange function behaviour (optimizer, i guess)
Hi, these days I have been implementing a printf function, however sometimes it does strange things:
When i do something like:
printf("This year is 2016\n");
printf("Success!!");
It prints this:
This year is 2016Success!!
But if i do it using the printf return value it works fine:
int rv = printf("This year is 2016\n");
printf("Success!!");
It prints this:
This year is 2016
Success!!
Any idea of why is this happening?
When i do something like:
printf("This year is 2016\n");
printf("Success!!");
It prints this:
This year is 2016Success!!
But if i do it using the printf return value it works fine:
int rv = printf("This year is 2016\n");
printf("Success!!");
It prints this:
This year is 2016
Success!!
Any idea of why is this happening?
Re: Extrange function behaviour (optimizer, i guess)
Open your binary in a disassembler and start reading the assembly?
Compile with -O0 just to see if its really the compiler
Compile with -O0 just to see if its really the compiler
-
- Member
- Posts: 5587
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Extrange function behaviour (optimizer, i guess)
GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Extrange function behaviour (optimizer, i guess)
@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string. However, this may still be a useful clue as to the cause, if the printf() implementation in question treats medial newlines differently from trailing newlines.
@DevilGladiator: You should be able to come up with a few test cases for your printf() implementation given what octocontrabass just said; see if you can pin down the behavior of the function given different strings, with different escape codes in different positions. You might want to post the code - or better still, a link to your remote code repo (you do have your code under remote version control, at a host such as GitHub or CloudForge, don't you? If not, get it under it ASAP - seriously, drop everything and do it now, every minute you wait is a chance that you'll lose code).
Also, it sounds as if you are implementing the printf() as kernel code, and as a single piece, writing to the text buffer directly. This is a Bad Idea, for a number of reasons; while it isn't catastrophic in the early part of kernel design, it will come to haunt you later. My recommendations on this are a) rename all the functions to something like kprintf()/kputs() or printk()/putk(), so that it is immediately clear that they are the kernel implementations, and b) separate the string-formatting code and the text-display code into an ksprintf() and kputs(), respectively, and have the kprintf() function call those. It is a common mistake we've seen often enough here, so you shouldn't beat yourself up over it, but you do want to learn from our mistakes when you can.
kputs() in turn should call a kputchar() when it gets a string with just a few characters, but with larger strings should format them to a local buffer (since the standard text buffer consists of character-attribute pairs) and using a memory-to-memory move. It doesn't have to be a full double buffer, though that may be advisable.
On a related note, take into consideration the likely need for buffered output, and the possibility that you will have to redirect output to different text pages or even to something other than the console (e.g., a log file). Even in the kernel, you want to separate as many operations as you feasibly can to make it easier to compose them in multiple ways later, and only combining them later if absolutely necessary for performance reasons.
@DevilGladiator: You should be able to come up with a few test cases for your printf() implementation given what octocontrabass just said; see if you can pin down the behavior of the function given different strings, with different escape codes in different positions. You might want to post the code - or better still, a link to your remote code repo (you do have your code under remote version control, at a host such as GitHub or CloudForge, don't you? If not, get it under it ASAP - seriously, drop everything and do it now, every minute you wait is a chance that you'll lose code).
Also, it sounds as if you are implementing the printf() as kernel code, and as a single piece, writing to the text buffer directly. This is a Bad Idea, for a number of reasons; while it isn't catastrophic in the early part of kernel design, it will come to haunt you later. My recommendations on this are a) rename all the functions to something like kprintf()/kputs() or printk()/putk(), so that it is immediately clear that they are the kernel implementations, and b) separate the string-formatting code and the text-display code into an ksprintf() and kputs(), respectively, and have the kprintf() function call those. It is a common mistake we've seen often enough here, so you shouldn't beat yourself up over it, but you do want to learn from our mistakes when you can.
kputs() in turn should call a kputchar() when it gets a string with just a few characters, but with larger strings should format them to a local buffer (since the standard text buffer consists of character-attribute pairs) and using a memory-to-memory move. It doesn't have to be a full double buffer, though that may be advisable.
On a related note, take into consideration the likely need for buffered output, and the possibility that you will have to redirect output to different text pages or even to something other than the console (e.g., a log file). Even in the kernel, you want to separate as many operations as you feasibly can to make it easier to compose them in multiple ways later, and only combining them later if absolutely necessary for performance reasons.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
-
- Member
- Posts: 5587
- Joined: Mon Mar 25, 2013 7:01 pm
Re: Extrange function behaviour (optimizer, i guess)
And if it's replacing one of the printf() calls with puts(), then the real problem is puts() not inserting a newline like it's supposed to.Schol-R-LEA wrote:@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string.
- Schol-R-LEA
- Member
- Posts: 1925
- Joined: Fri Oct 27, 2006 9:42 am
- Location: Athens, GA, USA
Re: Extrange function behaviour (optimizer, i guess)
Just so.
To clarify on the last part of my previous post: what I should have said more explicitly is that even the kputs() probably shouldn't be writing to the text buffer directly, but should instead prepare the text buffers and then perform a case analysis (i.e., a switch()) on which low-level text output function it should call. You may want to short-circuit this for some cases later, but for now having a low-level show_text() which takes a prepared buffer and a struct representing the current text page and it's state (# of rows, # of columns, background color, text attribute at cursor, text attribute mode, cursor attributes, cursor x and y positions, overflow mode, and so forth) would be a good way to separate those concerns, though the separation would not be total, given the nature of these things.
One more thing: this is all regarding the kernel level interface with the hardware, and the show_text() function will probably become the interface between the kernel and a console driver eventually. The kernel system calls should not expose the kprintf(), etc. functions, but should expose a front-end to that driver interface. The userland libraries will use those, rather than the kernel-level formatting functions, for their own formatted I/O functions.
To clarify on the last part of my previous post: what I should have said more explicitly is that even the kputs() probably shouldn't be writing to the text buffer directly, but should instead prepare the text buffers and then perform a case analysis (i.e., a switch()) on which low-level text output function it should call. You may want to short-circuit this for some cases later, but for now having a low-level show_text() which takes a prepared buffer and a struct representing the current text page and it's state (# of rows, # of columns, background color, text attribute at cursor, text attribute mode, cursor attributes, cursor x and y positions, overflow mode, and so forth) would be a good way to separate those concerns, though the separation would not be total, given the nature of these things.
One more thing: this is all regarding the kernel level interface with the hardware, and the show_text() function will probably become the interface between the kernel and a console driver eventually. The kernel system calls should not expose the kprintf(), etc. functions, but should expose a front-end to that driver interface. The userland libraries will use those, rather than the kernel-level formatting functions, for their own formatted I/O functions.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
-
- Posts: 11
- Joined: Mon Aug 08, 2016 7:56 am
Re: Extrange function behaviour (optimizer, i guess)
Well at the moment i have implemented all the printf -> vprintf -> vfprintf -> fputs -> write sequence, but I don't have real system calls or a virtual file system so when i call write with the file descriptor 2 it justs prints the content of the string, on the other hand I keep using the same structure of the meaty skeleton tutorial (the libc generates a libk for the kernel and a libcc for user mode)Schol-R-LEA wrote:@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string. However, this may still be a useful clue as to the cause, if the printf() implementation in question treats medial newlines differently from trailing newlines.
@DevilGladiator: You should be able to come up with a few test cases for your printf() implementation given what octocontrabass just said; see if you can pin down the behavior of the function given different strings, with different escape codes in different positions. You might want to post the code - or better still, a link to your remote code repo (you do have your code under remote version control, at a host such as GitHub or CloudForge, don't you? If not, get it under it ASAP - seriously, drop everything and do it now, every minute you wait is a chance that you'll lose code).
Also, it sounds as if you are implementing the printf() as kernel code, and as a single piece, writing to the text buffer directly. This is a Bad Idea, for a number of reasons; while it isn't catastrophic in the early part of kernel design, it will come to haunt you later. My recommendations on this are a) rename all the functions to something like kprintf()/kputs() or printk()/putk(), so that it is immediately clear that they are the kernel implementations, and b) separate the string-formatting code and the text-display code into an ksprintf() and kputs(), respectively, and have the kprintf() function call those. It is a common mistake we've seen often enough here, so you shouldn't beat yourself up over it, but you do want to learn from our mistakes when you can.
kputs() in turn should call a kputchar() when it gets a string with just a few characters, but with larger strings should format them to a local buffer (since the standard text buffer consists of character-attribute pairs) and using a memory-to-memory move. It doesn't have to be a full double buffer, though that may be advisable.
On a related note, take into consideration the likely need for buffered output, and the possibility that you will have to redirect output to different text pages or even to something other than the console (e.g., a log file). Even in the kernel, you want to separate as many operations as you feasibly can to make it easier to compose them in multiple ways later, and only combining them later if absolutely necessary for performance reasons.
Anyways here is my printf code in case someone is interested on it(yeah I know it isn't the best implementation ever but it is more or less complete (dont support wchar_t or aA gG format specifiers)):
You can see the whole code here: https://github.com/DevilGladiator/XeonOS
Printf:
Code: Select all
/*
* Author: manuel
* E-mail [email protected]
* Created: 2016-08-11 15:44:00
*
* File: printf.c
* Description: Implements the C standard printf function
*/
#include <stdio.h>
#include <stdarg.h>
int printf(const char* __restrict format, ...)
{
int lenght;
va_list parameters;
va_start(parameters, format);
lenght = vfprintf(stdout, format, parameters);
va_end(parameters);
return lenght;
}
Code: Select all
* File: vfprintf.c
* Description: The vfprintf function prints a formatted argument list to a file stream
*/
#include <stdio.h>
#include <stdarg.h>
int vfprintf (FILE * file_stream, const char* __restrict format, va_list arguments)
{
int lenght;
char buffer[1024];
lenght = vsprintf (buffer, format, arguments);
fputs (buffer, file_stream);
return lenght;
}
Code: Select all
int vsprintf(char* buffer, const char* format, va_list arguments)
{
int count = 0, min_width, precission, base, auxiliar;
char format_flags, data_flags, data_specifier[4], print_buffer[200];
bool dot_pressent, uppercase;
size_t i;
memset(buffer, 0, strlen(buffer));
memset(data_specifier, 0, strlen(data_specifier));
for(; *format;)
{
if(*format == '%')
{
format_flags = data_flags = base = auxiliar = dot_pressent = uppercase = 0;
min_width = precission = -1;
memset(print_buffer, 0, 100);
//Advance a char
format++;
for(; *format; format++)
{
if(strchr( "csdioxXufFeEaAgGnp%", *format ) != NULL)
break;
//Set the format modifier flags
if(*format == format_modifiers[0].string[0])
set_flag(format_flags, format_modifiers[0].flag );
else if(*format == format_modifiers[1].string[0])
set_flag(format_flags, format_modifiers[1].flag );
else if(*format == format_modifiers[2].string[0])
set_flag(format_flags, format_modifiers[2].flag );
else if(*format == format_modifiers[3].string[0])
set_flag(format_flags, format_modifiers[3].flag );
else if(*format == format_modifiers[4].string[0])
set_flag(format_flags, format_modifiers[4].flag );
//If a * is found in the format string its value is pulled from the va_list
else if(*format == '*')
{
auxiliar = va_arg(arguments, int);
if(dot_pressent)
precission = auxiliar;
else
min_width = auxiliar;
}
else if(*format == '.')
dot_pressent = true;
else if(isdigit(*format))
{
auxiliar = strtol(format, (char**)&format, 10);
format--;
if(dot_pressent)
precission = auxiliar;
else
min_width = auxiliar;
}
//Search data modifiers
else if(isalpha(*format))
{
for(i = 0; strchr( "hljztL", *format ) != NULL; strncat(data_specifier, format, 1) ,format++ , i++);
format--;
//Check each data modifier to se if there is a match
for(size_t j = 0; j < 8; j++)
if(strncmp(data_specifier, data_modifiers[j].string, i) == 0)
{
set_flag(data_flags, data_modifiers[j].flag);
break;
}
}
else
return -1;
}
//Now process each data type specifier
//Note that wchar_t is not supported
if(*format == 'c')
print_buffer[0] = (char)va_arg(arguments, int);
else if(*format == 's')
{
strcat(print_buffer, va_arg(arguments, char*));
if(precission != -1)
print_buffer[precission - 1] = 0;
adjust_size(print_buffer, format_flags, min_width);
}
else if(strchr("dioxXu", *format) != NULL)
{
switch(*format)
{
case 'd':
case 'i':
base = -10;
break;
case 'o':
base = 8;
break;
case 'x':
base = 16;
break;
case 'X':
base = 16;
uppercase = true;
break;
case 'u':
base = 10;
break;
}
if( data_flags == 0 )
lltoa( va_arg(arguments, int), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[0].flag))
lltoa( (char)va_arg(arguments, int), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[1].flag))
lltoa( (short)va_arg(arguments, int), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[2].flag))
lltoa( va_arg(arguments, long), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[3].flag))
lltoa( va_arg(arguments, long long), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[4].flag))
lltoa( va_arg(arguments, intmax_t), print_buffer, base );
else if( check_flag(data_flags, data_modifiers[5].flag))
lltoa( va_arg(arguments, size_t), print_buffer, base );
else
return -1;
format_number(print_buffer, format_flags, min_width, precission, base, false, uppercase );
}
else if(strchr("fFeE", *format) != NULL)
{
bool exponent_notation = false;
switch(*format)
{
case 'F':
uppercase = true;
case 'E':
uppercase = true;
case 'e':
exponent_notation = true;
break;
default:
break;
}
if(precission == -1)
precission = 6;
if( data_flags == 0 || check_flag(data_flags, data_modifiers[2].flag))
ldtoa( va_arg(arguments, double), print_buffer, precission, exponent_notation);
else if( check_flag(data_flags, data_modifiers[6].flag))
ldtoa( va_arg(arguments, long double), print_buffer, precission, exponent_notation);
format_number(print_buffer, format_flags, min_width, precission, base, (*format != 'e' || *format != 'E'), uppercase );
}
else if(*format == 'p')
{
lltoa( (int)va_arg(arguments, void*), print_buffer, 16);
format_number(print_buffer, format_flags, min_width, precission, base, false, uppercase );
}
strcat(buffer, print_buffer);
count += strlen(print_buffer);
buffer = buffer + strlen(print_buffer);
format++;
}
else
{
*buffer = *format;
buffer++;
count++;
format++;
}
//Make sure that the string ends with a null char
*buffer = 0;
}
return count;
}
void format_number(char* buffer, short format_flags, int min_width, int precission, int base , bool integer ,bool uppercase)
{
//Append a + if the number is positive
if( check_flag(format_flags, format_modifiers[1].flag) && !has_sign_char(buffer) )
pad_with(buffer, '+', 1, true);
//Pad with zeros if not enough precission
if( precission > (int)strlen(buffer) )
{
char padding = precission - strlen(buffer);
if(has_sign_char(buffer))
pad_with(buffer + 1, '0', ++padding, integer);
else
pad_with(buffer , '0', padding, integer);
}
//Append a blank space at the beggining of the string if there is no sign characeter
if( check_flag(format_flags, format_modifiers[2].flag) && !has_sign_char(buffer))
pad_with(buffer, ' ', 1, true);
//Adjust the size of the string
adjust_size(buffer, format_flags, min_width);
//Append a 0x to the bool numbers
if( check_flag(format_flags, format_modifiers[3].flag) && base == 16 )
{
pad_with(buffer, 'x', 1, true);
pad_with(buffer, '0', 1, true);
}
if(uppercase)
string_to_upper(buffer);
}
Code: Select all
int fputs(const char* string, FILE* stream)
{
return fwrite( (void*)string, strlen(string), 1, stream);
}
Code: Select all
* Description: Implementation of the fwrite finction that puts data on a stream
*/
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <syscall.h>
/*
* Buffer, the data to write
* size, the number of bytes in each record
* number, the number of record to write
* stram, the output stream
*/
size_t fwrite(const void* buffer, size_t size, size_t number, FILE * stream)
{
size_t left, written, return_value;
if(!size || !number )
return 0;
left = size*number;
written = 0;
while(left > 0 && !stream->error)
{
return_value = write(stream->file_descriptor, buffer + written, left);
if( return_value <= 0)
stream->error = 1;
else
{
left -= return_value;
written += return_value;
}
}
return (written / size);
}
Code: Select all
*
* Description: System call to write to a file descriptor
*/
#include <stdio.h>
#include <syscall.h>
#include <stddef.h>
#if defined (__is_xeonos_kernel)
#include <kernel/tty.h>
#endif
size_t write(int file_descriptor, void* buffer, size_t number)
{
if( file_descriptor < 0 || number == 0 || buffer == NULL)
return 0;
#if defined (__is_xeonos_kernel)
if(file_descriptor == 1)
{
terminal_write((char*)buffer, number);
return number;
}
#else
#endif
return -1;
}
-
- Posts: 11
- Joined: Mon Aug 08, 2016 7:56 am
Re: Extrange function behaviour (optimizer, i guess)
That was it! I forgot to append a new line char to the puts function, thank you!Octocontrabass wrote:GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
Re: Extrange function behaviour (optimizer, i guess)
You could have saved a lot of time by reading Octocontrabass' reply. He knows these things. The bug is in your puts().
You have
You needor equivalent
EDIT: Today is the day for race conditions
You have
Code: Select all
#include <stdio.h>
int puts(const char* string)
{
return fputs(string, stdout);
}
Code: Select all
include <stdio.h>
int puts(const char *s)
{
return fprintf(stdout, "%s\n", s);
}
EDIT: Today is the day for race conditions
If a trainstation is where trains stop, what is a workstation ?
-
- Member
- Posts: 45
- Joined: Wed Dec 25, 2013 11:51 am
Re: Extrange function behaviour (optimizer, i guess)
I have some C library functions implemented, but not all. One way I avoid this "problem" is using the flag -fno-builtin or -fno-builtin-<function>Octocontrabass wrote:GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
Machina - https://github.com/brunexgeek/machina