Extrange function behaviour (optimizer, i guess)

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
Post Reply
DevilGladiator
Posts: 11
Joined: Mon Aug 08, 2016 7:56 am

Extrange function behaviour (optimizer, i guess)

Post by DevilGladiator »

Hi, these days I have been implementing a printf function, however sometimes it does strange things:

When i do something like:
printf("This year is 2016\n");
printf("Success!!");

It prints this:
This year is 2016Success!!

But if i do it using the printf return value it works fine:
int rv = printf("This year is 2016\n");
printf("Success!!");
It prints this:
This year is 2016
Success!!

Any idea of why is this happening?
User avatar
Ch4ozz
Member
Member
Posts: 170
Joined: Mon Jul 18, 2016 2:46 pm
Libera.chat IRC: esi

Re: Extrange function behaviour (optimizer, i guess)

Post by Ch4ozz »

Open your binary in a disassembler and start reading the assembly?
Compile with -O0 just to see if its really the compiler
Octocontrabass
Member
Member
Posts: 5587
Joined: Mon Mar 25, 2013 7:01 pm

Re: Extrange function behaviour (optimizer, i guess)

Post by Octocontrabass »

GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
User avatar
Schol-R-LEA
Member
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Extrange function behaviour (optimizer, i guess)

Post by Schol-R-LEA »

@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string. However, this may still be a useful clue as to the cause, if the printf() implementation in question treats medial newlines differently from trailing newlines.

@DevilGladiator: You should be able to come up with a few test cases for your printf() implementation given what octocontrabass just said; see if you can pin down the behavior of the function given different strings, with different escape codes in different positions. You might want to post the code - or better still, a link to your remote code repo (you do have your code under remote version control, at a host such as GitHub or CloudForge, don't you? If not, get it under it ASAP - seriously, drop everything and do it now, every minute you wait is a chance that you'll lose code).

Also, it sounds as if you are implementing the printf() as kernel code, and as a single piece, writing to the text buffer directly. This is a Bad Idea, for a number of reasons; while it isn't catastrophic in the early part of kernel design, it will come to haunt you later. My recommendations on this are a) rename all the functions to something like kprintf()/kputs() or printk()/putk(), so that it is immediately clear that they are the kernel implementations, and b) separate the string-formatting code and the text-display code into an ksprintf() and kputs(), respectively, and have the kprintf() function call those. It is a common mistake we've seen often enough here, so you shouldn't beat yourself up over it, but you do want to learn from our mistakes when you can.

kputs() in turn should call a kputchar() when it gets a string with just a few characters, but with larger strings should format them to a local buffer (since the standard text buffer consists of character-attribute pairs) and using a memory-to-memory move. It doesn't have to be a full double buffer, though that may be advisable.

On a related note, take into consideration the likely need for buffered output, and the possibility that you will have to redirect output to different text pages or even to something other than the console (e.g., a log file). Even in the kernel, you want to separate as many operations as you feasibly can to make it easier to compose them in multiple ways later, and only combining them later if absolutely necessary for performance reasons.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Octocontrabass
Member
Member
Posts: 5587
Joined: Mon Mar 25, 2013 7:01 pm

Re: Extrange function behaviour (optimizer, i guess)

Post by Octocontrabass »

Schol-R-LEA wrote:@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string.
And if it's replacing one of the printf() calls with puts(), then the real problem is puts() not inserting a newline like it's supposed to. :wink:
User avatar
Schol-R-LEA
Member
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Extrange function behaviour (optimizer, i guess)

Post by Schol-R-LEA »

Just so.

To clarify on the last part of my previous post: what I should have said more explicitly is that even the kputs() probably shouldn't be writing to the text buffer directly, but should instead prepare the text buffers and then perform a case analysis (i.e., a switch()) on which low-level text output function it should call. You may want to short-circuit this for some cases later, but for now having a low-level show_text() which takes a prepared buffer and a struct representing the current text page and it's state (# of rows, # of columns, background color, text attribute at cursor, text attribute mode, cursor attributes, cursor x and y positions, overflow mode, and so forth) would be a good way to separate those concerns, though the separation would not be total, given the nature of these things.

One more thing: this is all regarding the kernel level interface with the hardware, and the show_text() function will probably become the interface between the kernel and a console driver eventually. The kernel system calls should not expose the kprintf(), etc. functions, but should expose a front-end to that driver interface. The userland libraries will use those, rather than the kernel-level formatting functions, for their own formatted I/O functions.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
DevilGladiator
Posts: 11
Joined: Mon Aug 08, 2016 7:56 am

Re: Extrange function behaviour (optimizer, i guess)

Post by DevilGladiator »

Schol-R-LEA wrote:@octocontrabass: That may be, but that in itself doesn't explain the difference in behavior; if the compiler is merging the printf() calls' strings, then the newline should still be there, just in the middle of the string. However, this may still be a useful clue as to the cause, if the printf() implementation in question treats medial newlines differently from trailing newlines.

@DevilGladiator: You should be able to come up with a few test cases for your printf() implementation given what octocontrabass just said; see if you can pin down the behavior of the function given different strings, with different escape codes in different positions. You might want to post the code - or better still, a link to your remote code repo (you do have your code under remote version control, at a host such as GitHub or CloudForge, don't you? If not, get it under it ASAP - seriously, drop everything and do it now, every minute you wait is a chance that you'll lose code).

Also, it sounds as if you are implementing the printf() as kernel code, and as a single piece, writing to the text buffer directly. This is a Bad Idea, for a number of reasons; while it isn't catastrophic in the early part of kernel design, it will come to haunt you later. My recommendations on this are a) rename all the functions to something like kprintf()/kputs() or printk()/putk(), so that it is immediately clear that they are the kernel implementations, and b) separate the string-formatting code and the text-display code into an ksprintf() and kputs(), respectively, and have the kprintf() function call those. It is a common mistake we've seen often enough here, so you shouldn't beat yourself up over it, but you do want to learn from our mistakes when you can.

kputs() in turn should call a kputchar() when it gets a string with just a few characters, but with larger strings should format them to a local buffer (since the standard text buffer consists of character-attribute pairs) and using a memory-to-memory move. It doesn't have to be a full double buffer, though that may be advisable.

On a related note, take into consideration the likely need for buffered output, and the possibility that you will have to redirect output to different text pages or even to something other than the console (e.g., a log file). Even in the kernel, you want to separate as many operations as you feasibly can to make it easier to compose them in multiple ways later, and only combining them later if absolutely necessary for performance reasons.
Well at the moment i have implemented all the printf -> vprintf -> vfprintf -> fputs -> write sequence, but I don't have real system calls or a virtual file system so when i call write with the file descriptor 2 it justs prints the content of the string, on the other hand I keep using the same structure of the meaty skeleton tutorial (the libc generates a libk for the kernel and a libcc for user mode)

Anyways here is my printf code in case someone is interested on it(yeah I know it isn't the best implementation ever but it is more or less complete (dont support wchar_t or aA gG format specifiers)):
You can see the whole code here: https://github.com/DevilGladiator/XeonOS

Printf:

Code: Select all

/*
* Author: manuel
* E-mail [email protected]
* Created: 2016-08-11 15:44:00
* 
* File: printf.c
* Description: Implements the C standard printf function
*/

#include <stdio.h>
#include <stdarg.h>


int printf(const char* __restrict format, ...)
{
	int lenght;
	va_list parameters;
	va_start(parameters, format);

	lenght = vfprintf(stdout, format, parameters);

	va_end(parameters);

	return lenght;
}
vfprintf.c

Code: Select all

* File: vfprintf.c
* Description: The vfprintf function prints a formatted argument list to a file stream
*/
#include <stdio.h>
#include <stdarg.h>

int vfprintf (FILE * file_stream, const char*  __restrict format, va_list arguments)
{
	int  lenght;
	char buffer[1024];

	lenght = vsprintf (buffer, format, arguments);
	fputs (buffer, file_stream);

	return lenght;
}

Code: Select all

int	vsprintf(char* buffer, const char* format, va_list arguments)
{
	int count = 0, min_width, precission, base, auxiliar;
	char format_flags, data_flags, data_specifier[4], print_buffer[200];
	bool dot_pressent, uppercase;
	size_t i;
	
	memset(buffer, 0, strlen(buffer));
	memset(data_specifier, 0, strlen(data_specifier));
	
	for(; *format;)
	{
		if(*format == '%')
		{
			format_flags = data_flags = base = auxiliar = dot_pressent = uppercase = 0;
			min_width = precission = -1;
			memset(print_buffer, 0, 100);
			
			//Advance a char
			format++;
			
			for(; *format; format++)
			{
				if(strchr( "csdioxXufFeEaAgGnp%", *format ) != NULL)
					break;
				
				//Set the format modifier flags
				if(*format == format_modifiers[0].string[0])
					set_flag(format_flags, format_modifiers[0].flag );
				else if(*format == format_modifiers[1].string[0])
					set_flag(format_flags, format_modifiers[1].flag );
				else if(*format == format_modifiers[2].string[0])
					set_flag(format_flags, format_modifiers[2].flag );
				else if(*format == format_modifiers[3].string[0])
					set_flag(format_flags, format_modifiers[3].flag );
				else if(*format == format_modifiers[4].string[0])
					set_flag(format_flags, format_modifiers[4].flag );
				
				//If a * is found in the format string its value is pulled from the va_list
				else if(*format == '*')
				{
					auxiliar = va_arg(arguments, int);
					if(dot_pressent)
						precission = auxiliar;
					else
						min_width = auxiliar;
				}
				else if(*format == '.')
					dot_pressent = true;
				else if(isdigit(*format))
				{
					auxiliar = strtol(format, (char**)&format, 10);
					format--;
					if(dot_pressent)
						precission = auxiliar;
					else
						min_width = auxiliar;
				}
				
				//Search data modifiers
				else if(isalpha(*format))
				{
					for(i = 0; strchr( "hljztL", *format ) != NULL; strncat(data_specifier, format, 1) ,format++ , i++);
					format--;

					//Check each data modifier to se if there is a match
					for(size_t j = 0; j < 8; j++)
						if(strncmp(data_specifier, data_modifiers[j].string, i) == 0)
						{
							set_flag(data_flags, data_modifiers[j].flag);
							break;
						}

				}
				else
					return -1;
			}
			
			//Now process each data type specifier
			//Note that wchar_t is not supported
			
			if(*format == 'c')
				print_buffer[0] = (char)va_arg(arguments, int);
			
			else if(*format == 's')
			{
				strcat(print_buffer, va_arg(arguments, char*));
				if(precission != -1)
					print_buffer[precission - 1] = 0;
					
				adjust_size(print_buffer, format_flags, min_width);
			}
			
			else if(strchr("dioxXu", *format) != NULL)
			{
				switch(*format)
				{

					case 'd':
					case 'i':
						base = -10;
						break;
					case 'o':
						base = 8;
						break;
					case 'x':
						base = 16;
						break;
					case 'X':
						base = 16;
						uppercase = true;
						break;
					case 'u':
						base = 10;
						break;
				}
				
				if( data_flags == 0 )
					lltoa( va_arg(arguments, int), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[0].flag))
					lltoa( (char)va_arg(arguments, int), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[1].flag))
					lltoa( (short)va_arg(arguments, int), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[2].flag))
					lltoa( va_arg(arguments, long), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[3].flag))
					lltoa( va_arg(arguments, long long), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[4].flag))
					lltoa( va_arg(arguments, intmax_t), print_buffer, base );
				else if( check_flag(data_flags, data_modifiers[5].flag))
					lltoa( va_arg(arguments, size_t), print_buffer, base );
				else
					return -1;

				format_number(print_buffer, format_flags, min_width, precission, base, false, uppercase );

			}
			
			else if(strchr("fFeE", *format) != NULL)
			{
				bool exponent_notation = false;
				switch(*format)
				{
					case 'F':
						uppercase = true;
					case 'E':
						uppercase = true;
					case 'e':
						exponent_notation = true;
						break;
					default:
						break;
				}
				
				if(precission == -1)
					precission = 6;
				
				if( data_flags == 0 || check_flag(data_flags, data_modifiers[2].flag))
					ldtoa( va_arg(arguments, double), print_buffer, precission, exponent_notation);
				else if( check_flag(data_flags, data_modifiers[6].flag))
					ldtoa( va_arg(arguments, long double), print_buffer, precission, exponent_notation);
					
				format_number(print_buffer, format_flags, min_width, precission, base, (*format != 'e' || *format != 'E'), uppercase );


			}
			
			else if(*format == 'p')
			{
				lltoa( (int)va_arg(arguments, void*), print_buffer, 16);
				format_number(print_buffer, format_flags, min_width, precission, base, false, uppercase );
			}
			
			strcat(buffer, print_buffer);
			count += strlen(print_buffer);
			buffer = buffer + strlen(print_buffer);
			format++;
		}
		else
		{
			*buffer = *format;
			buffer++;
			count++;
			format++;
		}
		//Make sure that the string ends with a null char
		*buffer = 0;
	}

	return count;
}

void format_number(char* buffer, short format_flags, int min_width, int precission, int base , bool integer ,bool uppercase)
{
	//Append a + if the number is positive
	if( check_flag(format_flags, format_modifiers[1].flag) && !has_sign_char(buffer) )
		pad_with(buffer, '+', 1, true);
	

	//Pad with zeros if not enough precission
	if( precission > (int)strlen(buffer) )
	{
		char padding = precission - strlen(buffer);
		if(has_sign_char(buffer))
			pad_with(buffer + 1, '0', ++padding, integer);
		else
			pad_with(buffer , '0', padding, integer);
	}
	
	//Append a blank space at the beggining of the string if there is no sign characeter
	if( check_flag(format_flags, format_modifiers[2].flag) && !has_sign_char(buffer))
		pad_with(buffer, ' ', 1, true);
	
	//Adjust the size of the string
	adjust_size(buffer, format_flags, min_width);
	
	//Append a 0x to the bool numbers
	if( check_flag(format_flags, format_modifiers[3].flag) && base == 16 )
	{
		pad_with(buffer, 'x', 1, true);
		pad_with(buffer, '0', 1, true);
	}
	
	if(uppercase)
		string_to_upper(buffer);
}
fwrite

Code: Select all

int fputs(const char* string, FILE* stream)
{
    return fwrite( (void*)string, strlen(string), 1, stream);
}

Code: Select all

 * Description: Implementation of the fwrite finction that puts data on a stream
 */

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <syscall.h>

/*
* Buffer, the data to write
* size, the number of bytes in each record
* number, the number of record to write
* stram, the output stream
*/
size_t fwrite(const void* buffer, size_t size, size_t number, FILE * stream)
{
    size_t left, written, return_value;

    if(!size || !number )
        return 0;
    
    left = size*number;
    written = 0;

    while(left > 0 && !stream->error)
    {
        return_value = write(stream->file_descriptor, buffer + written, left);

        if( return_value <= 0)
            stream->error = 1;
        else
        {
            left -= return_value;
            written += return_value;
        }

    }

    return (written / size);
}

Code: Select all

 * 
 * Description: System call to write to a file descriptor
 */

#include <stdio.h>
#include <syscall.h>
#include <stddef.h>

#if defined (__is_xeonos_kernel)
#include <kernel/tty.h>
#endif

size_t write(int file_descriptor, void* buffer, size_t number)
{
    if( file_descriptor < 0 || number == 0 || buffer == NULL)
        return 0;
#if defined (__is_xeonos_kernel)
    if(file_descriptor == 1)
    {
        terminal_write((char*)buffer, number);
        return number;
    }
#else
#endif

    return -1;
}
DevilGladiator
Posts: 11
Joined: Mon Aug 08, 2016 7:56 am

Re: Extrange function behaviour (optimizer, i guess)

Post by DevilGladiator »

Octocontrabass wrote:GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
That was it! I forgot to append a new line char to the puts function, thank you!
gerryg400
Member
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Extrange function behaviour (optimizer, i guess)

Post by gerryg400 »

You could have saved a lot of time by reading Octocontrabass' reply. He knows these things. The bug is in your puts().

You have

Code: Select all

#include <stdio.h>

int	puts(const char* string)
{
    return fputs(string, stdout);
}
You need

Code: Select all

include <stdio.h>

int puts(const char *s)
{
    return fprintf(stdout, "%s\n", s);
}
or equivalent


EDIT: Today is the day for race conditions
If a trainstation is where trains stop, what is a workstation ?
brunexgeek
Member
Member
Posts: 45
Joined: Wed Dec 25, 2013 11:51 am

Re: Extrange function behaviour (optimizer, i guess)

Post by brunexgeek »

Octocontrabass wrote:GCC understands standard C library functions and automatically substitutes them with simpler code (including entirely different functions!) when possible. Multiple printf() calls may be combined into a single one, printf() calls with a constant string ending in a newline might be replaced with puts(), and so on.
I have some C library functions implemented, but not all. One way I avoid this "problem" is using the flag -fno-builtin or -fno-builtin-<function>

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
Post Reply