Framebuffer: Draw a PSF character

Question about which tools to use, bugs, the best way to implement a function, etc should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
User avatar
finarfin
Member
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland
Contact:

Framebuffer: Draw a PSF character

Post by finarfin »

Ok after all the good advice i got from my previous questions about framebuffer (thanks to whoeve replied), i'm nearly there with drawing a character.

At least i hope. So i'm following the tutorial on the wiki: https://wiki.osdev.org/PC_Screen_Font so far i have done:
  • Converted a PSF into a binary file and linked it in the kernel
  • Got access to the PSF data from the kernel
  • Implemented the function to draw the character on the screen
For the first two points everything is fine (i suppose), i can read the data in the PSF structure and get all the needed information from the PSF header, but the function to display the character is not working properly, apparently it display half of the character.

Here a screenshot:
screnshot.png
screnshot.png (1017 Bytes) Viewed 5382 times
On the left side what my kernel is displaying on the right side the character i'm tring to display.

The functtion is more or less the same in the wiki page (spent lot of time understanding what it does, was not a mere copy and paste :mrgreen: )

Code: Select all

extern char _binary_fonts_default_psf_size;
extern char _binary_fonts_default_psf_start;
extern char _binary_fonts_default_psf_end;
extern uint32_t FRAMEBUFFER_PITCH;
extern void *FRAMEBUFFER_MEM;

void _fb_putchar(unsigned short int symbol, int cx, int cy, uint32_t fg, uint32_t bg){
    _printStr("Inside fb_putchar");
    char *framebuffer = (char *) FRAMEBUFFER_MEM;
    PSF_font *default_font = (PSF_font*)&_binary_fonts_default_psf_start;
    uint32_t pitch = FRAMEBUFFER_PITCH;
    unsigned char *glyph = (unsigned char *)&_binary_fonts_default_psf_start + 
        default_font->headersize + (symbol>0&&symbol<default_font->numglyph?symbol:0) * default_font->bytesperglyph;
    int bytesperline =  (default_font->width + 7)/8;
    int offset = (cy * default_font->height * pitch) + 
        (cx * (default_font->width+1) * 4);
    
    int x, y, line, mask;

    for(y=0; y<default_font->height; y++){
        line = offset;
        mask = 1 << (default_font->width - 1);
        for(x=0; x<default_font->width; x++){
            *((uint32_t*) (framebuffer + line)) = ((int) *glyph) & mask ? fg : bg;
            mask >>= 1;
            line +=4;
        }
        glyph += bytesperline;
        offset +=pitch;
    }
}
The main difference is that i used the pitch from grub instead of scanline (that from what i understood should be the same thing)
The font i used is: Lat2-VGA28x16.psf (28x16)

This is the information that i read from the psf header (all values are HEX):

Code: Select all

Magic: 864AB572
Number of glyphs: 100
Header size: 20
Bytes per glyphs: 38
Flags: 1
Version: 0
Width: 10
Height: 1C
Adn the framebuffer header information are:

Code: Select all

Found Multiboot framebuffer: 8
---framebuffer-type: 1
---framebuffer-width: 400
---framebuffer-height: 300
---framebuffer-address: FD000000
---framebuffer-bpp: 20
---framebuffer-pitch: 1000
I tried to investigate but haven't figured out what the real issue is, and what i'm doing wrong. I'm wondering if that code works only with some specific sizes of font. Or i don't know what i'm doing wrong.

Can someone help me?

Btw i'm not sure if this is a typo in the code, or not but this line in the tutorial:

Code: Select all

            extern char *fb;
                         ......
            *((uint32_t*)(&fb + line)) = ((int)*glyph) & (mask) ? fg : bg;
if i use it as it is it doesn't display anything. It has to be:

Code: Select all

 *((uint32_t*)(fb + line)) = ((int)*glyph) & (mask) ? fg : bg;
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Framebuffer: Draw a PSF character

Post by Octocontrabass »

All of those pointer casts are just begging for undefined behavior...

Anyway, this looks like the problem:

Code: Select all

((int) *glyph)
You're extracting a single char from the glyph array and then casting to int. Since char is only 8 bits, you only get 8 pixels. You probably meant to cast to pointer to int and then dereference, but that's a bad idea because incorrectly aligned pointers are undefined behavior.
User avatar
finarfin
Member
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland
Contact:

Re: Framebuffer: Draw a PSF character

Post by finarfin »

So the reason the tutorial is using a char i suppose is that it cares only about a single bit everytime, and since the length of a row is variable, it can't foresee what is the correct size.

So let's try to see if i understood the flow correctly:
  • The size of one glyph in pixel is given in the PSF structure (width and height)
  • Now every pixel has a certain number of bits, and this is given by the framebuffer bpp.
  • What i need to know is: how many pixels is one line, and this is the framebuffer pitch (or screenwidth * bpp), in this way once i rendere one line i can go directly one line below and start to rendere the next glyph row
  • And according to the function in the tutorial i need to know how many bytes is one line. And this is one of the thing that i'm struggling to understand. Since i know that one pixel size is given by BPP, i know also that the glyph widht is given by my_font->width, so my understanding is that the bytesperline value should be:

    Code: Select all

    font->width/bpp
    but in the tutorial it does:

    Code: Select all

    (font->width + 7)/8 
    and there are two things not totally clear the first is why the + 7? (is he trying to round to the nearest byte, but in this case shouldn't have been + 7 & ~7) and why /8?
  • And i need to get the glyph, in theory getting it should be straightforward it is given by: &_binary_fonts_default_psf_start + default_font->headersize + (symbol * default_font->bytesperglyph); This give me the location of the first row of the glyph number symbol
  • And i need the cursor starting position on the screen, that is given in "characters", so it has to be translated and what i would have done is:

    Code: Select all

    offset = (cy * (default_font->height * PITCH)) + (cx * (default_font->width) * (bpp/8)
    the tutorial does:

    Code: Select all

    int offs =  (cy * font->height * scanline) +(cx * (font->width+1) * 4);
    I think that the tutorial code assume that one pixel is 4 bytes (and with my bpp that is 32 i get the same result so is fine), not sure about the +1, maybe to add some space between characters?
  • The mask is pretty strightforward for every row iteration i start from the first position, and shift it one to the right for each column iteration, do the and and display or the foreground color if the result of the and is true or the bg color if false
  • Line for every new row starts for the current offset value, then while iterating the column it increases of 4 (still i think it assumes that every pixel is 4 bytes long),
  • once the column cycle is finished, offset and glyph needs to be updated, for the glyph i just need to add bytesperline for the offset in teory i just needo to increase it by the pitch value
So now said that, my understanding is that the size of one glyph row can be anything (in my case is 16) but if the font was 8x16 it whould have been 8..
And maybe this is why it use a char, for the font i'm using now the size is 16x28 so one row is 2 bytes, i suppose for the purpose of my test then i can use a uint16_t variable. I changed my code this way:

Code: Select all

uint16_t *glyph = (uint16_t *)&_binary_fonts_default_psf_start + 
        default_font->headersize + (symbol>0&&symbol<default_font->numglyph?symbol:0) * default_font->bytesperglyph;
// And then below to draw the pixel:
       *((uint32_t*) (framebuffer + line)) = ((uint16_t) *glyph) & mask ? fg : bg;
I got a different result this time, but still not what i want.
Screenshot_2021-03-12_10-02-14.png
Screenshot_2021-03-12_10-02-14.png (215 Bytes) Viewed 5340 times
As i said i was trying to follow the wiki tutorial, what should be a better way of doing it avoiding all the dereferencing?
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
nexos
Member
Member
Posts: 1081
Joined: Tue Feb 18, 2020 3:29 pm
Libera.chat IRC: nexos

Re: Framebuffer: Draw a PSF character

Post by nexos »

Octocontrabass wrote:All of those pointer casts are just begging for undefined behavior...
Agreed. I personally try to limit how much pointer magic I do, as pointers can go south very quickly.
Anyway, I have used that code before. It works for me. I do think that it should have a rewrite, as it currently isn't very verbose for newcomers.
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg
User avatar
finarfin
Member
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland
Contact:

Re: Framebuffer: Draw a PSF character

Post by finarfin »

nexos wrote:
Octocontrabass wrote:All of those pointer casts are just begging for undefined behavior...
Agreed. I personally try to limit how much pointer magic I do, as pointers can go south very quickly.
Anyway, I have used that code before. It works for me. I do think that it should have a rewrite, as it currently isn't very verbose for newcomers.
Do you remember what font do you used? because i think that the problems could be related to the font size. (i tried to change fonts but every time i find a psf1 font that i'm not supporting yet)

Yeah that tutorial is not much verbose i agree!!

Btw i did several tests, and what i can say is that it is displaying exactly 50% of the character, and also the first half of the glyph (the part that is displayed) is palced where the right part should be. This screenshot should show the problem more clearly.
Screenshot_2021-03-12_19-45-41.png
Screenshot_2021-03-12_19-45-41.png (563 Bytes) Viewed 5272 times
So i think is definetely a problem of calculate the right values. The font i'm using is 16x28, and i think usually other fonts are 8x16 os something similar? What do you think?
Last edited by finarfin on Fri Mar 12, 2021 1:49 pm, edited 1 time in total.
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Framebuffer: Draw a PSF character

Post by Octocontrabass »

finarfin wrote:Now every pixel has a certain number of bits, and this is given by the framebuffer bpp.
Every pixel in the framebuffer has that many bits. Every pixel in the font is one bit. You're converting the glyph from 1bpp to the framebuffer bpp.
finarfin wrote:and there are two things not totally clear the first is why the + 7? (is he trying to round to the nearest byte, but in this case shouldn't have been + 7 & ~7) and why /8?
The font is 1bpp, so 8 pixels will fit in a single byte. To figure out how many bytes are in each row, you divide the width by 8, but you need to round up since each row is padded to the nearest byte.
finarfin wrote:not sure about the +1, maybe to add some space between characters?
Probably.
finarfin wrote:And maybe this is why it use a char, for the font i'm using now the size is 16x28 so one row is 2 bytes, i suppose for the purpose of my test then i can use a uint16_t variable.
If you do it that way, the bits will be in the wrong order. Keep using unsigned char (or uint8_t) and try something like this:

Code: Select all

    for(y=0; y<default_font->height; y++){
        line = offset;
        for(x=0; x<default_font->width; x++){
            *((uint32_t*) (framebuffer + line)) = glyph[x/8] & (0x80 >> (x & 7)) ? fg : bg;
            line +=4;
        }
        glyph += bytesperline;
        offset +=pitch;
    }
finarfin wrote:As i said i was trying to follow the wiki tutorial, what should be a better way of doing it avoiding all the dereferencing?
Dereferencing is fine. Casting pointers from one type to another is what you should be avoiding. Create your variables with the appropriate type to begin with. (For example, declare "framebuffer" as a pointer to uint32_t if appropriate, or declare it as a pointer to uint8_t and write each byte separately.)
User avatar
finarfin
Member
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland
Contact:

Re: Framebuffer: Draw a PSF character

Post by finarfin »

Octocontrabass wrote: If you do it that way, the bits will be in the wrong order. Keep using unsigned char (or uint8_t) and try something like this:

Code: Select all

    for(y=0; y<default_font->height; y++){
        line = offset;
        for(x=0; x<default_font->width; x++){
            *((uint32_t*) (framebuffer + line)) = glyph[x/8] & (0x80 >> (x & 7)) ? fg : bg;
            line +=4;
        }
        glyph += bytesperline;
        offset +=pitch;
    }
It worked! Thank you very much!! :) And thanks for all the detailed answers!
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
User avatar
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: Framebuffer: Draw a PSF character

Post by bzt »

finarfin wrote:I tried to investigate but haven't figured out what the real issue is, and what i'm doing wrong. I'm wondering if that code works only with some specific sizes of font. Or i don't know what i'm doing wrong.
Nope, it should work with widths up to 32 bit. But it had a typo in it, which is now fixed.
Octocontrabass wrote:All of those pointer casts are just begging for undefined behavior...
All of those pointer casts are very much necessary, because pitch is in bytes and not in pixels, plus this way modifying the code to hicolor framebuffer is a simple matter of changing the cast in ONE place. Not to mention you don't know if font rows needs to be accessed as byte, word or dword. Pointer casting is the most efficient and effective way to handle that.
finarfin wrote:Btw i'm not sure if this is a typo in the code, or not but this line in the tutorial:
Depends how you define the address. If it's a static address from the linker for example, then you need &fb. If it's a pointer, then no need for &.
finarfin wrote:I think that the tutorial code assume that one pixel is 4 bytes (and with my bpp that is 32 i get the same result so is fine), not sure about the +1, maybe to add some space between characters?
Yes, it assumes 4 bytes as it's written for ARGB 8+8+8+8 framebuffer format. And yes, that +1 needs to be added for separating the characters. Without Russian cyrillic letters became unreadable. BTW, this is exactly what the original VGA hardware did. It used 8x16 fonts, but actually displayed those as 9x16. The 9th coloumn was cleared, except for the box drawing characters which copied the 8th coloumn.
Octocontrabass wrote:If you do it that way, the bits will be in the wrong order. Keep using unsigned char (or uint8_t) and try something like this:
Yeah works, but extremely inefficient. The original code had a single shift instruction. Now you have shift, a logical and, and even a division for each and every pixel!
Octocontrabass wrote:Casting pointers from one type to another is what you should be avoiding.
Why? Because they allow you to write efficient and compact code?

I've changed the code in the following:
1. since the fb is defined as extern char *fb, I've removed &
2. I've added a #define PIXEL uint32_t
3. I've replaced 4 with sizeof(PIXEL) in the offs calculation
4. I've fixed the pointer cast (actually it should have been a pointer cast, but it wasn't, it was a value cast, that was the typo causing trouble for @finarfin)

Now adapt the code to other video modes (like hi-color), one would only need to change the PIXEL define.

Cheers,
bzt
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Framebuffer: Draw a PSF character

Post by Octocontrabass »

bzt wrote:Pointer casting is the most efficient and effective way to handle that.
Except that C requires correct pointer alignment and forbids aliasing. A misaligned or aliased pointer is undefined behavior.
bzt wrote:
Octocontrabass wrote:Casting pointers from one type to another is what you should be avoiding.
Why? Because they allow you to write efficient and compact code?
Because C requires correct pointer alignment and forbids aliasing, and it's very difficult to guarantee your code meets those requirements when using pointer casts.
bzt wrote:4. I've fixed the pointer cast (actually it should have been a pointer cast, but it wasn't, it was a value cast, that was the typo causing trouble for @finarfin)
The cast to unsigned int* sometimes results in a misaligned pointer, so the behavior is undefined. Also, the bits are still in the wrong order, so if your compiler decides to allow the misaligned pointer, it still won't work with fonts wider than 8 pixels.
User avatar
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: Framebuffer: Draw a PSF character

Post by bzt »

Octocontrabass wrote:Because C requires correct pointer alignment and forbids aliasing
That ain't true. With "-ansi -Wall -Wextra -pendantic" all compilers allow both unaligned pointers and pointer casts without any warning. No "requires" neither "forbidding" of any kind. And the x86 is perfectly capable of reading from unaligned addresses, so the compiled code is valid and works.
Octocontrabass wrote:and it's very difficult to guarantee your code meets those requirements when using pointer casts.
And how do you implement memory access efficiently when you only know the actual size in run-time?
Octocontrabass wrote:The cast to unsigned int* sometimes results in a misaligned pointer, so the behavior is undefined.
Misaligned pointer is no UB on x86. Portable? No. Does it need to be portable? That's a no again. (Just for the records, besides of x86 it works on ARM too.)
Octocontrabass wrote:Also, the bits are still in the wrong order
If that were true, then glyphs would be mirrored even with 8 pixel wide fonts. But to change the bit order, one would only need to do:

Code: Select all

mask = 1;
...
mask <<= 1;
I guess you might wanted to write byte order instead of bit order perhaps? I'm not so sure about that either, but at least there's a chance that that could be true.

Cheers,
bzt
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Framebuffer: Draw a PSF character

Post by Octocontrabass »

bzt wrote:That ain't true. With "-ansi -Wall -Wextra -pendantic" all compilers allow both unaligned pointers and pointer casts without any warning. No "requires" neither "forbidding" of any kind.
A feature to report any failure to conform to ISO C might be useful in some instances, but would require considerable additional work and would be quite different from -Wpedantic. We don’t have plans to support such a feature in the near future.
In other words, you can't rely on compiler warnings to verify your code.
bzt wrote:And the x86 is perfectly capable of reading from unaligned addresses, so the compiled code is valid and works.
Even if we assume the target is x86 (which it may not be), there are some x86 instructions that can't read from unaligned addresses, so the compiled code may not be valid and may not work.
bzt wrote:And how do you implement memory access efficiently when you only know the actual size in run-time?
Add three bytes of padding to the end of the font array, then unconditionally read four bytes and shuffle them into a uint32_t variable in the correct order. Clang is smart enough to optimize this into BSWAP or MOVBE depending on what's available. (GCC needs a bit of work.) Then, go back to using a mask variable with a single shift per loop iteration.
bzt wrote:Misaligned pointer is no UB on x86.
That's irrelevant. You're not writing x86 code, you're writing C code. It's undefined behavior in C, so the C compiler can do whatever it wants. (Refer to section 6.3.2.3 paragraph 7 in this draft of the C17 standard.) Maybe the next time you update your C compiler it will detect the misaligned pointer and emit a UD2 instruction.
User avatar
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: Framebuffer: Draw a PSF character

Post by bzt »

Octocontrabass wrote:
bzt wrote:That ain't true. With "-ansi -Wall -Wextra -pendantic" all compilers allow both unaligned pointers and pointer casts without any warning. No "requires" neither "forbidding" of any kind.
A feature to report any failure to conform to ISO C might be useful in some instances, but would require considerable additional work and would be quite different from -Wpedantic. We don’t have plans to support such a feature in the near future.
In other words, you can't rely on compiler warnings to verify your code.
In other words, the compiler does not "require" nor "forbid" anything. Just as I said. You're free to use pointer casting.

Here's another example. Zlib, which has been compiled on an armada of platforms for god knows how many OSes, also uses pointer casting to speed up CRC calculation. Is zlib a badly written code because of that? I don't think so.
Octocontrabass wrote:Even if we assume the target is x86 (which it may not be), there are some x86 instructions that can't read from unaligned addresses, so the compiled code may not be valid and may not work.
Wrong. First, we can safely assume x86, second this code was never designed to be portable as I have said (but it works on ARM too as I have said), third MOV is not a special instruction requiring alignment, fourth generated code accessing unaligned address is valid and does work.

By your argument you must never use paging nor variadic arguments in your C code because there are architectures that do not support those. Do not use open(), read(), write(), close() either, because under Windows you don't have those. And never use string literals longer than 509 bytes because there are C compilers that can't handle that. Don't declare a variable in a middle of a code block because there are C compilers that do not support those. Shall I continue...?
Octocontrabass wrote:Add three bytes of padding to the end of the font array
You are reading in the middle of a bitchunk, not at the end. Are you suggesting copying memory over and over again? How efficient that could be I wonder?

Homework for you: here's my RLE16 decoder using pointer casting and no memory copy for 16 bit elements (about 20 SLoC), compare its complexity and performance with any of the other existing RLE solutions not using pointer casting (for example this one, which uses memcpy to aligned temporary buffer to assure alignment). You'll be surprised to learn the difference. Seriously, I mean it, compile both and measure their performance!
Octocontrabass wrote:
bzt wrote:Misaligned pointer is no UB on x86.
That's irrelevant.
No, it isn't "irrelevant", it is definitely and very much relevant. Why not use a feature that all the potential target CPUs support and the compiler allows?
Octocontrabass wrote:You're not writing x86 code, you're writing C code.
That's your problem! You are not writing code for a beauty contest, this isn't IOCCC, but in a kernel you're writing low level code, which means your priority is the generated code. If you think otherwise, then you're just an application programmer, not a kernel programmer (no offense intended).

The reasons why you (in plural) are writing such bad and inefficient code is because you never think about how it's going to be executed on the CPU. I do. That's why my model loader easily outperforms the so called "fastest" tinyobjload 3 times over for example.
Octocontrabass wrote:Maybe the next time you update your C compiler it will detect the misaligned pointer and emit a UD2 instruction.
Never gonna happen. Forbidding a feature that the target supports and more than 50 years of code relying on would be a suicide on the compiler part. If a compiler would really disallow unaligned access, ALL the FAT implementations would broke for example (because BPB struct must be packed and its fields aren't properly aligned). But the examples are countless, what if the CRC checksum does not appear on an aligned address in a gzip header? Etc.

Cheers,
bzt
sham1
Posts: 16
Joined: Wed Nov 30, 2011 12:44 pm
Libera.chat IRC: sham1

Re: Framebuffer: Draw a PSF character

Post by sham1 »

bzt wrote:
Octocontrabass wrote:
bzt wrote:That ain't true. With "-ansi -Wall -Wextra -pendantic" all compilers allow both unaligned pointers and pointer casts without any warning. No "requires" neither "forbidding" of any kind.
A feature to report any failure to conform to ISO C might be useful in some instances, but would require considerable additional work and would be quite different from -Wpedantic. We don’t have plans to support such a feature in the near future.
In other words, you can't rely on compiler warnings to verify your code.
In other words, the compiler does not "require" nor "forbid" anything. Just as I said. You're free to use pointer casting.
It's still undefined behavior and violates the assumptions of the C abstract machine. Just because it happens to behave properly doesn't make it good.
bzt wrote:Here's another example. Zlib, which has been compiled on an armada of platforms for god knows how many OSes, also uses pointer casting to speed up CRC calculation. Is zlib a badly written code because of that? I don't think so.
Where? At least not in crc32.c where one would expect to find it.
bzt wrote:Homework for you: here's my RLE16 decoder using pointer casting and no memory copy for 16 bit elements (about 20 SLoC), compare its complexity and performance with any of the other existing RLE solutions not using pointer casting (for example this one, which uses memcpy to aligned temporary buffer to assure alignment). You'll be surprised to learn the difference. Seriously, I mean it, compile both and measure their performance!
I took this as a bit of a challenge, so I came up with this. Now, I haven't profiled either code fragment, so I don't know how my implementation stacks up against your impl (nor have I tested if it works, but it should).
bzt wrote:That's your problem! You are not writing code for a beauty contest, this isn't IOCCC, but in a kernel you're writing low level code, which means your priority is the generated code. If you think otherwise, then you're just an application programmer, not a kernel programmer (no offense intended).
If you want to ensure that generated code is a certain way, I'd honestly urge you to go with Assembly instead, since the C optimizer can do whatever with code which has undefined behavior. You can't just assume that it'll work, even though it very well might.

Also, what is a kernel if not an application?
bzt wrote:The reasons why you (in plural) are writing such bad and inefficient code is because you never think about how it's going to be executed on the CPU. I do. That's why my model loader easily outperforms the so called "fastest" tinyobjload 3 times over for example.
It's not for us to think about. That's the job of the optimizer. And as said, if you really care about just how the code is to be executed, just write Assembly.

Also, not everything is performance critical. And even if whatever you're (plural) doing really is that critical, often things like readability, maintainability and so on are also things to consider.
bzt wrote:
Octocontrabass wrote:Maybe the next time you update your C compiler it will detect the misaligned pointer and emit a UD2 instruction.
Never gonna happen. Forbidding a feature that the target supports and more than 50 years of code relying on would be a suicide on the compiler part. If a compiler would really disallow unaligned access, ALL the FAT implementations would broke for example (because BPB struct must be packed and its fields aren't properly aligned). But the examples are countless, what if the CRC checksum does not appear on an aligned address in a gzip header? Etc.

Cheers,
bzt
A lot of the time a compiler can optimize things like memcpy away because it understands both the semantics of that standard function, as well as the target platform, meaning that in many cases said compiler can replace the memcpy with just move instructions of various kinds. Meaning that you should also be able to get reasonable performance even if you copy things like BPB fields or the aforementioned CRCs into actually aligned buffers (since even on x86, even though unaligned reads are possible, they're still slower than aligned reads).

Because you're right. The code we write will end up running on x86, ARM or whatever. But you're not writing x86 or ARM code, you're writing C code. And in C, unaligned reads and writes are undefined behavior. Your program is straight-up broken and invalid as far as the standard and compiler is concerned. It just happens that for things like x86, it works out. But assuming that it'll work everywhere is like assuming that every architecture out there is little-endian.

EDIT: Forgot restrict from my godbolt'd RLE code.
Last edited by sham1 on Sat Mar 13, 2021 9:41 am, edited 1 time in total.
Octocontrabass
Member
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: Framebuffer: Draw a PSF character

Post by Octocontrabass »

bzt wrote:In other words, the compiler does not "require" nor "forbid" anything. Just as I said. You're free to use pointer casting.
And the compiler is free to silently generate code that does not work.
bzt wrote:Here's another example. Zlib, which has been compiled on an armada of platforms for god knows how many OSes, also uses pointer casting to speed up CRC calculation. Is zlib a badly written code because of that? I don't think so.
Do you mean this code, which is documented to explicitly rely on correct pointer alignment and relaxed aliasing rules and is only conditionally enabled so it can be avoided when those conditions can't be met? No, because the author put in the work necessary to avoid relying on undefined behavior.
bzt wrote:First, we can safely assume x86,
Why do you think this is a safe assumption on a forum where there are regular discussions about OS development on non-x86 platforms?
bzt wrote:second this code was never designed to be portable
Why not? There's nothing about the PSF format that makes it architecture-specific, and there's nothing on the wiki page that says the code is architecture-specific.
bzt wrote:third MOV is not a special instruction requiring alignment,
What about MOVDQA? MOVAPS?
bzt wrote:fourth generated code accessing unaligned address is valid and does work.
"It works" does not mean "it will always work". "It works" does not mean "the behavior is defined".
bzt wrote:By your argument you must never use paging nor variadic arguments in your C code because there are architectures that do not support those. Do not use open(), read(), write(), close() either, because under Windows you don't have those. And never use string literals longer than 509 bytes because there are C compilers that can't handle that. Don't declare a variable in a middle of a code block because there are C compilers that do not support those. Shall I continue...?
My argument is "you should not rely on undefined behavior." The only things you've listed that could be undefined behavior are string literals longer than 509 characters and mixed declarations and code, and only in the C89 standard. C99 has been around for 20 years, perhaps it's time for you to update.
bzt wrote:You are reading in the middle of a bitchunk, not at the end. Are you suggesting copying memory over and over again?
No, of course not. You would add those three bytes to the end of the PSF file when including it into your kernel, not at runtime.
bzt wrote:Homework for you: here's my RLE16 decoder using pointer casting and no memory copy for 16 bit elements (about 20 SLoC), compare its complexity and performance with any of the other existing RLE solutions not using pointer casting (for example this one, which uses memcpy to aligned temporary buffer to assure alignment). You'll be surprised to learn the difference. Seriously, I mean it, compile both and measure their performance!
Provide your own benchmarks first. Here, I'll even give you a fair comparison by rewriting your implementation to remove the unaligned pointer casts. (Behavior may still be undefined if the input is malformed.)
bzt wrote:Why not use a feature that all the potential target CPUs support and the compiler allows?
Where in the GCC documentation does it say that pointers do not have to be aligned?
bzt wrote:That's your problem! You are not writing code for a beauty contest, this isn't IOCCC, but in a kernel you're writing low level code, which means your priority is the generated code. If you think otherwise, then you're just an application programmer, not a kernel programmer (no offense intended).
My priority is the generated code. That's why avoiding undefined behavior is so important: the compiler makes no guarantees about the code it will generate for undefined behavior.
bzt wrote:Never gonna happen. Forbidding a feature that the target supports and more than 50 years of code relying on would be a suicide on the compiler part. If a compiler would really disallow unaligned access, ALL the FAT implementations would broke for example (because BPB struct must be packed and its fields aren't properly aligned).
Packed structs are implementation-defined behavior. Implementation-defined behavior is not undefined behavior. We are talking about undefined behavior.
User avatar
bzt
Member
Member
Posts: 1584
Joined: Thu Oct 13, 2016 4:55 pm
Contact:

Re: Framebuffer: Draw a PSF character

Post by bzt »

Octocontrabass wrote:And the compiler is free to silently generate code that does not work.
Interesting, but how does it generate non-working code if it forbids aliasing in the first place? I thought you were saying that. (And a compiler silently generating non-working code is a buggy compiler, period.)
Octocontrabass wrote:Why do you think this is a safe assumption on a forum where there are regular discussions about OS development on non-x86 platforms?
Name at least one architecture that newcomers in this community might use besides of x86 (and maybe ARM for a very few enthusiastics).
Octocontrabass wrote:
bzt wrote:second this code was never designed to be portable
Why not?
Because it's a code example on a newcomer's tutorial page, not a fully blown, ready to be compiled kernel source tree. I haven't seen you complaining about Babystep tutorial not being portable for example.
Octocontrabass wrote:
bzt wrote:third MOV is not a special instruction requiring alignment,
What about MOVDQA? MOVAPS?
What about those? Not the same instruction as MOV, totally different opcodes. And what about MOVUPS? And is a properly working compiler allowed to emit any of those instructions for this C code? I don't think so. (Actually I had a lot of trouble to make gcc emit MOVAPS even when I explicitly wanted it in my optimized SLERP code. I ended up using intrinsics instead because no, no optimizer smart enough to figure out using packed scalars on its own.)
Octocontrabass wrote:C99 has been around for 20 years, perhaps it's time for you to update.
Why? C89 specification won't change in the future that's for sure and it's very well known how a compiler should compile it, with all the side effects and quircks. Otherwise don't tell me what language to use, I don't tell you to use C++, D or Ada for example either (been around more than 20 years too).
Octocontrabass wrote:
bzt wrote:You are reading in the middle of a bitchunk, not at the end. Are you suggesting copying memory over and over again?
No, of course not. You would add those three bytes to the end of the PSF file when including it into your kernel, not at runtime.
And I must ask again, how would adding 3 bytes at the end help you when you need to read 4 bytes from the middle of bitchunk? Like accessing a CRC value at the end of a gzip header, before the compressed data?
Octocontrabass wrote:Provide your own benchmarks first.
Nope. I've told you a way how to check the validity of my statements and see for yourself. You wouldn't believe any benchmarks from me anyway, so proving is on you.
Octocontrabass wrote:
bzt wrote:Why not use a feature that all the potential target CPUs support and the compiler allows?
Where in the GCC documentation does it say that pointers do not have to be aligned?
You make absolutely no sense. How is that an answer to my question at all? Which C compiler doesn't support them? GCC definitely does.
Octocontrabass wrote:My priority is the generated code.
Nope, you made it pretty clear that your priority is the language, with imaginary restrictions that no actual x86 compiler was restricted to in the last 30 years (and won't be in the future).

You know what? Keep writing bad and inefficient C code, not making out the most of the target architecture. Who cares? I'll be happy with my lightning fast code and I'll add a few #ifdefs if and when I port my code to some strange architecture.

Have a nice day,
bzt
Post Reply