Exception when try to print text

finarfin
Member
Posts: 106
Joined: Fri Feb 23, 2007 1:41 am
Location: Italy & Ireland

Post by finarfin »

Here I am, at one of the worst nightmares of osdeving...

Everything is working fine, then you add a new line of code to whatever you have done so far, and *sbam*, you start to see lots of CPU exceptions going on...

And then you try to do some debugging and nope, you can't figure out where the bug is, because everything looks fine.

So here is the problem:

My code was working so far (at least that is what I thought), and recently I was doing some refactoring of the basic screen I/O, just to make the code more readable. I basically added a _printStringAndNumber function that just prints the string, converts the number given as input to a hex string (max 30 chars), then prints that number string and finally a new line.

This is the code:

Code: Select all

void _printStringAndNumber(char *string, unsigned long number){
    char buffer[30];
    _printStr(string);
    _getHexString(buffer, number);
    _printStr(buffer);
    _printNewLine();
}
And it has worked pretty well since I added it. The function that converts to hex is the following:

Code: Select all

int _getHexString(char *buffer, unsigned long hexnumber){
    unsigned long tmpnumber = hexnumber;
    int shift = 0;
    int size = 0;
    char *hexstring = buffer;
    while (tmpnumber >= 16){
        tmpnumber >>= 4;
        shift++;
    }
    size = shift;
 
    for(; shift >=0; shift--){
        tmpnumber = hexnumber;
        tmpnumber>>=(4*shift);
        tmpnumber&=0xF;

        if(tmpnumber < 10){
            *hexstring++ = '0' + tmpnumber; //Same as decimal number
        } else {
            *hexstring++ = 'A' + tmpnumber-10; //10-15 are the letters A-F
        }
        *hexstring = '\0';
    }
    return size;
}
The functions above are probably not perfect, but I can't see any major issue there.
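
For comparison, here is a minimal sketch of a size-checked variant of the conversion. The name `hex_to_string` and the `bufsize` parameter are hypothetical and not part of the Dreamos64 code; the sketch only illustrates how the buffer length could be verified before writing.

```c
#include <stddef.h>

/* Hypothetical bounded variant of _getHexString: writes the uppercase hex
 * digits of `number` into `buffer` (NUL-terminated) and returns the number
 * of digits written, or -1 if `bufsize` is too small. */
int hex_to_string(char *buffer, size_t bufsize, unsigned long number) {
    static const char digits[] = "0123456789ABCDEF";
    int ndigits = 1;

    /* Count how many hex digits are needed (at least one, for 0). */
    for (unsigned long tmp = number; tmp >= 16; tmp >>= 4) {
        ndigits++;
    }
    if (bufsize < (size_t)ndigits + 1) {
        return -1; /* no room for the digits plus the terminator */
    }

    /* Emit digits from the most significant nibble down. */
    for (int i = 0; i < ndigits; i++) {
        int shift = 4 * (ndigits - 1 - i);
        buffer[i] = digits[(number >> shift) & 0xF];
    }
    buffer[ndigits] = '\0';
    return ndigits;
}
```

With a 64-bit unsigned long the longest result is 16 digits plus the terminator, so a 30-byte buffer like the one in _printStringAndNumber is large enough.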

Before adding them, to print a number I had to do something like:

Code: Select all

char buffer[30];
uint32_t number = 125;
_getHexString(buffer, number);
_printStr("Number is: ");
_printStr(buffer);    // print the converted string, not the raw number
_printNewLine();
Pretty boring, so I temporarily added this function to make the code more readable (and probably, once I'm done with basic memory and PF handling, I'll add proper printf/puts etc. functions).
Btw, I started to replace the snippet above with the new function, which does the same thing, and everything was still working, until yesterday.

Then I started to get exceptions: most of the time I got an Invalid Opcode exception, and a few other times (depending on what I was changing in the code) I got a #PF.

So far this is what I found/current information:
  • For some reason it is happening only after I add a cpuid instruction (no matter whether it is a function I wrote in assembly, which is pretty shitty, or just a call to cpuid with asm("cpuid"))
  • If I don't call cpuid, no exceptions are raised; I tried many _printStringAndNumber calls and none raised an exception
  • Not sure if the two problems are related, but when I tried to implement a basic printf-like function with the va_list/va_arg stuff, I got a similar problem the second time I called it. If I'm not wrong it was a #PF, but in that case I assumed the issue was that the va_* macros were messing around with pointers to access the parameters, so I was hitting an unmapped region
  • I'm running the kernel in the higher half, but in theory the video RAM should be correctly mapped. This is the boot mapping code for the kernel:

    Code: Select all

        mov eax, p2_table - KERNEL_VIRTUAL_ADDR
        or eax, 0b11
        mov dword[(p3_table_hh - KERNEL_VIRTUAL_ADDR) + 510 * 8], eax
    
        ; Now let's prepare a loop...
        mov ecx, 0  ; Loop counter
        
        .map_p2_table:
            mov eax, 0x200000   ; Size of the page
            mul ecx             ; Multiply by counter
            or eax, 0b10000011 ; We set: huge page bit, writable and present 
    
            ; Moving the computed value into p2_table entry defined by ecx * 8
            ; ecx is the counter, 8 is the size of a single entry
            mov [(p2_table - KERNEL_VIRTUAL_ADDR) + ecx * 8], eax
    
            inc ecx             ; Let's increase ecx
            cmp ecx, 512        ; have we reached 512 ?
                                ; each table is 4k size. Each entry is 8bytes
                                ; that is 512 entries in a table
            
            jne .map_p2_table   ; if ecx < 512 then loop
    
    where p3_table_hh is the P3 table of the kernel in the higher half. I'm actually using 2 MB pages, so I have mapped something like 1 GB of memory!
  • Actually I'm coding stuff that has basic support for both framebuffer and VGA, so I thought that printing to the VGA memory while not using it could have been a problem. I commented that part out when using the framebuffer, but still got the same error; I switched back to VGA mode and still had the same problem (but with a nice crazy screen after a while... that I will probably post in the "when your OS goes crazy" topic)
  • Another interesting thing is that if I replace that last _printStringAndNumber with the old way of printing numbers, this exception does not happen.
  • I thought the problem was that I was using the mapped VGA memory in the higher half, so I tried to revert back to the old one, but again without luck.
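
The boot mapping loop in the list above can be modelled in C to check how much memory it covers. This is only a sketch: `p2_table` here is an ordinary C array standing in for the real table, and the flag macro names are made up for illustration.

```c
#include <stdint.h>

#define ENTRIES_PER_TABLE 512u          /* 4 KiB table / 8-byte entries */
#define HUGE_PAGE_SIZE    0x200000ull   /* 2 MiB */
#define PAGE_PRESENT      (1ull << 0)
#define PAGE_WRITABLE     (1ull << 1)
#define PAGE_HUGE         (1ull << 7)

/* Stand-in for the real P2 table; illustrative only. */
static uint64_t p2_table[ENTRIES_PER_TABLE];

/* Mirrors the .map_p2_table loop: entry i maps physical address i * 2 MiB.
 * Returns the number of bytes covered by the whole table. */
uint64_t map_p2_table(void) {
    for (uint64_t i = 0; i < ENTRIES_PER_TABLE; i++) {
        p2_table[i] = (i * HUGE_PAGE_SIZE)
                    | PAGE_HUGE | PAGE_WRITABLE | PAGE_PRESENT; /* 0b10000011 */
    }
    return ENTRIES_PER_TABLE * HUGE_PAGE_SIZE;
}
```

The 512 entries cover 512 × 2 MiB = 1 GiB of physical memory starting at 0, which includes the VGA text buffer at physical address 0xB8000 inside the first 2 MiB page.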
As I said, I also got a #PF in a couple of cases, but I don't remember exactly what the scenario was.
Whatever I have tried so far is not working. I thought that maybe, since I was in framebuffer mode and still calling the _printCh function:

Code: Select all

void _printCh(char c, character_color color){
    *VIDEO_PTR++ = c;
    *VIDEO_PTR++ = color;
}
it was for some reason writing somewhere outside its boundaries, so I disabled it when in framebuffer mode, yet still with no luck.
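
One way to rule out the out-of-bounds theory would be to guard every write with a bounds check. The sketch below assumes the standard 80x25 VGA text mode; the `text_console` struct and `console_put` function are hypothetical and do not exist in the code above.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS  80
#define VGA_ROWS  25
#define VGA_CELLS (VGA_COLS * VGA_ROWS)

/* Hypothetical text console state; in a real kernel `base` would point
 * at the mapped VGA text buffer. */
struct text_console {
    volatile uint16_t *base; /* one 16-bit cell = character + attribute */
    size_t cursor;           /* index of the next cell to write */
};

/* Writes one character if the cursor is still inside the buffer.
 * Returns 1 on success, 0 if the write would fall outside the buffer. */
int console_put(struct text_console *con, char c, uint8_t color) {
    if (con->cursor >= VGA_CELLS) {
        return 0;
    }
    con->base[con->cursor++] = (uint16_t)(((uint16_t)color << 8) | (uint8_t)c);
    return 1;
}
```

If the guarded version never returns 0 and the exceptions persist, the video writes are probably not the cause.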

Could it be the fact that whenever I call the _printStringAndNumber function I'm allocating a 30-char buffer? (Even if I don't see why that should be a problem.)

So my question is whether someone can help me investigate further, to understand how to find the root of the problem.
If needed the full code of the os is here: https://github.com/dreamos82/Dreamos64
P.S. The _printStringAndNumber function is kind of temporary, until I implement the basic C printing functions.
P.P.s. The branch with the exception: https://github.com/dreamos82/Dreamos64/ ... n_on_print
Elen síla lúmenn' omentielvo
- DreamOS64 - My latest attempt with osdev: https://github.com/dreamos82/Dreamos64
- Osdev Notes - My notes about osdeving! https://github.com/dreamos82/Osdev-Notes
- My old Os Project: https://github.com/dreamos82/DreamOs
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: Exception when try to print text

Post by thewrongchristian »

finarfin wrote: When I started to get exceptions, most of the time I got an Invalid Opcode exception, and a few other times (depending on what I was changing in the code) I got a #PF.

So far this is what I found/current information:
  • For some reason it is happening only after I add a cpuid instruction (no matter whether it is a function I wrote in assembly, which is pretty shitty, or just a call to cpuid with asm("cpuid"))
  • If I don't call cpuid, no exceptions are raised; I tried many _printStringAndNumber calls and none raised an exception
Without seeing how cpuid is implemented, this is a guess.

Do you save $ecx in your function, or indicate to the compiler that $ecx is clobbered if using inline assembly? cpuid updates $ecx, which is a callee saved register, so if your cpuid function (or inline asm) isn't saving that register, that might explain why code is getting confused.
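
For the inline-assembly case, a minimal sketch of how the compiler can be told which registers cpuid overwrites is shown below. It assumes GCC or Clang extended asm; the names `cpuid_regs`, `cpuid_query` and `cpuid_vendor` are hypothetical. The non-x86 branch exists only so the sketch compiles elsewhere.

```c
#include <stdint.h>

/* Result of one CPUID query. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical wrapper around CPUID. All four registers are listed as
 * outputs, so the compiler knows they are overwritten and will not keep
 * live values in them across the asm statement. */
static inline struct cpuid_regs cpuid_query(uint32_t leaf, uint32_t subleaf) {
    struct cpuid_regs r;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("cpuid"
                     : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                     : "a"(leaf), "c"(subleaf));
#else
    (void)leaf;
    (void)subleaf;
    r.eax = r.ebx = r.ecx = r.edx = 0; /* not an x86 target: no CPUID */
#endif
    return r;
}

/* Copies the 12-character vendor string (leaf 0: EBX, EDX, ECX, in that
 * order) into `out`, which must have room for 13 bytes. */
void cpuid_vendor(char out[13]) {
    struct cpuid_regs r = cpuid_query(0, 0);
    uint32_t parts[3] = { r.ebx, r.edx, r.ecx };
    for (int i = 0; i < 3; i++) {
        for (int b = 0; b < 4; b++) {
            out[4 * i + b] = (char)(parts[i] >> (8 * b));
        }
    }
    out[12] = '\0';
}
```

A standalone assembly function gets no such help from the compiler, so it has to save and restore by hand whatever the calling convention requires it to preserve.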

Re: Exception when try to print text

Post by finarfin »

The implemented cpuid functions are:

Code: Select all

section .text
global _cpuid_model
_cpuid_model:
    [bits 64]
    mov eax, 0x0
    cpuid
    mov [processor_string], ebx
    mov [processor_string + 4], edx
    mov [processor_string + 8], ecx
    mov byte[processor_string + 12], 0   ; NUL terminator (NASM expands \0 only inside backquotes)
    mov rax, processor_string
    ret

;Maybe just return edx in the future
;(ignoring ecx for now)
global _cpuid_feature_apic
_cpuid_feature_apic:
    mov eax, 0x1
    cpuid
    and edx, 0x100
    mov eax, edx
    ret
   
   
section .bss
processor_string:
    resb 13


Both functions are very much drafts; I stopped implementing them because I wanted to focus on something else before going back to them.
Btw, I tried removing the call to _cpuid_feature_apic, and now I get a #PF instead of the invalid opcode. The values that I read are:

Code: Select all

0 (error flags)
68747541 (address)
Page Fault
I did several tests: I tried to remove the string concatenation from the asm file (also removing the variable declaration), tried to replace all e* registers with r*, and tried to print only a few characters, always getting the same error.
Whatever I try to print after the cpuid, I start to get this weird behaviour.

Re: Exception when try to print text

Post by finarfin »

OK, I updated the cpuid function: now, before executing cpuid, I push rax, rcx, rbx and rdx, and pop them at the end. It looks like the error is gone!

This is how the function looks now:

Code: Select all

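_cpuid_model:
    mov eax, 0x0
    push rax                ; rax is 0 at this point
    push rcx
    push rbx
    push rdx
    cpuid
    mov [processor_string], ebx
    mov [processor_string + 4], edx
    mov [processor_string + 8], ecx
    mov byte[processor_string + 12], 0   ; NUL terminator (NASM expands \0 only inside backquotes)
    mov rax, processor_string
    pop rdx
    pop rbx
    pop rcx
    pop rax                 ; reloads the 0 pushed above, replacing the pointer in rax
    ret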

Thanks! :) I think that was the issue!

Re: Exception when try to print text

Post by thewrongchristian »

The #PF address, if you interpret it as ASCII characters, gives "Auth", which is the first 4 characters of the "AuthenticAMD" (I'm guessing you have an AMD processor?) string deposited in ebx.

I got the register convention wrong, $ecx is a caller saved register, it's $ebx that is causing the problem here, as that should be callee saved, but in this case is being overwritten by the CPUID instruction with the "Auth" bit of "AuthenticAMD".

So your remedy here is to save and restore $ebx in your CPUID functions.
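
The decoding of the fault address described above can be checked with a few lines of C. This is just an illustrative sketch; `reg_to_ascii` is a made-up helper name.

```c
#include <stdint.h>

/* Interprets a 32-bit register value as four ASCII bytes in little-endian
 * order (lowest byte first), the way CPUID packs the vendor string.
 * `out` must have room for 5 bytes. */
void reg_to_ascii(uint32_t value, char out[5]) {
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((value >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Calling `reg_to_ascii(0x68747541, text)` fills `text` with "Auth": the bytes 0x41, 0x75, 0x74 and 0x68 are 'A', 'u', 't' and 'h'.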

Re: Exception when try to print text

Post by finarfin »

Why only $ebx? The result is stored in ebx, ecx and edx, so why should I save only ebx? And I also change the value of eax.

Btw, I'm using QEMU, which for some reason is returning "AuthenticAMD" :D

Re: Exception when try to print text

Post by thewrongchristian »

$eax, $edx and $ecx are all caller saved registers. A function is free to trash those registers, the caller function is responsible for preserving their contents if required.

$ebx, $edi, $esi, $ebp and $esp are all incumbent on the callee to save. Their value must be preserved across function calls according to the cdecl calling convention generally used by x86 compilers.

Your code was updating $ebx as a result of CPUID, so the calling code clearly was using that as a pointer expecting it to be preserved, hence the #PF.

$ecx and $edx (as well as $eax) are not expected to be preserved, so the calling function does not assume they contain the contents they had prior to your CPUID function call, and will reload any values into those registers as required after the function.
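
As a small reference sketch, the register classes can be written down as lookup tables. The first list is the cdecl set given above; the second is the callee-saved set of the System V AMD64 ABI, added here on the assumption that a 64-bit kernel built with GCC follows that ABI. The function names are hypothetical.

```c
#include <string.h>

/* Registers a callee must preserve: i386 cdecl (as listed above) and,
 * as an assumption about the 64-bit build, the System V AMD64 ABI. */
static const char *const cdecl_callee_saved[] = {
    "ebx", "esi", "edi", "ebp", "esp"
};
static const char *const sysv64_callee_saved[] = {
    "rbx", "rbp", "rsp", "r12", "r13", "r14", "r15"
};

/* Returns 1 if `reg` appears in the given list of `count` names. */
static int in_list(const char *const *list, int count, const char *reg) {
    for (int i = 0; i < count; i++) {
        if (strcmp(list[i], reg) == 0) {
            return 1;
        }
    }
    return 0;
}

int cdecl_must_preserve(const char *reg) {
    return in_list(cdecl_callee_saved, 5, reg);
}

int sysv64_must_preserve(const char *reg) {
    return in_list(sysv64_callee_saved, 7, reg);
}
```

In both conventions the "b" register is the only one of the four CPUID outputs that a function has to restore before returning; note that under the 64-bit ABI rsi and rdi carry arguments and are not preserved.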

Calling_Conventions
Calling conventions (caller clean up)