Extraneous digits appearing in display of memory map

Posted: Fri Jul 29, 2022 11:36 pm
by Schol-R-LEA
Image

I have been working on my Verbum project (a legacy BIOS boot loader that loads off a floppy image) and have come across an odd problem. I have written some diagnostics to show the memory map as collected via INT 0x15, EAX=0xE820, including a section which prints a description of each entry's type.

However, as you can see in the screenshot, after the first entry, each subsequent entry has an unwanted digit '1' before the length, type, and extended information fields. I can't seem to figure out where these digits are coming from, or why they are appearing where they are. Any advice on this would be appreciated.

The relevant code (not including the support functions for printing values) is:

Code:

;;; print_hi_mem_map - prints the memory table
;;; Inputs:
;;;       BP   = the number of entries found
;;;       [DI] = the memory map table
;;; Outputs:
;;;       screen
;;; Clobbers:
;;;       AX, CX, SI
print_hi_mem_map:  
        jc .failed              ; if the interrupt isn't supported, fail
        cmp bp, 0
        jz .failed              ; if there are no valid entries, fail
        write mmap_prologue
        mov si, print_buffer    ; print the description of the section...
        push ax
        mov ax, bp
        call print_decimal_word ; including the number of entries found...
        write mmap_entries_label
        pop ax
        write mmap_headers      ;and the headers for the columns.
        write mmap_separator
        mov cx, bp            ; set the # of entries as the loop index

        push si
        push di
        
    .loop:
        ; write each of the structure fields with a spacer separating them
        push di
        add di, High_Mem_Map.base ; print the base value
        call print_hex_qword
        write mmap_space
        pop di
        push di
        add di, High_Mem_Map.length ; print the length value
        call print_hex_qword
        write mmap_space
        pop di
        push di
        add di, High_Mem_Map.type ; use the type value as an index into the array of strings
        mov si, mmap_types        ; get the array head
        mov ax, [di]              ; get the offset
        mov bl, mmap_types_size   ; multiply the offset by the size of the array elements
        imul bl
        add si, ax              ; print the appropriate array element
        call print_str
        write lparen            ; print the actual value of the type in parentheses
        mov si, print_buffer    
        mov ax, [di]
        call print_decimal_word
        write rparen
        write mmap_space
        pop di
        push di
        add di, High_Mem_Map.ext ; print the extended ACPI 3.x value 
        mov ax, [di]
        call print_decimal_word
        write newline
        pop di
        add di, ext_mmap_size ; advance to the next entry
        loop .loop
        
    .finish:
        pop di
        pop si
        ret
        
    .failed:
        write mmap_failed
        ret
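
For readers following along, here is a minimal sketch of how a table like the one this routine prints might be gathered. It is not taken from Verbum: the layout of High_Mem_Map is assumed from the field names used above and from the 24-byte ACPI 3.x entry format, and get_hi_mem_map and SMAP are names made up for the example.

```nasm
bits 16

struc High_Mem_Map                  ; assumed layout, 24 bytes per entry
    .base    resq 1                 ; 64-bit base address
    .length  resq 1                 ; 64-bit length in bytes
    .type    resd 1                 ; region type (1 = usable RAM, 2 = reserved, ...)
    .ext     resd 1                 ; ACPI 3.x extended attributes
endstruc

SMAP equ 0x534D4150                 ; 'SMAP' signature expected in EDX and EAX

;;; get_hi_mem_map (hypothetical name)
;;; Inputs:  ES:DI = buffer for the entries
;;; Outputs: BP = number of entries stored, CF set on failure
get_hi_mem_map:
        xor ebx, ebx                ; continuation value, 0 on the first call
        xor bp, bp                  ; entry counter
    .next:
        mov eax, 0xE820
        mov edx, SMAP
        mov ecx, High_Mem_Map_size  ; ask for a full 24-byte entry
        mov dword [es:di + High_Mem_Map.ext], 1 ; default if the BIOS returns only 20 bytes
        int 0x15
        jc .carry                   ; unsupported, or past the end of the list
        cmp eax, SMAP
        jne .fail
        jcxz .skip                  ; ignore an entry with no bytes returned
        inc bp
        add di, High_Mem_Map_size
    .skip:
        test ebx, ebx               ; EBX = 0 means this was the last entry
        jnz .next
    .done:
        clc
        ret
    .carry:
        test bp, bp
        jnz .done                   ; carry after at least one entry just ends the list
    .fail:
        stc
        ret
```

NASM defines High_Mem_Map_size automatically from the struc, so a constant such as ext_mmap_size above could be an alias for it.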

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 1:22 am
by iansjack
Don't you think that the print_hex_qword routine (and the write routine and various constants) is also relevant? It looks like the most likely cause of your error.

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 1:31 am
by nullplan
Have you tried debugging? It appears the extraneous 1s appear wherever you are doing a "write mmap_space" as soon as the "Ext." column is printed the first time. Does that maybe change the string?

Nothing in the code shown here looks like it directly breaks anything. I think the devil is going to be in the details.

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 3:31 am
by Demindiro
nullplan is spot on: If you apply the following change it's obvious mmap_space is being corrupted:

Code:

diff --git a/src/PC-x86/nasm/fat12/stagetwo.asm b/src/PC-x86/nasm/fat12/stagetwo.asm
index 8d10a3f..b4e5d44 100755
--- a/src/PC-x86/nasm/fat12/stagetwo.asm
+++ b/src/PC-x86/nasm/fat12/stagetwo.asm
@@ -332,7 +332,8 @@ mmap_prologue                db 'High memory map (', NULL
 mmap_entries_label           db ' entries):', CR,LF,NULL
 mmap_headers                 db 'Base Address       | Length             | Type                  | Ext.', CR, LF, NULL
 mmap_separator               db '----------------------------------------------------------------------------', CR,LF, NULL
-mmap_space                   db '     ', NULL
+;mmap_space                   db '     ', NULL
+mmap_space                   db '--------', NULL
 
 mmap_entries                 resd 1

[Screenshot: 2022-07-30-113047_701x418_scrot.png]
What the issue is exactly I do not know yet though.

EDIT: You're missing a mov si, print_buffer before the last call to print_decimal_word:

Code:

diff --git a/src/PC-x86/nasm/fat12/stagetwo.asm b/src/PC-x86/nasm/fat12/stagetwo.asm
index 8d10a3f..bcda829 100755
--- a/src/PC-x86/nasm/fat12/stagetwo.asm
+++ b/src/PC-x86/nasm/fat12/stagetwo.asm
@@ -281,6 +281,7 @@ print_hi_mem_map:
         push di
         add di, High_Mem_Map.ext ; print the extended ACPI 3.x value 
         mov ax, [di]
+        mov si, print_buffer
         call print_decimal_word
         write newline
         pop di

[Screenshot: 2022-07-30-113451_694x414_scrot.png]
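
Why this one line matters, as far as can be read from the listings in this thread: the write macro loads SI with its argument before calling print_str, and print_str restores every register with popa, so SI still points at mmap_space once write mmap_space returns. The earlier calls to print_decimal_word are each preceded by mov si, print_buffer, which suggests (an assumption, since its source is not shown) that print_decimal_word formats the number into the buffer at SI. Without the reload, the digits of the Ext. value would land on top of mmap_space, which would explain a '1' showing up at every later write mmap_space. An annotated excerpt of the loop tail, not a standalone program:

```nasm
        write mmap_space          ; expands to: mov si, mmap_space / call print_str
                                  ; print_str ends with popa, so SI = mmap_space here
        pop di
        push di
        add di, High_Mem_Map.ext  ; DI -> extended attributes field
        mov ax, [di]
        mov si, print_buffer      ; the fix: point SI back at the scratch buffer
        call print_decimal_word   ; assumed to write the digits of AX at [SI]
        write newline
        pop di
```

That also fits the first row being clean: the string is only overwritten at the end of the first pass through the loop.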

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 9:46 am
by Schol-R-LEA
@iansjack: Yes, you are right, those are relevant.

macros.inc

Code:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; macros
;

%ifndef MACROS__INC
%define MACROS__INC

%define zero(x) xor x, x


%macro write 1
   mov si, %1
   call near print_str
%endmacro


%endif

consts.inc

Code:

;;; Constants related to the text console

%ifndef CONSTS__INC
%define CONSTS__INC


;;; character constants
NULL            equ 0x00        ;; end of string marker
CR              equ 0x0D        ;; carriage return
LF              equ 0x0A        ;; line feed 

ascii_zero      equ 0x30
ascii_upper_A   equ 0x41
ascii_lower_a   equ 0x61

%endif

simple_text_print_code.inc

Code:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; routine for printing strings

%ifndef SIMPLE_TEXT_PRINT_CODE__INC
%define SIMPLE_TEXT_PRINT_CODE__INC

%include "macros.inc"

;;; print_str - prints the string pointed to by SI
;;; Inputs:
;;;        DS:SI - string to print (lodsb reads from DS:SI)
print_str:
        pusha
        mov ah, ttype       ; set function to 'teletype mode'
        zero(bx)
        mov cx, 1
    .print_char:
        lodsb               ; update byte to print
        cmp al, NULL        ; test that it isn't NULL
        jz short .endstr
        int  VBIOS          ; put character in AL at next cursor position
        jmp short .print_char
    .endstr:
        popa
        ret

%endif
print_hex_code.inc

Code:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; routine for printing integer values in hex

%ifndef PRINT_HEX_CODE__INC
%define PRINT_HEX_CODE__INC

%include "consts.inc"
%include "macros.inc"
%include "simple_text_print_code.inc"

;;; convert_hex - convert one byte to hexadecimal value
;;; Based on code from the _AMD Athlon Optimization Guide_, p. 84 
;;; (http://www.bartol.udel.edu/mri/sam/Athlon_code_optimization_guide.pdf)
;;; Input:
;;;      AL = byte to convert
;;;      ES = segment where buffer resides
;;;      SI = buffer to write to
;;; Output:
;;;      [SI] = written buffer
convert_hex:
        push bx
        mov bl, al
.hi_nibble:
        shr al, 4               ; convert the high nibble first
        cmp al, 10              ; if x is less than 10, set carry flag
        sbb al, 0x69            ; 0..9 -> 96h..9Fh, Ah..Fh -> A1h..A6h
        das                     ; 0..9: subtract 66h, Ah..Fh: subtract 60h
        mov [si], al            ; store the ASCII digit at [SI]
.lo_nibble:
        inc si
        mov al, bl
        and al, 0x0F            ; clear high nibble
        cmp al, 10              ; if x is less than 10, set carry flag
        sbb al, 0x69            ; 0..9 -> 96h..9Fh, Ah..Fh -> A1h..A6h
        das                     ; 0..9: subtract 66h, Ah..Fh: subtract 60h
        mov [si], al            ; store the ASCII digit at [SI]
        pop bx
        ret

;;; print_hex_byte - convert a byte to hex and print it to console
;;; Input:
;;;      AL = byte to print
;;;      ES = segment where buffer resides
;;;      SI = buffer to print
;;; Output:
;;;      screen
;;; Clobbers:
;;;      AL, SI
print_hex_byte:
        mov si, word hex_buffer
        call near convert_hex
        mov si, word hex_buffer
        call near print_str
        ret
        
;;; print_hex_word - convert a word to hex and print it to console
;;; Input:
;;;      AX = word to print
;;;      ES = segment where buffer resides
;;;      SI = buffer to print
;;; Output:
;;;      screen
;;; Clobbers:
;;;      AL, SI
print_hex_word:
        xchg ah, al
        call near print_hex_byte
        xchg ah, al
        call near print_hex_byte
        ret

;;; print_hex_seg_offset - print a segment:offset pair
;;; Input:
;;;      GS = segment
;;;      AX = offset
;;; Clobbers:
;;;      AL, SI
print_hex_seg_offset:
        push ax
        mov ax, gs
        call near print_hex_word
        mov si, colon
        call near print_str
        pop ax
        call near print_hex_word        
        ret

hex_buffer   db 0, 0, NULL      ; two bytes, plus null delim
colon        db ':', NULL

align 4 

%endif
print_hex_long.inc

Code:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; routine for printing doubleword and quadword integer values in hex

%ifndef PRINT_HEX_LONG_CODE__INC
%define PRINT_HEX_LONG_CODE__INC

%include "consts.inc"
%include "macros.inc"
%include "simple_text_print_code.inc"
%include "print_hex_code.inc"



print_hex_dword:
;;; print_hex_dword - convert a doubleword to hex and print it to console
;;; Input:
;;;      [DI] = doubleword to print
;;;      ES   = segment where buffer resides
;;;      SI   = buffer to print
;;; Output:
;;;      screen
;;; Clobbers:
;;;      AX, SI
        mov ax, [di+2]
        call print_hex_word
        mov ax, [di]
        call print_hex_word
        ret

print_hex_qword:
;;; print_hex_qword - convert a quad word to hex and print it to console
;;; Input:
;;;      [DI] = quadword to print
;;;      ES   = segment where buffer resides
;;;      SI   = buffer to print
;;; Output:
;;;      screen
;;; Clobbers:
;;;      AX, SI
        mov ax, [di+6]
        call print_hex_word
        mov ax, [di+4]
        call print_hex_word
        mov ax, [di+2]
        call print_hex_word
        mov ax, [di]
        call print_hex_word
        ret

%endif

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 9:51 am
by Schol-R-LEA
Thank you, @Demindiro, that's exactly what I needed.

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 6:18 pm
by Schol-R-LEA
Just out of curiosity, does anyone have an opinion on the use of include files for managing the code this way? I started it as a way of sharing common code between the boot sector and the second stage loader, and to reduce the clutter in the source code files, but I found that it also made it easier to sandbox the different code sections for testing. Since there's no simple way to link code for a raw binary like these, I figured this was the next best approach.

I also have been using the NASM struc/istruc directives more than I have seen them in other boot loaders of this type, and I was wondering what people thought of it.
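
For anyone who has not used the directives, here is a small sketch of the struc/istruc style in question. The example is not from Verbum: it describes the disk address packet used by the INT 0x13 extended read (AH=0x42), and the names DAP, read_packet and the field labels are made up for illustration.

```nasm
bits 16

struc DAP                           ; layout only, emits no bytes
    .packet_len  resb 1             ; size of this packet (16)
    .reserved    resb 1             ; must be zero
    .count       resw 1             ; number of sectors to transfer
    .buf_off     resw 1             ; destination offset
    .buf_seg     resw 1             ; destination segment
    .lba         resq 1             ; starting logical block address
endstruc

read_packet:                        ; an initialised instance of the layout
    istruc DAP
        at DAP.packet_len, db DAP_size
        at DAP.reserved,   db 0
        at DAP.count,      dw 1
        at DAP.buf_off,    dw 0x7E00
        at DAP.buf_seg,    dw 0
        at DAP.lba,        dq 1
    iend

set_sector_count:                   ; fields are addressed by name, not by magic offset
        mov si, read_packet
        mov word [si + DAP.count], 4
        ret
```

The field names are plain offsets, so [si + DAP.count] assembles to the same bytes as [si + 2]; what the directives add is a single place where the layout is written down.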

EDIT: I just fixed a serious error in my GDT data structures and declarations, which may affect the way these are seen.

Re: Extraneous digits appearing in display of memory map

Posted: Sat Jul 30, 2022 10:36 pm
by nullplan
Schol-R-LEA wrote:Since there's no simple way to link code for a raw binary like these, I figured this was the next best approach.
Well, there is a way to link raw binary assembler code, but it is longer. If you assemble into an object file, you can link to ELF before dumping to binary. This is what I do for writing boot sectors with GAS (I mainly wanted to see if it was possible, and it is).

Of course, with that, everything becomes complicated. You must assemble with base address 0, then link to base address 7c00h. If you wish to link multiple object files, you will likely need to mark your main function with a special section and link that in first, so you will need a linker script, and oh my god, I just wanted to write a boot sector.

So yeah, not the most convenient way.
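
A minimal sketch of the single-file case, using NASM rather than GAS to stay with the assembler used in this thread. The build commands in the comments are an assumption about a typical GNU binutils setup, and the file names are placeholders; a multi-file build would additionally need the linker script mentioned above.

```nasm
; boot.asm: assembled to an ELF object rather than a flat binary.
; Assumed build steps (file names are placeholders):
;   nasm -f elf32 boot.asm -o boot.o
;   ld -m elf_i386 -Ttext=0x7C00 --oformat binary -o boot.bin boot.o
bits 16                         ; no ORG: the linker supplies the base address
global _start

section .text
_start:
        xor ax, ax
        mov ds, ax
        xor bx, bx              ; display page 0 for the teletype call
        mov si, message         ; address fixed up by the linker (0x7C00 + offset)
    .next:
        lodsb
        test al, al
        jz .halt
        mov ah, 0x0E            ; BIOS teletype output
        int 0x10
        jmp .next
    .halt:
        hlt
        jmp .halt

message db 'linked', 0

        times 510 - ($ - $$) db 0   ; pad to the boot signature
        dw 0xAA55
```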

As to your question, I have always held that code duplication is a scourge to be eradicated wherever found, and this is one way to do it.
Schol-R-LEA wrote:I also have been using the NASM struc/istruc directives more than I have seen them in other boot loaders of this type, and I was wondering what people thought of it.
Well, I don't use those. Partly because GAS doesn't support anything like it, but mainly because those are just offset calculations. So I might as well calculate the offsets immediately. I find the whole syntax too cumbersome and clunky. But that is a matter of taste.

Remember, I'm also the guy who declares his GDT as just an array of uint64_t. I never understood why people thought it had to be more complicated than that.
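
In NASM terms, the "array of 64-bit values" style might look like the sketch below. The two non-null values are the usual flat ring 0 code and data descriptors; the labels gdt, gdt_end and gdtr are made up for the example, and the dd gdt line assumes the table's offset equals its linear address (segment base 0).

```nasm
bits 16

gdt:
        dq 0x0000000000000000   ; null descriptor
        dq 0x00CF9A000000FFFF   ; flat 32-bit code: base 0, 4 GiB limit, ring 0
        dq 0x00CF92000000FFFF   ; flat 32-bit data: base 0, 4 GiB limit, ring 0
gdt_end:

gdtr:
        dw gdt_end - gdt - 1    ; limit: size of the table minus one
        dd gdt                  ; linear base address (assumes segment base 0)
```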

Re: Extraneous digits appearing in display of memory map

Posted: Sun Jul 31, 2022 3:48 am
by xeyes
Schol-R-LEA wrote:Just out of curiosity, does anyone have an opinion on the use of include files for managing the code this way? I started it as a way of sharing common code between the boot sector and the second stage loader, and to reduce the clutter in the source code files, but I found that it also made it easier to sandbox the different code sections for testing. Since there's no simple way to link code for a raw binary like these, I figured this was the next best approach.
If the 1st stage loader doesn't load the 2nd in a way that totally overwrites itself, another way might be creating your own "dynamic linking" with a dispatch table that can be used for indirect calling?

The 1st stage would have to host the implementation of the functions/macros. It thus has the needed knowledge to fix up the table in the 2nd stage while loading the latter. Otherwise the 1st stage can pass along any needed info so the 2nd stage can fix the table up itself before it uses the table.
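
A rough sketch of what such a table could look like. Everything here is hypothetical: the Toolbox layout, the TOOLBOX_ADDR location, and the assumption that both stages run with the same CS and with DS = 0, and that stage one stays resident. The two halves are fragments that would live in different source files, not one program.

```nasm
bits 16

; ---- shared header, included by both stages ----
struc Toolbox
    .print_str       resw 1     ; near offset of print_str in stage one
    .print_hex_word  resw 1     ; near offset of print_hex_word in stage one
endstruc

TOOLBOX_ADDR equ 0x0600         ; made-up address both stages agree on

; ---- stage one, just before jumping to stage two ----
        mov word [TOOLBOX_ADDR + Toolbox.print_str], print_str
        mov word [TOOLBOX_ADDR + Toolbox.print_hex_word], print_hex_word

; ---- stage two, calling through the table ----
        mov si, message
        call word [TOOLBOX_ADDR + Toolbox.print_str]   ; near indirect call
```

Stage two never needs to know where the routines actually sit; it only needs the table address and the slot order.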
Schol-R-LEA wrote: I also have been using the NASM struc/istruc directives more than I have seen them in other boot loaders of this type, and I was wondering what people thought of it.

EDIT: I just fixed a serious error in my GDT data structures and declarations, which may affect the way these are seen.
Everyone's style is different; if you find a tool's features useful, it's good in my book. I use gcc/gas and like its support for macros quite a bit as they can be written in very similar ways as the macros in C, thus easily sharable with C code. Perhaps gcc indeed runs the C preprocessor on ASM files before invoking gas?

Speaking of GDT, yesterday I thought that I had discovered a serious bug that had somehow been able to lurk around for a long time (descriptors of type 0xA, which is a reserved type) until I later re-learned that system bit = 1 means not a system segment and system bit = 0 means a system segment. So intuitive! #-o
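
To spell the trap out, here is a small sketch of how the access byte of a descriptor breaks down. The macro names are made up for the example; the bit positions are the standard ones (P in bit 7, DPL in bits 6..5, S in bit 4, type in bits 3..0).

```nasm
%define ACC_PRESENT     (1 << 7)    ; P: segment is present
%define ACC_RING0       (0 << 5)    ; DPL = 0
%define ACC_CODE_DATA   (1 << 4)    ; S = 1: ordinary code or data segment
%define ACC_SYSTEM      (0 << 4)    ; S = 0: system segment (TSS, LDT, gates)
%define TYPE_CODE_ER    0xA         ; with S = 1: code, execute/read

%assign CODE_ACCESS (ACC_PRESENT | ACC_RING0 | ACC_CODE_DATA | TYPE_CODE_ER)

%if CODE_ACCESS != 0x9A             ; sanity check at assembly time
  %error "unexpected access byte"
%endif

code_access_byte: db CODE_ACCESS    ; 0x9A, the access byte of a flat code segment
```

Type 0xA is only the reserved system type when S = 0; with S = 1 the same four bits mean an execute/read code segment.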

Re: Extraneous digits appearing in display of memory map

Posted: Sun Jul 31, 2022 9:24 am
by Schol-R-LEA
xeyes wrote:
Schol-R-LEA wrote:Just out of curiosity, does anyone have an opinion on the use of include files for managing the code this way? I started it as a way of sharing common code between the boot sector and the second stage loader, and to reduce the clutter in the source code files, but I found that it also made it easier to sandbox the different code sections for testing. Since there's no simple way to link code for a raw binary like these, I figured this was the next best approach.
If the 1st stage loader doesn't load the 2nd in a way that totally overwrites itself, another way might be creating your own "dynamic linking" with a dispatch table that can be used for indirect calling?

The 1st stage would have to host the implementation of the functions/macros. It thus has the needed knowledge to fix up the table in the 2nd stage while loading the latter. Otherwise the 1st stage can pass along any needed info so the 2nd stage can fix the table up itself before it uses the table.
I actually considered that, as it happens. Unfortunately, the code for passing the table didn't fit in my boot sector. I might revisit the idea, though, thank you for reminding me of that.

Re: Extraneous digits appearing in display of memory map

Posted: Sun Jul 31, 2022 11:36 am
by nullplan
xeyes wrote:If the 1st stage loader doesn't load the 2nd in a way that totally overwrites itself, another way might be creating your own "dynamic linking" with a dispatch table that can be used for indirect calling?
Why indirect calling? I had once tried to write a bootloader for ext2. Didn't end up working out because I tried to do too much at once (I would have needed another stage), but in that case I simply tried to write one file that would have been 1024 bytes long. With a break and some padding in the middle. The first half contained the main program (by necessity) as well as the routine to load another block or sector, and the second half contained a lot of the rest. The whole thing was planned to work by loading the second and third sectors of the volume to 0x7E00, loading the second half of the bootloader and the superblock in one go.

Unfortunately, it ended up being 2kB in size. Apparently, ext2 path traversal, proper i-node and BG handling, all of the A20 methods, CPU interrogation for 64-bit compatibility and ELF file mapping for 64-bit mode all ended up being a bit much. I should probably start over, add another stage in the middle. I have seen a program online that loads i-node 5 into memory, but there are no standard tools that allow me to create a file as i-node 5. Oh well, guess I need to write some tooling.

But yes, it is entirely viable to write a larger assembler program and break it up with dd or similar, and load it into memory contiguously again at run time. Then you don't need any indirect calls but can do direct calls again. If need be, you can even put assembler directives in to make a break where other stuff will go at run time. In my example, I can put 1kB of code from the start, then the superblock must come. But I could add more code behind the superblock. So I would create an assembler file that creates 1kB of code, then 512 bytes of zeroes (with labels added for the superblock fields), then more code. Break it up after building and you have something to install as bootloader and something to install as i-node 5.
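
A sketch of that layout in NASM, using the sizes from my example above (1 kB of code, a 512-byte hole, then more code). The labels and the dd commands in the comments are made up for illustration; the two superblock labels are the first two ext2 superblock fields.

```nasm
; Assumed build and split steps (file names are placeholders):
;   nasm -f bin loader.asm -o loader.bin
;   dd if=loader.bin of=first_kb.bin bs=512 count=2   ; goes in the boot block
;   dd if=loader.bin of=tail.bin     bs=512 skip=3    ; goes wherever the rest lives
bits 16
org 0x7C00

; ---- sector 1: loaded by the BIOS ----
start:
        ; (load the following sectors to 0x7E00 here, then continue)
        jmp sector2_code
        times 510 - ($ - $$) db 0
        dw 0xAA55

; ---- sector 2: rest of the first kilobyte ----
sector2_code:
        call late_helper            ; direct near call across the hole
    .hang:
        hlt
        jmp .hang
        times 1024 - ($ - $$) db 0

; ---- 512-byte hole occupied by on-disk data at run time ----
superblock:
sb_inodes_count: dd 0               ; labels give the fields names in the code
sb_blocks_count: dd 0
        times 512 - ($ - superblock) db 0

; ---- code that lives after the hole ----
late_helper:
        ret
```

Because everything is assembled as one image with one ORG, the call to late_helper is an ordinary direct call; it only has to wait until the tail has been loaded behind the hole.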

Re: Extraneous digits appearing in display of memory map

Posted: Mon Aug 01, 2022 8:24 pm
by xeyes
Schol-R-LEA wrote:
xeyes wrote:
Schol-R-LEA wrote:Just out of curiosity, does anyone have an opinion on the use of include files for managing the code this way? I started it as a way of sharing common code between the boot sector and the second stage loader, and to reduce the clutter in the source code files, but I found that it also made it easier to sandbox the different code sections for testing. Since there's no simple way to link code for a raw binary like these, I figured this was the next best approach.
If the 1st stage loader doesn't load the 2nd in a way that totally overwrites itself, another way might be creating your own "dynamic linking" with a dispatch table that can be used for indirect calling?

The 1st stage would have to host the implementation of the functions/macros. It thus has the needed knowledge to fix up the table in the 2nd stage while loading the latter. Otherwise the 1st stage can pass along any needed info so the 2nd stage can fix the table up itself before it uses the table.
I actually considered that, as it happens. Unfortunately, the code for passing the table didn't fit in my boot sector. I might revisit the idea, though, thank you for reminding me of that.
Yes, space is certainly at a premium for early boot loader stages.

When you get around to thinking more about it, maybe consider having the first stage load a 'toolbox' section or stage 1.5 which hosts the actual implementations and the table? Then both the first and the second stage can reach into the toolbox for needed functions/macros.

nullplan wrote:
xeyes wrote:If the 1st stage loader doesn't load the 2nd in a way that totally overwrites itself, another way might be creating your own "dynamic linking" with a dispatch table that can be used for indirect calling?
Why indirect calling?
Indirect as an optimization for space? Otherwise there needs to be code that parses the newly loaded code and fixes up all the direct call sites, etc.
nullplan wrote: I had once tried to write a bootloader for ext2. ... Unfortunately, it ended up being 2kB in size. ...

I have seen a program online that loads i-node 5 into memory, but there are no standard tools that allow me to create a file as i-node 5. Oh well, guess I need to write some tooling.
Supporting ext2 in such limited space sounds too adventurous for me. Just looked, my function to add blocks to an inode is more than 3KB in size and it doesn't even support triple indirect blocks.

I've seen an old thread here talking about a 512B boot sector version that loads inode 5 as well, a great feat IMO.