Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 12:46 am
by ~

6. Calculate LBA Sectors for the Different Disk Areas

Now we need to get the following values:
- LBA sector for the Root Directory
- Number of sectors in the Root Directory
- LBA sector for the Cluster Area (after the Root Directory Sectors)

For this, we are interested in the number of sectors in the FAT table. Normally each FAT has 9 sectors and there are 2 FATs, so we have 18 FAT sectors. We also have 1 reserved sector for the Boot Record.

So the FAT starts at LBA sector 1, and the Root Directory starts at LBA sector 19.

Normally we have 14 sectors in the root directory. Every entry in the root directory is 32 bytes in size.

The cluster area normally starts at LBA sector 33, and clusters (which are virtual numbers) start from number 2, so cluster 2 is the first sector of the cluster area.

So for FAT12 we normally have:
- 1 boot sector
- 12-bit FAT entries for cluster numbers: each entry takes 1 byte plus 4 bits of a neighboring byte, whose other 4 bits belong to the adjacent entry
- 18 FAT sectors; the first 9 are FAT 1 and the other 9 are FAT 2, its copy
- 14 root directory sectors, holding 224 file/directory entries of 32 bytes each
- The rest of the sectors are the data area, for subdirectory entries and for the actual file contents
- Clusters containing 1 single sector
- 512-byte sectors
- 2 Heads
- 80 Cylinders
- 18 sectors per track (twice the size of one copy of the FAT, so the 2 FATs together take as many sectors as one track)
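
The arithmetic behind those numbers can be checked with a small Python model. This is only a sketch: the function names are made up for illustration, and the default values are the "normal" floppy values listed above, not values read from a real disk. Unlike the boot sector code, which shifts right by 4 and truncates, this model rounds the root directory size up to whole sectors.

```python
# Hypothetical helpers; defaults are the usual 1.44 MB floppy values.
def fat12_layout(reserved_sectors=1, num_fats=2, sectors_per_fat=9,
                 root_entries=224, bytes_per_sector=512):
    """Return (root_dir_lba, root_dir_sectors, cluster_area_lba)."""
    # The root directory follows the reserved sectors and all FAT copies.
    root_dir_lba = reserved_sectors + num_fats * sectors_per_fat
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_dir_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    # The cluster (data) area follows the root directory.
    cluster_area_lba = root_dir_lba + root_dir_sectors
    return root_dir_lba, root_dir_sectors, cluster_area_lba


def cluster_to_lba(cluster, cluster_area_lba=33, sectors_per_cluster=1):
    """Cluster numbers start at 2, so cluster 2 maps to the first data sector."""
    return cluster_area_lba + (cluster - 2) * sectors_per_cluster


print(fat12_layout())     # (19, 14, 33)
print(cluster_to_lba(2))  # 33
```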

Code: Select all

;6. Configure program disk parameter information here:
;;

 ;;INIT: Configure disk information (calculate geometry)
 ;;INIT: Configure disk information (calculate geometry)
 ;;INIT: Configure disk information (calculate geometry)
  ;>   eax = 00000000

  ;Here we are in UnReal Mode since the previous INIT:END:
  ;Here we are in UnReal Mode since the previous INIT:END:
  ;Here we are in UnReal Mode since the previous INIT:END:
  ;;
   xor ax,ax
   mov al,[_10h_numOfFATs]      ;Number of FATs
   mul byte[_16h_numFATsectors] ;Times size of FAT (in sectors)
   add ax,[_0Eh_reservedSects]  ;Plus Sectors before first FAT - currently just the boot sector
        ;(_10h_numOfFATs*_16h_numFATsectors)+_0Eh_reservedSects
           ;EAX = LBA of Root Directory == normally 19
   mov [_RootDirSect],ax
   mov [_ClustAreaSect],ax


   movzx edx,word[_11h_numRootDirEntries]

   shr dx,4   ;16 directory entries per sector (512/32).
              ;Dividing the number of root directory entries
              ;by 16 gives us the number of root directory
              ;sectors (normally 224/16 = 14).

 mov [_RootDirSectCount],dx
 add [_ClustAreaSect],dx

 ;;END:  Configure disk information (calculate geometry)
 ;;END:  Configure disk information (calculate geometry)
 ;;END:  Configure disk information (calculate geometry)




Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 1:29 am
by ~

10. Reading Floppy Disk Sectors Using the BIOS

The following function is used to read sectors from the floppy using INT 13h service AH=2.

It also uses INT 13h service AH=0 to recalibrate the drive when a read fails, and tries each read up to 3 times before giving up and trying to reboot the machine with INT 19h. The number of attempts is kept in the BP register, so if we want to retry more times later, we can replace the value 3 that is loaded into BP with a higher 16-bit value.

This function asks for a 32-bit LBA sector number and divides it into C, H and S values for the BIOS (the BIOS call only has 6 bits available for the CHS sector value S).

It also asks for a sector count and a Real-Mode segment address where the data will be placed, so we will need to keep and update a variable containing a segment, advancing it by 20h paragraphs (200h bytes) for every sector read, instead of using a regular offset.

The function makes an effort not to cross a track boundary in a single BIOS call: each call reads at most up to the end of the current track (18 sectors maximum in a normal floppy). A call can request fewer sectors and can start at any sector of a track; the BIOS does the heavy floppy manipulation internally.

In this boot sector, we only read 1 sector at a time for the root directory and the file data. For the FAT table, we read 2 sectors at a time as an optimization: a 12-bit FAT entry that starts at offset 511 of a sector has its second byte at the start of the next sector, and with both sectors in memory we can read it without any special case.
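
To see which FAT entries hit that offset-511 case, here is a small Python model (a sketch with a made-up function name, not part of the boot sector). It assumes 512-byte sectors and the usual FAT12 packing of 1.5 bytes per entry.

```python
def fat_entry_location(cluster, bytes_per_sector=512):
    """Return (fat_sector_index, offset_in_sector, straddles_next_sector)."""
    # A 12-bit entry takes 1.5 bytes, so entry N starts at byte N*3/2.
    byte_offset = cluster * 3 // 2
    sector_index, offset = divmod(byte_offset, bytes_per_sector)
    # The entry is read as a 2-byte word; if it starts on the last byte
    # of a sector, its second byte lives in the following sector.
    straddles = offset == bytes_per_sector - 1
    return sector_index, offset, straddles


print(fat_entry_location(341))  # (0, 511, True)
```

With 2 consecutive FAT sectors in the buffer, the word for cluster 341 can be read in one go at offset 511.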

After including and studying this function for some time, we can go on to explain the most important part of the boot sector: the code that searches for the file in the root directory, reads it and passes control to it. That file is a 16-bit Un/Real-Mode boot manager/booting kernel that will detect the machine capabilities and will later load the more complex operating system.

Code: Select all

;10. read_sectors BIOS-based function:
;;

 ;;INIT: read_sectors
 ;;INIT: read_sectors
 ;;INIT: read_sectors
  ; Input:
  ;       EAX = LBA
  ;        DI = sector count
  ;        ES = segment
  ; Output:
  ;       EBX high half cleared
  ;       DL = drive # number
  ;       EDX high half cleared
  ;       ESI = sector count of the last BIOS transfer
  ; Clobbered:
  ;       BX, CX, DH, BP
  ;;
  read_sectors:
   ;Save EAX, ES and DI:
   ;;
    push eax
    push es
    push di
    
  .sectorLoop:
   push  eax        ;LBA sector value

  ;Sign-extend EAX into EDX:EAX,
  ;so here we intend to get EDX=0:
  ;;
   cdq

  ;Perform EDX:EAX / EBX
  ;
  ;Perform LBA/Sectors Per Track. It will give us
  ;the track number in EAX and
  ;the zero-based sector value in EDX:
  ;;
   movzx ebx, byte[_18h_sectorsPerTrack]
   div   ebx        ;EAX=track; EDX=sector-1

  ;Get a copy of the sector value in CX.
  ;Now subtract it from the sectors per track
  ;to know how many sectors are left to read in this track:
  ;;
   mov   cx, dx     ;CL=sector-1; CH=0
   sub   bl, dl     ;BX=max transfer before end of track

  ;See if the number of sectors requested by the caller
  ;is greater than the number of sectors remaining in the track.
  ;If so, jump ahead and read only up to the end of the track.
  ;If not, read all of the pending sectors
  ;requested by the caller:
  ;;
   cmp   di, bx             ;Do we want more than that?
   ja short .sectorOverflow ;Yes, do this much now
   mov   bx, di             ;No, do it all now

  .sectorOverflow:
  ;Save a copy of the calculated remaining
  ;sectors-in-a-track count:
  ;;
   mov  esi, ebx    ;Save count for this transfer

  ;Convert the zero-based sector index into the 1-based CHS S value:
  ;;
   inc  cx          ;CL=sector number

  ;Here EDX is just 0 to perform a clean 32-bit division.
  ;
  ;Perform EDX:EAX divided by EBX
  ;
  ;Track_Number / Number_of_Heads. Now we will get
  ;the Cylinder number in EAX and
  ;the Head number in EDX:
  ;;
   xor  dx,  dx
   mov  bl,  [_1Ah_numHeads]
   div  ebx         ;EAX=cylinder; EDX=head



   mov  dh,  dl           ;*DH=Head CHS H value
   mov  dl,  [_00h_jmp]   ;*DL=Drive number stored at the start


  ;Get a packed value of Cylinder and Sector:
  ;
  ;Sector   S is in bits 0-5 of CL and
  ;Cylinder C is in bits 0-7 of CH and bits 6-7 of CL
  ;;
   xchg ch,  al           ;*CH=cylinder number [0:7]; AL=0
   shr  ax,  2            ;AL[6:7]=High two bits of cylinder
   or   cl,  al           ;*CL=cylinder [8:9] and sector [0:5]



   mov  ax,  si           ;*AL=Remaining sectors-in-a-track count
   mov  ah,  2            ;*Service: Read
   xor  bx,  bx           ;*ES:BX -- Destination buffer pointer

  ;Here we save AX to be able to retry, since
  ;this BIOS service destroys the original value
  ;of AX that we pass it.
  ;
  ;Parameters:
  ;     AH = 2 -- Read Sectors
  ;     AL = non-zero Sector Count
  ;     CH = Cylinder      bits 0-7 into 0-7
  ;     CL = Cylinder      bits 8-9 into 6-7
  ;          Sector Number bits 0-5
  ;
  ;     DH = Head Number
  ;     DL = Drive Number (bit 7 set for hard disks)
  ;;
   mov bp,3
   .retry:
    push ax
     int 13h
    pop ax
    jnc .OK_noerror
    .retryRecalibrate:
    push ax
     xor ax,ax
     int 13h
    pop ax
    jc .retryRecalibrate
   dec bp
   jnz short .retry

   ;If we are here, there was a boot error,
   ;so try booting again:
   ;;
    mov ax,0xE07 ;{3}
    int 10h      ;Video service. Here we will "ring" the bell.
    int 19h      ;Since this is an error we found, we reboot.


  .OK_noerror:
   pop  eax
   add  eax, esi          ;Advance LBA

  ;Here we convert the value of SI, which contained the
  ;remaining sectors-in-a-track value, to a 16-bit segment
  ;value, converting it from sector count to byte count
  ;(with the implicit 4-bit right shift for the segment value).
  ;
  ;This will allow us to advance as many 512-byte memory chunks
  ;as we just read:
  ;;
   push si
   shl  si, 5
   mov  bx, es
   add  bx, si            ;Advance segment
   mov  es, bx
   pop  si

  ;Subtract the sector count just transferred (in SI)
  ;from the number of sectors requested by the caller (in DI).
  ;If DI was greater than SI, there are still sectors
  ;pending, so loop back and read the next chunk:
  ;;
   sub  di, si
   ja   .sectorLoop

   ;Restore EAX, ES and DI:
   ;;
    pop  di
    pop  es
    pop  eax
  ret
 ;;END:  read_sectors
 ;;END:  read_sectors
 ;;END:  read_sectors
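
The divisions and the register packing done by read_sectors can be modelled in a few lines of Python. This is only a sketch to check the arithmetic: the function names are hypothetical, and the defaults assume the normal floppy geometry (18 sectors per track, 2 heads).

```python
def lba_to_chs(lba, sectors_per_track=18, num_heads=2):
    """Same two divisions as the assembly: LBA/SPT, then track/heads."""
    track, sector0 = divmod(lba, sectors_per_track)   # sector0 is zero-based
    cylinder, head = divmod(track, num_heads)
    return cylinder, head, sector0 + 1                # CHS sectors start at 1


def pack_cx(cylinder, sector):
    """Pack cylinder and sector the way INT 13h AH=2 expects them in CH/CL."""
    ch = cylinder & 0xFF                              # cylinder bits 0-7
    cl = ((cylinder >> 2) & 0xC0) | (sector & 0x3F)   # cylinder bits 8-9 in CL 6-7
    return ch, cl


def split_by_track(lba, count, segment, sectors_per_track=18):
    """List the (lba, sectors, segment) of each BIOS call for one request."""
    calls = []
    while count > 0:
        left_on_track = sectors_per_track - lba % sectors_per_track
        n = min(count, left_on_track)
        calls.append((lba, n, segment))
        lba += n
        segment += n * 32     # 512 bytes per sector = 32 paragraphs of 16 bytes
        count -= n
    return calls


print(lba_to_chs(19))   # (0, 1, 2)
print(lba_to_chs(33))   # (0, 1, 16)
```

For example, a 3-sector request starting at LBA 17 with ES=7E0h would be split into one call for 1 sector at 7E0h and a second call for 2 sectors at 800h.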



Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 1:37 am
by Octocontrabass
I wanted to go through your code earlier and point out issues, but I didn't have time. Now I do.
~ wrote:

Code: Select all

   xor eax,eax   ;Get value of Code Segment
   push cs       ;Get value of Code Segment
   pop ax        ;Get value of Code Segment
   mov ds,ax     ;Make DS=CS
Clearing EAX has absolutely nothing to do with getting the value of CS. Do you even use the high bits of EAX anywhere?

Why bother moving the value to AX at all? You can just push CS and pop DS.
~ wrote:We will use the area 80000h-9FBFFh to hold the stack.
You must ask the BIOS if memory is available before trying to use it. The EBDA may be larger than you think, and some (mostly very old) computers don't put RAM at those addresses at all.
~ wrote:

Code: Select all

  mov al,9             ;Segment 0x90000.
  shl ax,12            ;Segment 0x90000.
  mov ss,ax            ;Segment 0x90000.
If a value is a constant, let the assembler calculate it for you. For example, "mov ax, 9 << 12".
~ wrote:Enabling Line A20 (ISA - PS/2 Keyboard Controller Method)
Before you try to enable A20, you should verify that it's not already enabled.

If A20 is already enabled, there may not be a PS/2 keyboard controller. Attempting to access nonexistent hardware may cause bad things to happen.
~ wrote:Once we have Unreal Mode set up along with the A20 line enabled and a good stack, we simply reenable interrupts to continue with the actual loading of a bootup program:
Interrupts may return in real mode instead of unreal mode. Interrupts may happen at any time. This problem can be solved by using the GPF handler to switch to unreal mode, since accesses outside the 64k segment limit will cause a GPF.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 2:20 am
by ~

With the optimizations below we got 12 free bytes and previously we had just 8 free bytes.
I will see if I can use those for something important like implementing a safer enabling of the A20 line (by using the BIOS functions).

I could also get rid of the Code Selector from the GDT, but I want to keep it in the boot sector for now as a good study reference of a minimal GDT, even if it's never used in the boot sector.

Octocontrabass wrote:I wanted to go through your code earlier and point out issues, but I didn't have time. Now I do.
~ wrote:

Code: Select all

   xor eax,eax   ;Get value of Code Segment
   push cs       ;Get value of Code Segment
   pop ax        ;Get value of Code Segment
   mov ds,ax     ;Make DS=CS
Clearing EAX has absolutely nothing to do with getting the value of CS. Do you even use the high bits of EAX anywhere?
EAX=0 is there to guarantee clean divisions: the code that derives C, H and S from the LBA address performs several divisions/multiplications of EDX:EAX by another 32-bit value. I have had subtle register-state problems in the past that prevented booting, and I'm afraid that EAX won't be clean in some case, so I'd rather make sure.

At least Bochs boots without the "xor eax,eax", but I'd rather test the code much more before removing that instruction.

Octocontrabass wrote:Why bother moving the value to AX at all? You can just push CS and pop DS.
That one was indeed an inefficiency. Fixed, it looks like this:

Code: Select all

jmp BASEADDR_SEG:start
start:
xor eax,eax
cli
push cs
pop ds


Octocontrabass wrote:
~ wrote:We will use the area 80000h-9FBFFh to hold the stack.
You must ask the BIOS if memory is available before trying to use it. The EBDA may be larger than you think, and some (mostly very old) computers don't put RAM at those addresses at all.
~ wrote:

Code: Select all

  mov al,9             ;Segment 0x90000.
  shl ax,12            ;Segment 0x90000.
  mov ss,ax            ;Segment 0x90000.
If a value is a constant, let the assembler calculate it for you. For example, "mov ax, 9 << 12".
The shift instruction is indeed an inefficiency. I tried to save bytes but I ended up using 5 bytes instead of 3 with the simple "mov ax,9000h", so I fixed it.

Code: Select all

  mov ax,9000h         ;Segment 0x90000.
  mov ss,ax            ;Segment 0x90000.
  mov sp,0xFC00        ;{1B}. Configure the end of the stack at 0xFBFC
                       ;the very end of the free memory area
                       ;from 80000h-9FBFFh

According to the Wiki that area from 80000h-9FBFFh is always unconditionally guaranteed free for use at this point in time, and I'm working based on the oldest 32-bit machine I have, which is an AMD 386SX with a minimum of 1 Megabyte of RAM, so this should be safe for the boot stack, just like loading at 0x500, 0x600 or 0x700. I probably won't have much space to find out free areas in the boot code so I probably should load the stack at 000:700h if there are really machines that have something at the other, otherwise supposedly free area.

http://wiki.osdev.org/Memory_Map#Overview



Octocontrabass wrote:
~ wrote:Enabling Line A20 (ISA - PS/2 Keyboard Controller Method)
Before you try to enable A20, you should verify that it's not already enabled.

If A20 is already enabled, there may not be a PS/2 keyboard controller. Attempting to access nonexistent hardware may cause bad things to happen.
I probably will have to use a BIOS service to enable the A20 after making sure that it exists. Using that BIOS service could probably be the safest way to enable the A20 line at this point in time, so I will try to update the code. At least it works well in Bochs by now, but it could be done better in that way.

However, I have tried to use the BIOS service to enable the A20 line in my 386SX but it seems to have no way to test if it is present (it isn't but it gets unstable and crashes if I try to test its presence reliably, and the interrupt vector seems to contain a seemingly valid pointer but crashes when used).


Octocontrabass wrote:
~ wrote:Once we have Unreal Mode set up along with the A20 line enabled and a good stack, we simply reenable interrupts to continue with the actual loading of a bootup program:
Interrupts may return in real mode instead of unreal mode. Interrupts may happen at any time. This problem can be solved by using the GPF handler to switch to unreal mode, since accesses outside the 64k segment limit will cause a GPF.
It shouldn't matter because we are loading everything within the first Megabyte anyway, but it would be better to have Unreal Mode working from the start, just like the A20 line.

The interrupts are disabled before calling anything once we load the boot sector, and stay disabled while we load the GDT, enable the A20 line, configure the stack and return to Un/Real Mode. It probably isn't so serious in the boot code. I have used boot sectors like this many times across several machines and it has never been the source of a failed boot (most of the failed boot cases are due to a bad floppy where it's not possible to find or to fully load the binary).

Currently it works well and fast at least in Bochs.


Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 3:53 am
by ~

7. Searching for the Bootup Program by its 8.3 File Name

We will search for the file named "BOOTKERN.BIN". Its name fills the whole 11 bytes of the 8.3 file name field (8 bytes for the name and 3 for the extension; the dot is not stored).

Code: Select all

;11. 8.3 bootup file name:
;;
 kernelFile db "BOOTKERNBIN"
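
For other file names, the 11-byte string has to be padded with spaces. A quick Python sketch of that conversion (the helper name is made up, and it only handles plain ASCII names with at most one dot):

```python
def to_83_name(filename):
    """Convert 'NAME.EXT' into the 11-byte field stored in a directory entry."""
    base, _, ext = filename.upper().partition(".")
    if not base or len(base) > 8 or len(ext) > 3:
        raise ValueError("not a valid 8.3 name: %r" % filename)
    # Name padded to 8 bytes, extension padded to 3, no dot in between.
    return (base.ljust(8) + ext.ljust(3)).encode("ascii")


print(to_83_name("BOOTKERN.BIN"))  # b'BOOTKERNBIN'
print(to_83_name("kernel.sys"))    # b'KERNEL  SYS'
```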



This is where the more interesting tricks begin from the point of view of booting a standalone binary (tricks to read the least amount of data at a time into memory).

The code demonstrates how we can test whether a file/directory entry is actually a file. To do this we just need to get the attribute byte at offset 11 within the 32-byte entry and look at bit 4 (0x10). If it's 0, the entry corresponds to a file. If it's 1, then it's a directory, and we skip to the next entry.

We also need to make sure that bit 3 (0x08) is 0. If that's the case, it means that the directory entry is NOT a volume ID (I still remember when I tried to create several volume entries with 1DIR on an 8088 running from a 20-Megabyte hard disk).

So when we AND byte 11 of the 32-byte directory entry with the value 0x18, we must get 0.
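
The same test written as a Python sketch, where entry is the 32 raw bytes of one directory entry. The constant and function names are only illustrative.

```python
ATTR_VOLUME_ID = 0x08   # bit 3 of the attribute byte
ATTR_DIRECTORY = 0x10   # bit 4 of the attribute byte


def is_regular_file(entry):
    """True if neither the volume ID bit nor the directory bit is set."""
    attributes = entry[11]
    return (attributes & (ATTR_VOLUME_ID | ATTR_DIRECTORY)) == 0
```

A long file name entry has attribute 0x0F, which includes bit 3, so this test rejects it as well.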

We will read a single root directory sector at address 7E0h:0000h, and that buffer will get reused when we read the FAT table 2 sectors at a time (which is NOT present in the block of code below).

Here we already know the location of the root directory, which normally is at LBA sector 19.

We will read 1 sector of the root directory at a time. This saves space and makes fewer reads, wearing the floppy much less and loading considerably faster.

We will compare the string against the first 11 bytes of every 32-byte entry. If we don't find it, we load one more root directory sector, until there are no more of them. If we find the file name, we break the loop, keep the starting cluster number and start loading the file's contents into memory in the next block of code.

Each file/directory entry is 32 bytes in size. The first 11 bytes contain the 8.3 file name.

Within the 32-byte entry we can find the starting cluster number at offset 26, relative to the start of the entry. This cluster number is a plain 16-bit value, so we don't need to AND or right-shift it like the rest of the cluster values that we will extract from the FAT table. It is normally 2 for the first file written to a freshly-formatted floppy.
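
Putting the pieces together, the search over one 512-byte root directory sector can be modelled like this in Python. It is a sketch of the logic only; find_start_cluster and make_entry are hypothetical helpers, and make_entry exists just to build test data.

```python
import struct


def find_start_cluster(sector, name83):
    """Scan the 32-byte entries of one sector; return start cluster or None."""
    for offset in range(0, len(sector), 32):
        entry = sector[offset:offset + 32]
        if entry[11] & 0x18:          # directory or volume ID: skip it
            continue
        if entry[:11] == name83:      # 11-byte 8.3 name matches
            # Little-endian 16-bit starting cluster at offset 26.
            return struct.unpack_from("<H", entry, 26)[0]
    return None


def make_entry(name83, attributes, start_cluster):
    """Build a minimal 32-byte directory entry for experiments."""
    entry = bytearray(32)
    entry[:11] = name83
    entry[11] = attributes
    struct.pack_into("<H", entry, 26, start_cluster)
    return bytes(entry)


sector = (make_entry(b"MYDISK     ", 0x08, 0) +    # volume label, skipped
          make_entry(b"BOOTKERNBIN", 0x20, 2) +    # the file we want
          bytes(32 * 14))                          # unused entries
print(find_start_cluster(sector, b"BOOTKERNBIN"))  # 2
```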

We will unconditionally save the starting cluster number and will only make use of it if the entry really corresponds to the "BOOTKERN.BIN" file.

But here what we are doing is trying to find that file, and in return simply store the starting cluster number.

Code: Select all

;7. Search 8.3 file name in root directory:
;;

 ;;INIT: Search file name in the root directory (read root directory sectors)
 ;;INIT: Search file name in the root directory (read root directory sectors)
 ;;INIT: Search file name in the root directory (read root directory sectors)

  ;>  EAX  = LBA of root directory
  ;>  EDI  = length of root directory in sectors
  ;> [SP]  = length of root directory in entries
  ;>  ESI  = 00000000

 ;Read current root dir sector at 0x7E0:
 ;;

.RootDirSectLoop:
  push word 0x7E0
  pop es

  xor di,di
  inc di

  ; Input:
  ;       EAX = LBA
  ;        DI = sector count
  ;        ES = segment
  ; Output:
  ;       EBX high half cleared
  ;       DL = drive # number
  ;       EDX high half cleared
  ;       ESI = sector count of the last BIOS transfer
  ; Clobbered:
  ;       BX, CX, DH, BP
  ;;
   call read_sectors   ;CALL to AHEAD address


  ;Here we are still in UnReal Mode since the past 2 INIT--END blocks:
  ;Here we are still in UnReal Mode since the past 2 INIT--END blocks:
  ;Here we are still in UnReal Mode since the past 2 INIT--END blocks:
  ;;
   mov bx,16   ;16 Dir Entries Per Root Dir Sector, 32 bytes each
   

   xor edi,edi            ;Point at directory buffer {1C}
   xor esi,esi

  .20:
   test byte[es:di+11],0x18  ;See if entry is a file and not
   jnz short .skipDirent     ;a volume ID. If not, skip it.


   mov si,kernelFile    ;Name of file we want. We take the 16-bit
                        ;address of the string.
   xor ecx,ecx
   mov cl,11            ;Number of bytes to read (length of the string
                        ;"BOOTKERNBIN")
   push bx
   mov bx,di  ;Go to current Root Directory Sector Offset
   add bx,26  ;Go to its FAT Cluster Number
   mov bx,[es:bx]  ;Read raw Cluster Number
   mov [_CurrFileClust],bx  ;Store it for file read
   pop bx
   push edi
   a32 repe cmpsb       ;Found the file? Here is where we use
                        ;the 11 in CL
   pop edi



   je short found       ;Yes? This is the key to start loading
                        ;     our program into memory.
   .skipDirent:
   add di,32
   dec bx               ;Loop through all entries

   jnz short .20        ;According to the result of a bit in FLAGS, produced by
                        ;*dec bx*. Concretely the bit that indicates that the
                        ;operation has set the value of the operand down to 0.

inc ax  ;Go to the next Root Directory Sector LBA
dec word[_RootDirSectCount]
jnz short .RootDirSectLoop


  ;Couldn't find the file in directory, so
  ;reboot the machine to retry loading a system:
  ;;
   boot_error:
   int 19h


   found:
;NOTE: MAKE SURE THAT THE KERNEL FINDING CODE FOR THE
;ROOT DIRECTORY ALWAYS WORKS (COPY LOTS OF FILES, DELETE THEM
;AND SEE IF IT STILL IS FOUND). ... OK

 ;;END:  Search file name in the root directory (read root directory sectors)
 ;;END:  Search file name in the root directory (read root directory sectors)
 ;;END:  Search file name in the root directory (read root directory sectors)



Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 6:30 am
by Octocontrabass
~ wrote:According to the Wiki that area from 80000h-9FBFFh is always unconditionally guaranteed free for use at this point in time, and I'm working based on the oldest 32-bit machine I have, which is an AMD 386SX with a minimum of 1 Megabyte of RAM, so this should be safe for the boot stack, just like loading at 0x500, 0x600 or 0x700.
That area is usually free for use if it exists. Sometimes it's not free for use (the EBDA is larger than 1kB) and sometimes it doesn't exist (some motherboards do not put RAM in that range).

Since you mentioned a 386: the oldest 386-based PC can be configured with 256kB of RAM.
~ wrote:I probably won't have much space to find out free areas in the boot code so I probably should load the stack at 000:700h if there are really machines that have something at the other, otherwise supposedly free area.
My bootloader performs tasks like enabling A20 and switching to protected mode in a second stage portion, loaded from the filesystem. Some of the space gained from making that move is used to detect free memory below 1MB using interrupt 0x12.
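
As a rough illustration of the arithmetic involved (in Python, not boot sector code): INT 12h returns the amount of contiguous conventional memory in KB in AX, and 1 KB is 64 paragraphs of 16 bytes. The function name and the 4 KB stack size below are assumptions made only for this sketch.

```python
def stack_below_memory_top(conventional_kb, stack_kb=4):
    """Return (SS, SP) for a stack that ends at the top of reported memory."""
    top_segment = conventional_kb * 64        # 1 KB = 64 paragraphs
    ss = top_segment - stack_kb * 64          # stack starts stack_kb below the top
    sp = stack_kb * 1024                      # SP starts at the end of the stack
    return ss, sp


ss, sp = stack_below_memory_top(639)          # a BIOS reporting 639 KB
print(hex(ss), hex(sp))                       # 0x9ec0 0x1000
```

On a machine that reports only 256 KB, the same calculation gives SS=3F00h, well below the 80000h-9FBFFh range.
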
~ wrote:I probably will have to use a BIOS service to enable the A20 after making sure that it exists. Using that BIOS service could probably be the safest way to enable the A20 line at this point in time, so I will try to update the code. At least it works well in Bochs by now, but it could be done better in that way.
Unfortunately, the BIOS service is not very reliable. It's not present on all PCs, and on some PCs it returns successfully but doesn't actually do anything.
~ wrote:However, I have tried to use the BIOS service to enable the A20 line in my 386SX but it seems to have no way to test if it is present (it isn't but it gets unstable and crashes if I try to test its presence reliably, and the interrupt vector seems to contain a seemingly valid pointer but crashes when used).
Can you dump a copy of the BIOS? I'd like to examine the code to see what it's doing.
~ wrote:It shouldn't matter because we are loading everything within the first Megabyte anyway, but it would be better to have Unreal Mode working from the start, just like the A20 line.
It would be nice, but I don't think there's enough room in a floppy disk boot sector.
~ wrote:The code demonstrates how we can test whether a file/directory entry is actually a file. To do this we just need to get the byte at offset 11 within the 32-byte entry itself and see if bit 5 is set to 1 (0x20). If that's the case, then it means that the entry corresponds to an archive, or in other words, to a file.
No. The archive bit is used to track file modification for backups. It has nothing to do with whether a directory entry is a file or a subdirectory.

The subdirectory bit is bit 4 (0x10) and is set to indicate a subdirectory, or clear to indicate a file.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 8:50 am
by Combuster
~ wrote:According to the Wiki that area from 80000h-9FBFFh is always unconditionally guaranteed free for use at this point in time, and I'm working based on the oldest 32-bit machine I have, which is an AMD 386SX with a minimum of 1 Megabyte of RAM, so this should be safe for the boot stack, just like loading at 0x500, 0x600 or 0x700. I probably won't have much space to find out free areas in the boot code so I probably should load the stack at 000:700h if there are really machines that have something at the other, otherwise supposedly free area.

http://wiki.osdev.org/Memory_Map#Overview
Actually that is not what the wiki states. In addition, larger EBDAs are actually found on newer machines where memory is cheap, rather than old ones where it was still expensive, so that voids your test as well.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 6:23 pm
by ~
Octocontrabass wrote:
~ wrote:According to the Wiki that area from 80000h-9FBFFh is always unconditionally guaranteed free for use at this point in time, and I'm working based on the oldest 32-bit machine I have, which is an AMD 386SX with a minimum of 1 Megabyte of RAM, so this should be safe for the boot stack, just like loading at 0x500, 0x600 or 0x700.
That area is usually free for use if it exists. Sometimes it's not free for use (the EBDA is larger than 1kB) and sometimes it doesn't exist (some motherboards do not put RAM in that range).

Since you mentioned a 386: the oldest 386-based PC can be configured with 256kB of RAM.
Then it's probably easier to put the booting stack at 0x700, right before the loaded image so it has a 512-byte stack. Being a boot sector, the stack could be made more compatible with the structure that DOS uses, and then load the system on top of it all with another protected structure.

Octocontrabass wrote:
~ wrote:I probably won't have much space to find out free areas in the boot code so I probably should load the stack at 000:700h if there are really machines that have something at the other, otherwise supposedly free area.
My bootloader performs tasks like enabling A20 and switching to protected mode in a second stage portion, loaded from the filesystem. Some of the space gained from making that move is used to detect free memory below 1MB using interrupt 0x12.
The image I will load with this boot code will still work below 1 Megabyte, so I will have several Kilobytes to try to test and enable stuff, and maybe implement a simple DOS-like shell for this bootloader.

Octocontrabass wrote:
~ wrote:I probably will have to use a BIOS service to enable the A20 after making sure that it exists. Using that BIOS service could probably be the safest way to enable the A20 line at this point in time, so I will try to update the code. At least it works well in Bochs by now, but it could be done better in that way.
Unfortunately, the BIOS service is not very reliable. It's not present on all PCs, and on some PCs it returns successfully but doesn't actually do anything.
It looks like the most stable approach is to test whether the A20 line is already enabled before proceeding with a potentially destabilizing method, to use the KBC, and to keep the boot code open source with a lot of practical step-by-step documentation, so that any necessary modification of the method is feasible. So that's what I've used here. I will try to make sure again in the bootloader itself that the A20 line is enabled, and I will add the usage of the Fast A20 and the BIOS INT 15h services AX=2400h, 2401h, 2402h and 2403h if I start getting actual booting problems.

I added code to see if the A20 line is already enabled. I will also use only the KBC method to enable it, since it has always worked in all of my machines and emulators. The good thing is that it is open source and will get better-documented.

Code: Select all

 ;To see if the A20 line is already enabled,
 ;look at address (7C00h+510) from 2 different
 ;addresses, one in the first Megabyte and the second
 ;in the second Megabyte.
 ;
 ;Compare the 16-bit word from 0000:7DFEh (7DFEh physical)
 ;and from FFFFh:7E0Eh (107DFEh physical).
 ;
 ;They must be different. If they aren't, then the A20 line
 ;is disabled and we will have to enable it:
 ;;
 mov ax,[510]
 push word 0xFFFF
 pop es
 cmp word[es:7E0Eh],ax
 jne short .A20alreadyEnabled

  ;Try to enable A20 line here using KBC, Fast A20 or BIOS....
  ;;

 .A20alreadyEnabled:
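
The reason both addresses can end up at the same byte is the 1-Megabyte wraparound. A tiny Python model of the address calculation (the function name is made up for illustration):

```python
def linear_address(segment, offset, a20_enabled):
    """Real-mode segment:offset to physical address, with optional A20 wrap."""
    address = (segment << 4) + offset
    if not a20_enabled:
        address &= 0xFFFFF      # bit 20 is forced to 0, so the address wraps
    return address


print(hex(linear_address(0xFFFF, 0x7E0E, True)))   # 0x107dfe
print(hex(linear_address(0xFFFF, 0x7E0E, False)))  # 0x7dfe
```

Note that two equal words could also be a coincidence with A20 enabled; in that case the code above would just try to enable A20 again.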


If testing whether the A20 line is already enabled (it should be in the newest machines, supposedly without KBC, ISA, PS/2 or Super I/O) plus the KBC method ever proves to be unreliable or not enough, I will simply move the code to the actual bootloader program and do what the Wiki says.

The BIOS INT 15h services AX=2400h, 2401h, 2402h and 2403h could be unreliable.

Octocontrabass wrote:
~ wrote:However, I have tried to use the BIOS service to enable the A20 line in my 386SX but it seems to have no way to test if it is present (it isn't but it gets unstable and crashes if I try to test its presence reliably, and the interrupt vector seems to contain a seemingly valid pointer but crashes when used).
Can you dump a copy of the BIOS? I'd like to examine the code to see what it's doing.
I will try to make it turn on since it suffered a battery leak decades ago and it has kept corroding. It has become increasingly difficult to make it work, and I probably need to apply soldering to the contacts that look bad, replace one or two capacitors and resistors, make some connections with wires for interrupted lines and get rid of any loose contact in the BIOS socket and other parts. But it's a BIOS and machine that looks like this (it's a fully ISA motherboard):

[3 images]


Octocontrabass wrote:
~ wrote:It shouldn't matter because we are loading everything within the first Megabyte anyway, but it would be better to have Unreal Mode working from the start, just like the A20 line.
It would be nice, but I don't think there's enough room in a floppy disk boot sector.
There's enough space to try the basics. Currently I just need to write the part of the code that actually reads the intended file to run.


Octocontrabass wrote:
~ wrote:The code demonstrates how we can test whether a file/directory entry is actually a file. To do this we just need to get the byte at offset 11 within the 32-byte entry itself and see if bit 5 is set to 1 (0x20). If that's the case, then it means that the entry corresponds to an archive, or in other words, to a file.
No. The archive bit is used to track file modification for backups. It has nothing to do with whether a directory entry is a file or a subdirectory.

The subdirectory bit is bit 4 (0x10) and is set to indicate a subdirectory, or clear to indicate a file.
I already corrected it. I see how using that bit makes the filesystem imply that if something is not a directory then it's a file unless it's a volume entry.


Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 7:50 pm
by ~

After this part of the boot sector explanation, we need to put it all together, assemble and install it into a floppy.

Then, we need to create a simple test kernel with base address 0 and with 16-bit Un/Real Mode code, to see if booting it really works as it should on all machines.


8. Reading the File

For FAT, we have a file/directory entry that contains a cluster number. For a new FAT12 filesystem floppy, that cluster will normally be 2.

So FAT dictates that we first read the file contents at Cluster 2, and then inspect the entry that corresponds to Cluster 2 in the FAT table.

From the FAT table we get a 16-bit value that contains the 12-bit entry we want plus 4 extra bits that belong to the neighboring entry. Once we clean it up (masking off the highest 4 bits for an even cluster number, or right-shifting out the lowest 4 bits for an odd one), we have the next cluster number. We then repeat the process: read the contents of that cluster, then inspect its entry in the FAT table, following the file's cluster chain until an end-of-chain value (0xFF8 or higher) appears.
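
As a standalone illustration of the clean-up step, here is a minimal sketch that assumes the whole FAT is already in memory at DS:BX. The routine name fat12_next_cluster and its register convention are hypothetical; the boot sector code below reads the FAT two sectors at a time instead.

```nasm
bits 16

; Hypothetical helper, not part of the boot sector.
; In:       AX    = current cluster number
;           DS:BX = start of the FAT in memory (assumption: fully loaded)
; Out:      AX    = next cluster number (12-bit value)
; Clobbers: SI
;
; Worked example with the FAT bytes F0 FF FF 03 40 00:
;   cluster 2 -> offset 3, word 0x4003, even -> 0x4003 AND 0x0FFF = 0x003
;   cluster 3 -> offset 4, word 0x0040, odd  -> 0x0040 SHR 4      = 0x004
fat12_next_cluster:
    mov si, ax
    shr si, 1           ; SI = cluster / 2
    add si, ax          ; SI = cluster * 1.5 = byte offset of the entry
    test al, 1          ; odd or even cluster? (MOV below keeps the flags)
    mov ax, [bx+si]     ; fetch the 16 bits that contain the 12-bit entry
    jz .even
    shr ax, 4           ; odd: the entry is the upper 12 bits
    ret
.even:
    and ax, 0x0FFF      ; even: the entry is the lower 12 bits
    ret
```

The caller would compare the returned AX against 0xFF8 to detect the end of the chain, as the loop below does.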

Code: Select all

;8. Read file contents:
;;

 ;;INIT: Final processes and pass control to the kernel
 ;;INIT: Final processes and pass control to the kernel
 ;;INIT: Final processes and pass control to the kernel

.FileReadLoop:


  ;>>    ECX = 0000????
  ;>    [SP] = Next cluster of file
  ;>     ESI = 0000????
  ;>     EDX = 0000????
 ;>  ES:EDI = Destination address
  ;>     EBP = LBA of cluster 2
  ;>      DS = 0
  ;;
   xor ax,ax
   mov ax,[_CurrFileClust]  ;Get the most recent cluster number
   cmp ax,0xFF8        ;Valid cluster?
   jae short eof       ;No:  assume end of file
                       ;Yes: (c-bit set)



  ;Read file sector here
  ;;
   movzx edi,byte[_0Dh_sectsPerClust]  ;File data sectors to read
   push es
   push word[_FileBuffSegment]
   pop es


 ;Convert cluster number (which starts from 2) into LBA sector.
 ;The cluster area normally starts at the 34th sector (sector 33
 ;counting from 0). Cluster numbers start from 2, so to get their LBA address
 ;we must subtract 2 from the cluster number before adding the first
 ;sector of the cluster area. The minimum LBA we can get for a cluster
 ;is therefore 33.
 ;
 ;Cluster numbers start at 2 because FAT entries 0 and 1 are reserved
 ;(entry 0 holds the media descriptor and entry 1 an end-of-chain marker),
 ;so neither of them maps to a cluster in the data area.
 ;
 ;NOTE: this conversion assumes 1 sector per cluster, as on a standard
 ;1.44 MB floppy; otherwise (cluster-2) would have to be multiplied by
 ;the number of sectors per cluster.
 ;;
  add ax,[_ClustAreaSect]  ;Get first sector of cluster area
  dec ax
  dec ax  ;Remove the artifact of starting cluster numbers from 2


  ; Input:
  ;       EAX = LBA
  ;        DI = sector count
  ;        ES = segment
  ; Output:
  ;       EBX high half cleared
  ;       DL = drive # number
  ;       EDX high half cleared
  ;       ESI = 0
  ; Clobbered:
  ;       BX, CX, DH, BP
  ;;
   call read_sectors             ;CALL to AHEAD address



   mov cx,di  ;Copy sectors per cluster (assumes read_sectors preserved DI)
   dec cx  ;Shift count = sectors per cluster - 1 (only exact for 1 or 2)
   mov di,[_0Bh_bytesPerSect]  ;Get bytes per sector
   shl di,cl  ;Get total bytes in cluster
   shr di,4   ;Convert byte count to Real Mode paragraphs (16-byte units)
   add [_FileBuffSegment],di   ;Advance the base copy segment
   pop es

 ;Start getting next cluster:
   mov ax,[_CurrFileClust]
    push ax  ;Save original cluster
  mov si,ax
   shr si,1    ;Divide SI by 2. Now we have 0.5 of its cluster value
   add si,ax   ;Add original cluster value, now we have 1.5 of its value

  push si  ;Save multiplied value

 ;Read 2 FAT sectors:
     mov ax,si  ;Get multiplied cluster for actual sector offset
    shr ax,9  ;Get the sector number, shift divide by 512
   add ax,[_0Eh_reservedSects]  ;Get to first actual FAT sector in LBA
    xor di,di
    inc di
    inc di   ;Specify 2 sectors to read, in the correct FAT sector, load at
             ;segment 0x7E0

  ; Input:
  ;       EAX = LBA
  ;        DI = sector count
  ;        ES = segment
  ; Output:
  ;       EBX high half cleared
  ;       DL = drive # number
  ;       EDX high half cleared
  ;       ESI = 0
  ; Clobbered:
  ;       BX, CX, DH, BP
  ;;
   call read_sectors             ;CALL to AHEAD address

 ;Get 2 raw cluster bytes:
  pop si  ;Get multiplied cluster
  pop ax  ;Get original cluster
  and si,111111111b ;Limit SI buffer offset to first 512 bytes
   mov si,[es:si]  ;Access 1024-byte buffer with limitation above
   test ax,1      ;See if original cluster number is even or odd
   jz .evenClustNum
    shr si,4       ;If odd, keep the upper 12 bits (discard the lowest 4 bits)
    jmp short .DoneAdjustClustNum
   .evenClustNum:
    and si,0x0FFF  ;If even, just keep the lower/first 12 bits
   .DoneAdjustClustNum:


 ;Save the new cleaned-up cluster value:
 ;;
  mov [_CurrFileClust],si

  jmp short .FileReadLoop
  eof:


 ;;END:  Final processes and pass control to the kernel
 ;;END:  Final processes and pass control to the kernel
 ;;END:  Final processes and pass control to the kernel
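
The cluster-to-LBA step in the loop above only adds and subtracts, which is enough for floppies with 1 sector per cluster. For other cluster sizes, a more general conversion could look like the following sketch. The routine name cluster_to_lba and its register inputs are hypothetical and chosen only to keep the example self-contained; it is not sized to fit the remaining space in this boot sector.

```nasm
bits 16

; Hypothetical helper, not part of the boot sector.
; In:       AX = cluster number (2 or higher)
;           BX = first LBA sector of the cluster area (33 on a 1.44 MB floppy)
;           CL = sectors per cluster
; Out:      DX:AX = LBA of the first sector of that cluster
; Clobbers: CH
;
; Formula: LBA = cluster_area_start + (cluster - 2) * sectors_per_cluster
; Example: cluster 5, BX = 33, CL = 1  ->  33 + 3 * 1 = 36
cluster_to_lba:
    dec ax
    dec ax              ; AX = cluster - 2 (clusters are numbered from 2)
    xor ch, ch          ; CX = sectors per cluster, zero-extended
    mul cx              ; DX:AX = (cluster - 2) * sectors per cluster
    add ax, bx          ; add the start of the cluster area
    adc dx, 0           ; carry into the high word
    ret
```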



___________________________________________________
___________________________________________________
___________________________________________________
9. Start the Bootloader with a Simple Far Jump

We will still be running in Un/Real Mode after jumping to our "kernel". We loaded it at physical address 0x700, but we want to use offsets with a base address of 0, so the segment must contain the value 0x70 (0x70 * 16 = 0x700) and the offset must be 0.

Code: Select all

;9. Jump to the 16-bit Real Mode bootup image
;   (it's intended to jump to 70h:0000h or 700h physical):
;;

 ;Now jump to the kernel image we loaded into
 ;address 0x500, 0x600 or 0x700 physical (just like DOS):
 ;;
  jmp _kern16seg:0



___________________________________________________
___________________________________________________
___________________________________________________
___________________________________________________
13. Program Variables

Most of the program variables are defined as fixed addresses (equ constants) and not as an actual reserved set of bytes inside the boot sector.

The exception is _FileBuffSegment, because it needs to start with the value 0x70, and storing that initial value in the image is more efficient in size than setting it at run time. The other variables are filled in at run time, so keeping them outside of the boot sector saves space.

Code: Select all

;13. Program variables here:
;;

;All of the variables below are 16-bit.
;
;They are outside of the boot code to make
;more room and only _FileBuffSegment is better off
;if we define and initialize it:
;;
 _kern16seg         equ 0x70
 _FileBuffSegment   dw _kern16seg

 _RootDirSect       equ 8200h+0
 _RootDirSectCount  equ 8200h+2
 _CurrFileClust     equ 8200h+4
 _ClustAreaSect     equ 8200h+8



Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 10:42 pm
by ~

Writing a Test 16-Bit Un/Real Mode Kernel Image

Now what is left is creating a test image to run. We will fill it with NOPs to see if we can read as many sectors as we want. Also, if you want, you can write lots of files to your floppy image until it gets fragmented, to see if the boot sector can still load the system properly (it should, no matter where and how the file is stored on the disk).

This code simply points to the start of the 80x25 text screen and continuously increments the character shown in the top-left corner to prove that it was actually loaded (the loop starts around 16384 bytes from the start of the binary file):

bootkern.asm

Code: Select all

bits 16
org 0h
START:


times 16383 db 0x90



mov ax,0xB800
mov ds,ax
xor si,si

.l:
inc byte[si]
; hlt
jmp .l




db "BootKernSignature"



_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
Assembling and Installing the Boot and the Kernel on the Floppy

Assemble the Un/Real Mode 16-bit boot kernel with:
nasm bootkern.asm -o bootkern.bin


Also assemble the boot sector with:
nasm BootKern__FAT12_BootSect16.asm -o BootKern__FAT12_BootSect16.bin


Copy the boot kernel to the floppy disk, and write the boot sector with rawrite (Windows) or dd (Linux).


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
Running the Test

The only thing needed is to download the test bootable floppy image, mount it (with VFDWin, OSFMount, or with the capabilities of an emulator like Bochs or VirtualBox) and run it. It's preferable to mount the floppy image in a way that lets you copy files and write the boot sector as if it were a real hardware floppy disk.

The final result, the bootable floppy image, can be downloaded from:
0000__LowEST_Kern__LEVEL_1__StartTest16.img


The full source code and tools to assemble this boot sector test, create the floppy image, write the boot sector and mount the floppy image can be downloaded from:
0000__LowEST_Kern__LEVEL_1__StartTest16.zip


It really shouldn't crash any computer or emulator.


The first things to implement within the loaded boot image should be paging structures and randomness (or loading contiguous free pages while skipping reserved memory areas and already-used pages), to test whether the paging subsystem can really handle fragmented memory and relate virtual and physical addresses, but that will have to be explained in its own tutorial.


Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 10:50 pm
by b.zaar
How many bytes did your boot sector assemble down to? I can't download the source from your archive.org link.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Fri May 20, 2016 10:52 pm
by ~
b.zaar wrote:How many bytes did your boot sector assemble down to? I can't download the source from your archive.org link.
Only 1 free byte out of 512. I will append the source above.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Sat May 21, 2016 2:23 am
by Octocontrabass
~ wrote:I added code to see if the A20 line is already enabled.
Even if A20 is already enabled, it's possible for both locations to contain the same value. You must change one and check if they both changed in order to see if A20 is enabled.
~ wrote:I will try to make it turn on since it suffered a battery leak decades ago and it has kept corroding.
If the motherboard is that bad, you could take the BIOS chip out of its socket and use an EEPROM programmer or something to read its contents. If you don't have access to any tools to read the ROM, you could send the chip to me, but you'll have to package it appropriately to protect it from damage during shipping.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Sat May 21, 2016 3:14 am
by ~
Octocontrabass wrote:
~ wrote:I added code to see if the A20 line is already enabled.
Even if A20 is already enabled, it's possible for both locations to contain the same value. You must change one and check if they both changed in order to see if A20 is enabled.
Doing more tests than that would probably start adding too much bloat unless some machines are found to behave that way. Until then, at least it's all public domain and open source, so it's nothing to worry about unless it proves to be a usual case capable of inducing actual failures.

It would probably be extremely difficult to find a machine like that, with A20 enabled and still containing precisely 0xAA55 at precisely those two locations, and such bogus case would only have the effect of attempting to enable the A20 line again. That would hardly be a frequent or destructive combination.

If that happens, reading or writing only a few bytes would no longer be a robust test and a better test could probably be comparing two whole contiguous Megabytes in an even-odd Megabyte pair. If it is all the same, then it should be safe to say that the A20 line is disabled. We could probably use 32-bit offsets directly from Unreal Mode to test several Megabytes without writing anything beyond the normal variable writes at their original addresses in the first, even Megabyte.

We would also need to disable interrupts and avoid any stack usage during the test (we will also be testing the contents of the stack during the compare).

One thing we could do is keep the usual memory variables and repeat the full comparison several times during the execution of the bootloader (for example between major tasks). As long as the first, even Megabyte keeps containing exactly the same as the second, odd Megabyte (even while we make changes only to the first, even Megabyte), we could say that the A20 line is disabled, until the two stop containing the same.

We could attempt a different enabling method every time a Megabyte comparison determines that the even-odd Megabyte pair is accessing the same contents.

At the end we could take note of what the successful A20-enabling method was, and write a configuration file to use it as the first method to try the next time we boot.

We would need to write the Megabyte we know that is accessible and test the other one that could be currently inaccessible. In other words, when doing this test we should rely on manipulating the Megabyte that we know that will keep the changes in memory (the even) and then compare with the other unsure one (the odd).

Otherwise we could expect a poorly-implemented test. I say that it would probably be better to avoid writing memory at all, first to prevent data destruction and to have a cleaner and more generic code, and second because writing memory could also give a false positive if CPU cache is involved, or if the hardware memory controller happens to hold data temporarily when there is no RAM present and we write that.


But as I said, this whole test would only be profitable to implement after having a big system that reaches many different machines and only if it proves to give a real benefit. Before that, we do more good by keeping this method reserved in the documentation (even some unused snippets in tutorial appendix documents) and let the new developers and everyone else read only the most vital code, making sure that they will be able to report or fix any issue if they happen to run into real, non-supposed booting problems. A shorter code, along with more documentation and likely solution options, will also help them determine where the problem could be too, and learn well in the process as was the case with operating systems in past iterations/generations.

Octocontrabass wrote:
~ wrote:I will try to make it turn on since it suffered a battery leak decades ago and it has kept corroding.
If the motherboard is that bad, you could take the BIOS chip out of its socket and use an EEPROM programmer or something to read its contents. If you don't have access to any tools to read the ROM, you could send the chip to me, but you'll have to package it appropriately to protect it from damage during shipping.
I'll try to make it work again and I'll try to copy the whole first Megabyte taking note of the currently installed devices (VGA card, sound card, disk controllers...).



Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Sat May 21, 2016 5:33 pm
by Octocontrabass
~ wrote:If that happens, reading or writing only a few bytes would no longer be a robust test and a better test could probably be comparing two whole contiguous Megabytes in an even-odd Megabyte pair.
If a machine happened to boot with the same data at both locations you check, there is no test more robust than attempting to change one and seeing if the other changes too.

Performing tests outside the first two megabytes can falsely report that A20 is enabled on some motherboards. Since the A20 gate only needs to affect the first two megabytes, some motherboards implement it in a way that only affects the first two megabytes, and the rest of the physical address space behaves as if A20 is always enabled.
~ wrote:I say that it would probably be better to avoid writing memory at all, first to prevent data destruction and to have a cleaner and more generic code, and second because writing memory could also give a false positive if CPU cache is involved, or if the hardware memory controller happens to hold data temporarily when there is no RAM present and we write that.
Your bootloader knows where everything is. It's easy to avoid destroying any data.

If the cache interferes with detecting the A20 status, DOS will fail. Any PC old enough for caches to be a concern is also old enough that no one would buy a PC that couldn't run DOS.

Nonexistent RAM is not a concern. You only care if writing in the second megabyte affects the first megabyte. It doesn't matter what happens to the data you've written to the second megabyte if it doesn't affect the first megabyte.
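
A minimal sketch of the write-and-compare test described above, under some assumptions: the routine name check_a20 is hypothetical, physical address 0x500 is used as a scratch byte (its original contents are saved and restored), and the stack is assumed not to overlap 0x500 or 0x100500. Cache behavior is not handled here.

```nasm
bits 16

; Hypothetical helper, not part of the boot sector in this thread.
; Out:      AX = 1 if A20 is enabled, AX = 0 if addresses wrap at 1 MB
; Clobbers: nothing else (flags, DS, ES, SI, DI are restored)
check_a20:
    pushf
    push ds
    push es
    push si
    push di
    cli                       ; no interrupts while the scratch bytes are changed

    xor ax, ax
    mov ds, ax
    mov si, 0x0500            ; DS:SI = 0000:0500 -> physical 0x000500
    not ax                    ; AX = 0xFFFF
    mov es, ax
    mov di, 0x0510            ; ES:DI = FFFF:0510 -> physical 0x100500

    mov al, [ds:si]           ; save both original bytes
    mov ah, [es:di]
    push ax

    mov byte [ds:si], 0x00    ; known value in the first megabyte
    mov byte [es:di], 0xFF    ; write to the second megabyte (or its alias)
    cmp byte [ds:si], 0xFF    ; did the first megabyte change too?

    pop ax                    ; POP and MOV leave the flags from CMP intact
    mov [es:di], ah           ; restore both original bytes
    mov [ds:si], al

    mov ax, 0                 ; MOV (not XOR) so ZF from CMP survives
    je .done                  ; equal -> write wrapped -> A20 disabled
    inc ax                    ; different -> A20 enabled
.done:
    pop di
    pop si
    pop es
    pop ds
    popf                      ; restores the caller's interrupt flag
    ret
```

Because the test writes to the second megabyte and only inspects the first, it does not matter whether RAM actually exists at 0x100500.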