Reliable methods for enabling A20.

elderK
Member
Posts: 190
Joined: Mon Dec 11, 2006 10:54 am
Location: Dunedin, New Zealand

Reliable methods for enabling A20.

Post by elderK »

Hey all,

As you can tell from my recent posting (The new Stage1 bootloader), I am doing a major overhaul of how my kernel boots, mainly because I want to give the new (super-redesigned) kernel a much more reliable foundation, or at least a cleaner one.

So, with that - I ask what methods I should walk through for enabling the A20 line?

Also, I will ask what methods I should walk through, to obtain a reliable count of system memory.

These are the methods I attempt for A20:
- Query A20 capability: Int 0x15, ax = 0x2403.
- On failure, attempt the conventional i8042 keyboard controller method.
- If that fails, try Fast A20 and pray...

- If the 0x2403 call succeeded, use the method it reported
(i8042 or Fast A20...)

I really should add some testing ...

These are the methods I use for memory:
- Memory map: Int 0x15, eax = 0xE820
- Memory size: Int 0x15, ax = 0xE881
- Memory size: Int 0x15, ax = 0xE801
- Memory size: Int 0x15, ah = 0x88
- If all fail, a probe.

I'm hoping to cover my butt in most cases :).
Feedback would be helpful!

~Z
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Reliable methods for enabling A20.

Post by Brendan »

Hi,
zeii wrote: So, with that - I ask what methods I should walk through for enabling the A20 line?
For A20, I use the following:
- test if A20 is already enabled and return if it is
- try the BIOS function for enabling A20 and return if it worked
- try the "Fast A20" method
- test if A20 is enabled and return if it is
- try the keyboard controller method
- repeatedly test if A20 is enabled for 1 second, and return if it is
- do an error message and abort the boot
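
The sequence above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the `a20_ops` hooks and the `sim_*` stand-ins are hypothetical names, the BIOS step is assumed to be Int 0x15, ax = 0x2401, "Fast A20" is assumed to mean port 0x92, and the 10 ms polling interval is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical hooks: each one wraps a single hardware or BIOS operation. */
struct a20_ops {
    bool (*is_enabled)(void);       /* wrap-around test */
    bool (*bios_enable)(void);      /* assumed: Int 0x15, ax = 0x2401 */
    void (*fast_enable)(void);      /* assumed: "Fast A20" via port 0x92 */
    void (*kbc_enable)(void);       /* keyboard controller method */
    void (*delay_ms)(unsigned ms);  /* short delay between polls */
};

/* Returns true once A20 is confirmed enabled, false if every method failed
   (the caller would then print an error and abort the boot). */
bool a20_enable(const struct a20_ops *ops)
{
    if (ops->is_enabled())
        return true;
    if (ops->bios_enable() && ops->is_enabled())
        return true;
    ops->fast_enable();
    if (ops->is_enabled())
        return true;
    ops->kbc_enable();
    /* The keyboard controller can be slow, so poll for about 1 second. */
    for (unsigned waited = 0; waited < 1000; waited += 10) {
        if (ops->is_enabled())
            return true;
        ops->delay_ms(10);
    }
    return ops->is_enabled();
}

/* Simulated machine so the sequence can be exercised on a host. */
struct sim_machine { bool a20, bios_works, fast_works, kbc_works; unsigned delays; };
struct sim_machine sim;

static bool sim_is_enabled(void)  { return sim.a20; }
static bool sim_bios_enable(void) { if (sim.bios_works) sim.a20 = true; return sim.bios_works; }
static void sim_fast_enable(void) { if (sim.fast_works) sim.a20 = true; }
static void sim_kbc_enable(void)  { if (sim.kbc_works) sim.a20 = true; }
static void sim_delay_ms(unsigned ms) { sim.delays += ms; }

const struct a20_ops sim_ops = {
    sim_is_enabled, sim_bios_enable, sim_fast_enable, sim_kbc_enable, sim_delay_ms
};
```

Keeping the hardware access behind function pointers is a design choice for the sketch only; it lets the ordering of the fallbacks be checked without real hardware.
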
zeii wrote: Also, I will ask what methods I should walk through, to obtain a reliable count of system memory.
I'm just finishing a rewrite of my code, and consider it the best code possible (yeah, I'm modest :))....

First, I should point out that the purpose of my code is to build a conservative map of the physical address space (similar to what Int 0x15, eax=0xE820 returns), so that I can find out where RAM is, but also so that I can use it for other things (like figuring out safe areas to use for memory mapped PCI devices, which areas I can reclaim when I'm finished with ACPI tables, etc).

To begin with, I have an empty list of physical address space areas and a function that inserts a new entry into this list. This list contains an entry for each area (start address, length, type and flags). The type of the area is the same as defined by the ACPI specification (version 3, which includes the "faulty RAM" area type) for values that are 5 or lower. I make up my own area types for values that are higher than 5 (there's only one - a "mixed" area type). The flags field is used to store the area flags defined by ACPI version 3 (even though they've only really defined 2 flags so far, and one of them is entirely useless).

The function to add an entry to the list of physical address space areas checks the area type and changes it to zero (unknown) for any type that is undefined (greater than 5) and ignores any area with zero length. It ensures that the list of areas is sorted (from lowest starting address to highest starting address). It also combines adjacent areas of the same type into a single entry, converts overlapping areas of the same type into a single entry, and converts overlapping areas of different types into several areas (where a new area with the "mixed" area type is created that corresponds to the area that overlaps).
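
A minimal sketch of such an insert function in C is shown below, to make the sorting, merging and "mixed" splitting concrete. It is not the code described above: the names `area_list`, `area_add` and `TYPE_MIXED`, the value 6 for the mixed type, the fixed capacity, and the use of an exclusive end address are all assumptions, and the flags field is left out to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AREA_MAX   64   /* arbitrary capacity for this sketch */
#define TYPE_MIXED 6    /* hypothetical value for the made-up "mixed" type */

struct area { uint64_t start, end; uint32_t type; };   /* end is exclusive */
struct area_list { struct area a[AREA_MAX]; size_t n; };

/* Append a piece, or extend the previous piece if it touches it and has the
   same type. Empty pieces are silently skipped. */
static bool emit(struct area_list *out, uint64_t start, uint64_t end, uint32_t type)
{
    if (start >= end)
        return true;
    if (out->n > 0 && out->a[out->n - 1].end == start && out->a[out->n - 1].type == type) {
        out->a[out->n - 1].end = end;
        return true;
    }
    if (out->n == AREA_MAX)
        return false;
    out->a[out->n++] = (struct area){ start, end, type };
    return true;
}

/* Rebuilds the list with the new area merged in. The list stays sorted, with
   no overlaps and no adjacent areas of the same type. Returns false if the
   list would overflow (the list is then left unchanged). */
bool area_add(struct area_list *list, uint64_t start, uint64_t length, uint32_t type)
{
    if (length == 0)
        return true;                  /* ignore zero-length areas */
    if (type > 5)
        type = 0;                     /* undefined types become "unknown" */
    uint64_t end = start + length;    /* assumes no wrap past 2^64 */
    uint64_t cur = start;             /* first byte of the new area not yet emitted */
    struct area_list out = { .n = 0 };
    bool ok = true;

    for (size_t i = 0; i < list->n && ok; i++) {
        struct area e = list->a[i];
        if (e.end <= start) {                       /* entirely below the new area */
            ok = emit(&out, e.start, e.end, e.type);
        } else if (e.start >= end) {                /* entirely above the new area */
            ok = emit(&out, cur, end, type) && emit(&out, e.start, e.end, e.type);
            cur = end;
        } else {                                    /* overlaps the new area */
            uint64_t lo = e.start > start ? e.start : start;
            uint64_t hi = e.end < end ? e.end : end;
            uint32_t both = (e.type == type) ? type : TYPE_MIXED;
            ok = emit(&out, cur, lo, type)          /* new-only gap before this entry */
              && emit(&out, e.start, lo, e.type)    /* old-only part below the overlap */
              && emit(&out, lo, hi, both)           /* the overlap itself */
              && emit(&out, hi, e.end, e.type);     /* old-only part above the overlap */
            cur = hi;
        }
    }
    ok = ok && emit(&out, cur, end, type);          /* whatever is left of the new area */
    if (ok)
        *list = out;
    return ok;
}
```

Rebuilding the whole list on each insert is simple rather than fast, which seems reasonable for the handful of entries a BIOS typically reports.
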

For detection, naturally I start with "Int 0x15, eax = 0xE820". Here I've got two separate routines (one for the 24 byte data structure returned for ACPI 3 and another for the normal 20 byte data structure). These routines just get each area from the BIOS and call my function to add the area to my list. The first version fails if 24 bytes aren't returned or if the BIOS doesn't support it, while the second version only fails if the BIOS didn't support it.
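
For illustration, the two entry layouts can be decoded as below. This is a sketch, not the routines described above: it folds both sizes into one hypothetical `e820_decode` function, `returned` stands for the byte count the BIOS reports for the entry, and treating a missing attributes field as "valid" (bit 0 set) is an assumption based on the ACPI 3 description of that field.

```c
#include <stdbool.h>
#include <stdint.h>

struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
    uint32_t attrs;   /* ACPI 3 extended attributes; bit 0 set means "entry is valid" */
};

/* Little-endian loads, so the sketch does not depend on host byte order. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint64_t le64(const uint8_t *p)
{
    return (uint64_t)le32(p) | (uint64_t)le32(p + 4) << 32;
}

/* Decodes one entry from the buffer the BIOS filled in. Returns false if the
   BIOS returned fewer than the 20 bytes of the basic structure. */
bool e820_decode(const uint8_t *buf, uint32_t returned, struct e820_entry *out)
{
    if (returned < 20)
        return false;
    out->base   = le64(buf);
    out->length = le64(buf + 8);
    out->type   = le32(buf + 16);
    out->attrs  = (returned >= 24) ? le32(buf + 20) : 1u;
    return true;
}
```
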

If "Int 0x15, eax = 0xE820" didn't work then I start building my own physical address space map. First I add an area for RAM starting at 0x00000000 (the size of this area is determined by the boot loader - usually from Int 0x12, except for the netboot/PXE boot loader which gets it from the network card's ROM code when it unloads the networking stack, as Int 0x12 can be wrong in this case). I also add an area from the end of the first area to 0x00100000 as the "system" type (to cover the EBDA, ROMs, etc). This means I only need to worry about things above 0x00100000 after this.

Next I try "Int 0x15, ax = 0xE881" then "Int 0x15, ax = 0xE801". Both of these are similar, and return the amount of RAM between 0x00100000 and 0x01000000, and the amount of RAM above 16 MB. I just add these areas to my list (not caring if they're zero length or contiguous, as my function to add an area to the list sorts that out). In this case I also add a "system" area from 0xFEC00000 to 0xFFFFFFFF to make sure things like APICs and the BIOS are included in my physical address space map.
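
As a small worked example of turning those two return values into areas: the sketch below assumes the usual units for these functions (the first count in KB, the second in 64 KB blocks) and uses hypothetical names (`ram_area`, `e801_to_areas`). Which registers hold the counts is left to the caller, since BIOSes differ on that.

```c
#include <stdint.h>

struct ram_area { uint64_t start, length; };

/* Converts the two counts into byte ranges:
   out[0] is the RAM starting at 1 MB, out[1] the RAM starting at 16 MB.
   Zero-length results are fine if the insert function ignores them. */
void e801_to_areas(uint32_t kb_below_16m, uint32_t blocks_above_16m, struct ram_area out[2])
{
    out[0].start  = 0x00100000;
    out[0].length = (uint64_t)kb_below_16m * 1024;        /* count is in KB */
    out[1].start  = 0x01000000;
    out[1].length = (uint64_t)blocks_above_16m * 65536;   /* count is in 64 KB blocks */
}
```

For example, counts of 0x3C00 and 0x0700 would describe 15 MB at 0x00100000 and 112 MB at 0x01000000.
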

If both of them didn't work I try "Int 0x15, ah = 0x8A" then "Int 0x15, ax = 0xDA88". Both of these are similar and return the amount of contiguous RAM starting at 1 MB. For these, if the BIOS says there's more than 512 MB there I assume the BIOS is wrong and ignore it (any computer that supports more than 512 MB should also support better BIOS functions). If the BIOS says that there's 14 MB at 0x00100000 then I assume that the value returned has been limited because of a hole for memory mapped ISA devices and use manual probing for any memory above 0x01000000. In any case I add the RAM areas to the list, and also add a "system" area from 0xFEC00000 to 0xFFFFFFFF.

If nothing else has worked yet I try "Int 0x15, ah = 0x88". In this case if AH is 0x80, 0x86 or 0x88 then I assume the function failed (and that there isn't, for example, 32 MB at 0x00100000). If the BIOS says there's 15 MB or 14 MB at 0x00100000 I probe for more RAM above 0x01000000; and if the BIOS says there's 63 MB or 65535 KB at 0x00100000 I probe for more RAM above 0x04000000. I add the RAM areas to the list and also add my "system" area at 0xFEC00000.
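
The checks in that paragraph can be written down as a small decision function. The sketch below is an interpretation, not the original code: the names are hypothetical, and treating a set carry flag as failure is an added assumption that the paragraph does not mention.

```c
#include <stdbool.h>
#include <stdint.h>

enum probe_from { PROBE_NONE, PROBE_AT_16MB, PROBE_AT_64MB };

struct ext_result {
    bool valid;            /* false if the call is assumed to have failed */
    uint32_t kb;           /* KB of RAM at 0x00100000 */
    enum probe_from probe; /* where to probe for more RAM, if anywhere */
};

/* Interprets the AX value returned by Int 0x15, ah = 0x88. */
struct ext_result interpret_int15_88(bool carry, uint16_t ax)
{
    struct ext_result r = { false, 0, PROBE_NONE };
    uint8_t ah = (uint8_t)(ax >> 8);

    /* Error codes left in AH look like a size, so reject them. */
    if (carry || ah == 0x80 || ah == 0x86 || ah == 0x88)
        return r;

    r.valid = true;
    r.kb = ax;
    if (ax == 14 * 1024 || ax == 15 * 1024)
        r.probe = PROBE_AT_16MB;        /* probably limited by an ISA hole */
    else if (ax == 63 * 1024 || ax == 0xFFFF)
        r.probe = PROBE_AT_64MB;        /* probably limited by the 16-bit return value */
    return r;
}
```
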

If nothing else has worked, I try CMOS locations 0x17 and 0x18. These should contain a 16-bit "number of KB at 0x00100000". If this value is zero my code ignores it. Otherwise, if CMOS says there's 14 MB at 0x00100000 I probe for more RAM above 0x01000000; if the CMOS says there's 15 MB at 0x00100000 I ignore the last 1 MB of it (in case the CMOS contains "installed RAM" rather than "usable RAM that's not trashed by an ISA hole") and then probe for more RAM above 0x01000000; and if the CMOS says there's 63 MB or 65535 KB at 0x00100000 I probe for more RAM above 0x04000000. I add the RAM areas to the list and also add my "system" area at 0xFEC00000.

If everything else fails, I manually probe. In this case I first probe for RAM between 0x00100000 and 0x00EFFFFF and then probe for RAM between 0x01000000 and 0x1FFFFFFF. I skip 1 MB at 0x00F00000 in case there is an ISA hole. I add the RAM areas to the list and also add my "system" area at 0xFEC00000.
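
The post does not say how the probing itself is done, so the following is only a generic sketch of one common approach: write two test patterns to the last dword of each block, read them back, and restore the original value. The 1 MB step, the patterns, and all names (`probe_ram`, `sim_*`) are assumptions. Real probing also has to worry about caches, memory mapped devices and buses that "remember" the last value written, none of which is handled here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t (*read32_fn)(uint64_t addr);
typedef void (*write32_fn)(uint64_t addr, uint32_t value);

/* Probes [start, limit) in blocks of `step` bytes and returns the number of
   contiguous bytes of RAM found starting at `start`. */
uint64_t probe_ram(uint64_t start, uint64_t limit, uint64_t step,
                   read32_fn rd, write32_fn wr)
{
    uint64_t addr = start;
    while (addr + step <= limit) {
        uint64_t p = addr + step - 4;        /* last dword of this block */
        uint32_t saved = rd(p);
        wr(p, 0x55AA55AAu);
        bool ok = (rd(p) == 0x55AA55AAu);
        wr(p, 0xAA55AA55u);
        ok = ok && (rd(p) == 0xAA55AA55u);
        wr(p, saved);                        /* put the original value back */
        if (!ok)
            break;
        addr += step;
    }
    return addr - start;
}

/* Simulated machine (up to 512 MB) so the loop can be tried on a host:
   one dword is stored per MB, and reads above the top of RAM return all ones. */
uint64_t sim_ram_top;
uint32_t sim_cells[512];

uint32_t sim_read(uint64_t addr)
{
    return addr < sim_ram_top ? sim_cells[addr >> 20] : 0xFFFFFFFFu;
}

void sim_write(uint64_t addr, uint32_t value)
{
    if (addr < sim_ram_top)
        sim_cells[addr >> 20] = value;
}
```

With the ranges from the paragraph above, the two calls would be `probe_ram(0x00100000, 0x00F00000, ...)` and `probe_ram(0x01000000, 0x20000000, ...)`, which leaves the 1 MB at 0x00F00000 untouched.
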

Lastly, my code handles all bugs mentioned in Ralf Brown's Interrupt List for all BIOS functions I use. Also, I record which method was used to detect physical address space so the user can find out later if things went wrong; including where my code probed (if it probed). The values I use for the detection method include values for "Int 0x15, ax = 0xE802" and "Int 0x15, ah = 0xC7" even though my code doesn't use these functions (yet) due to poor documentation. This gives me 17 different values (e.g. CMOS, CMOS with probing at 16 MB, CMOS with probing at 64 MB, etc).

In the end, regardless of how buggy the BIOS is and what it supports, I should get a nice clean sorted list of areas with no overlapping, no adjacent areas of the same type, etc.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by elderK »

Sounds cool.
I would like to know more about the A20 wrap point though.
What address do I try to access to see whether memory has wrapped,
i.e. whether A20 is disabled?

A20 was there to wrap addresses at 1 MB back to address 0, right?
~Z

Post by Brendan »

Hi,
zeii wrote: I would like to know more about the A20 wrap point though.
What address do I try to access to see whether memory has wrapped,
i.e. whether A20 is disabled?

A20 was there to wrap addresses at 1 MB back to address 0, right?
If A20 is disabled, everything that tries to access the second MB (from 0x00100000 to 0x001FFFFF) will actually access the first MB (from 0x00000000 to 0x000FFFFF). The same happens for all of the physical address space (attempting to access any odd MB will actually access the previous even MB, e.g. trying to access something at the address 0x12345678 will actually access data at the address 0x12245678).
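
Put as arithmetic, disabling A20 forces bit 20 of every physical address to zero. A tiny C sketch (the function names are made up for illustration) shows both the example address above and the real-mode segment:offset pair that reaches exactly 1 MB:

```c
#include <stdint.h>

/* The physical address that is actually accessed while A20 is disabled:
   bit 20 is forced to zero. */
uint32_t a20_masked(uint32_t addr)
{
    return addr & ~(UINT32_C(1) << 20);
}

/* Real-mode address calculation: segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

So 0xFFFF:0x0010 is linear address 0x00100000, which wraps to 0x00000000 when A20 is disabled, while addresses in an even MB are unaffected.
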

My code to test if A20 is enabled is:


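; Test whether A20 is enabled (real mode).
; Returns carry clear if A20 is enabled, carry set if it is disabled.
; Clobbers ax, bx, cx, ds and es. wbinvd needs an 80486 or later.
.testA20:
        xor ax,ax
        mov es,ax                       ;es = 0x0000
        dec ax                          ;ax = 0xFFFF
        mov ds,ax                       ;ds = 0xFFFF

        wbinvd
        mov bx,[es:0]                   ;bx = [0x0000:0x0000]
        cmp bx,[0x10]                   ;Is it the same as the word at 1 MB (0xFFFF:0x0010)?
        jne .exitOk                     ; no, A20 works
        inc word [0x10]                 ;Change the word at 1 MB
        wbinvd
        mov cx,[es:0]                   ;cx = word at 0 after the change
        dec word [0x10]                 ;Restore the original word (and the word at 0 if it wrapped)
        cmp bx,cx                       ;Did the word at 0 change too?
        je .exitOk                      ; no, A20 works
        stc
        ret

.exitOk:
        clc
        ret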
This is designed for real mode though - it'd be easier for 32-bit protected mode.... ;)
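
For comparison, a flat-address version of the same test might look like the C sketch below. It is an illustration only: `a20_is_enabled` is a hypothetical name, the two pointers are passed in so the logic can be tried on a host, and in a real 32-bit protected mode loader they would be two addresses exactly 1 MB apart (for example 0x00000500 and 0x00100500, an arbitrary choice) with caching taken into account.

```c
#include <stdbool.h>
#include <stdint.h>

/* `low` and `high` should point to addresses that differ only in bit 20.
   If A20 is disabled both refer to the same word, so a change made through
   `high` shows up through `low`. The original value is restored afterwards. */
bool a20_is_enabled(volatile uint16_t *low, volatile uint16_t *high)
{
    uint16_t saved = *high;
    bool enabled = true;

    if (*low == saved) {                 /* same value: might be the same memory */
        *high = (uint16_t)(saved + 1);   /* change the word at the higher address */
        enabled = (*low == saved);       /* unchanged below means no wrap-around */
        *high = saved;                   /* restore the original word */
    }
    return enabled;
}
```
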


Cheers,

Brendan

Post by elderK »

How widespread are those BIOS Routines for A20? 0x240x?

~Z

Post by Brendan »

Hi,
zeii wrote: How widespread are those BIOS Routines for A20? 0x240x?
Honestly, I'm not too sure - I wrote my code a long time ago and would've tested the BIOS part on something at the time, but since then it's always worked and I've never really looked into how often the BIOS works, how often FastA20 works and how often the keyboard controller works.

I'd assume most BIOSes support it though - it's probably been around since the 80386, and A20 is a bit of a mess without it.

I inserted a "jmp $" after the code that uses the BIOS function to enable A20 and tried it on Bochs, Virtual PC and 2 real computers, and it didn't lock up on any of them. This means that either A20 was enabled before my boot code started or the BIOS function worked.


Cheers,

Brendan