Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Sun May 22, 2016 5:21 pm
by ~
Octocontrabass wrote:
~ wrote:If that happens, reading or writing only a few bytes would no longer be a robust test and a best test could probably be comparing two whole contiguous Megabytes in an even-odd Megabyte pair.
If a machine happened to boot with the same data at both locations you check, there is no test more robust than attempting to change one and seeing if the other changes too.

Performing tests outside the first two megabytes can falsely report that A20 is enabled on some motherboards. Since the A20 gate only needs to affect the first two megabytes, some motherboards implement it in a way that only affects the first two megabytes, and the rest of the physical address space behaves as if A20 is always enabled.
I guess that the BIOS writing the boot sector to 7C00h-7DFFh is enough of a write test, and it's implicit (it's virtually impossible to have the same memory image there and 1 Megabyte above). The other test would be to compare the even/odd Megabyte pair with interrupts disabled every time we finish a major test, assume that the A20 line is enabled as soon as the two Megabytes differ, and stop trying to enable it (and always run that comparison before trying to enable it again right after jumping to the binary image).

What matters more is to compare only user-space memory (ordinary RAM), not ROM, BIOS or device-mapped memory, because a device changing its state during the comparison would make the test fail.
Octocontrabass wrote:
~ wrote:I say that it would probably be better to avoid writing memory at all, first to prevent data destruction and to have a cleaner and more generic code, and second because writing memory could also give a false positive if CPU cache is involved, or if the hardware memory controller happens to hold data temporarily when there is no RAM present and we write that.
Your bootloader knows where everything is. It's easy to avoid destroying any data.

If the cache interferes with detecting the A20 status, DOS will fail. Any PC old enough for caches to be a concern is also old enough that no one would buy a PC that couldn't run DOS.

Nonexistent RAM is not a concern. You only care if writing in the second megabyte affects the first megabyte. It doesn't matter what happens to the data you've written to the second megabyte if it doesn't affect the first megabyte.
The right way to do the A20 test would be to always write to the first Megabyte. That way, if you get a different result in the second Megabyte, the only possible reason is that the A20 line is enabled (or that there's just 1 Megabyte of memory, but in that case we couldn't boot a big system anyway).

The right way is to write to existent RAM.


If you try to test the A20 line based on writing specifically to the second Megabyte, you won't know if it doesn't change the first Megabyte as well because the A20 line is enabled, or because the second Megabyte doesn't exist.


As can be seen, if we write the even Megabyte and then compare against the odd Megabyte, we are always certain that we use the most stable Megabyte as the reference point. We get rid of the uncertainty of nonexistent RAM and of cache/stray-value problems.

If by writing the even Megabyte the odd Megabyte becomes different, then that's the only certain and clean, stable test we are left with, that the A20 line is enabled.

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Mon May 23, 2016 12:52 am
by Octocontrabass
~ wrote:If you try to test the A20 line based on writing specifically to the second Megabyte, you won't know if it doesn't change the first Megabyte as well because the A20 line is enabled, or because the second Megabyte doesn't exist.
When the A20 line is disabled, writes to the second megabyte always affect the first megabyte. It doesn't matter if there's RAM or not in the second megabyte.

On the hardware side of things, the A20 line is disabled using a hardware AND 0xFFEFFFFF on the address lines between the CPU and the cache. When you access the second megabyte with A20 disabled, all hardware outside the CPU sees the CPU accessing the first megabyte. It doesn't matter how much RAM is present because the RAM never sees the original address, only the modified address. It doesn't matter how stable the cache is because the cache never sees the original address, only the modified address. (CPUs with internal cache implement the hardware AND internally, between the CPU core and the cache. Intel calls this "A20M#".)
~ wrote:If by writing the even Megabyte the odd Megabyte becomes different, then that's the only certain and clean, stable test we are left with, that the A20 line is enabled.
If you are writing to the first megabyte and reading the second megabyte, you might be reading stray values (if there is no RAM in the second megabyte and the A20 line is enabled). This is exactly the problem that you've been trying to avoid!
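
As a quick worked example of the address masking described above, the following NASM fragment checks the arithmetic at assembly time and emits no code. The probe pair 0000h:0500h / FFFFh:0510h and the macro names are assumptions picked for illustration, not something taken from the boot sector in this thread:

```nasm
; Assemble-time check only (nasm -f bin); no bytes are emitted.
; All names and addresses here are illustrative assumptions.
%define A20_MASK   0xFFEFFFFF                 ; clears address bit 20
%define ADDR_HIGH  ((0xFFFF << 4) + 0x0510)   ; FFFFh:0510h = 0x100500
%define ADDR_LOW   ((0x0000 << 4) + 0x0500)   ; 0000h:0500h = 0x000500

; With A20 disabled, the hardware sees (address AND A20_MASK),
; so the high address must land on the low one.
%if (ADDR_HIGH & A20_MASK) != ADDR_LOW
  %error "FFFFh:0510h should alias 0000h:0500h when A20 is disabled"
%endif
```

Since the mask is applied before the address reaches the cache or the RAM, the result is the same whether or not anything is installed at 0x100500.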

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Mon May 23, 2016 1:11 pm
by ~
Octocontrabass wrote:
~ wrote:If you try to test the A20 line based on writing specifically to the second Megabyte, you won't know if it doesn't change the first Megabyte as well because the A20 line is enabled, or because the second Megabyte doesn't exist.
When the A20 line is disabled, writes to the second megabyte always affect the first megabyte. It doesn't matter if there's RAM or not in the second megabyte.

On the hardware side of things, the A20 line is disabled using a hardware AND 0xFFEFFFFF on the address lines between the CPU and the cache. When you access the second megabyte with A20 disabled, all hardware outside the CPU sees the CPU accessing the first megabyte. It doesn't matter how much RAM is present because the RAM never sees the original address, only the modified address. It doesn't matter how stable the cache is because the cache never sees the original address, only the modified address. (CPUs with internal cache implement the hardware AND internally, between the CPU core and the cache. Intel calls this "A20M#".)
Octocontrabass wrote:
~ wrote:If by writing the even Megabyte the odd Megabyte becomes different, then that's the only certain and clean, stable test we are left with, that the A20 line is enabled.
If you are writing to the first megabyte and reading the second megabyte, you might be reading stray values (if there is no RAM in the second megabyte and the A20 line is enabled). This is exactly the problem that you've been trying to avoid!
If we only write existent RAM we avoid stray values.

If we read existent RAM locations into a general-purpose CPU register and then compare them with the same addresses 1 Megabyte above, we are also avoiding stray values.

If we avoid performing dummy writes and depend on the writes that are really used to modify our real variables and stack, we just won't be getting problems from potential stray values anymore.

But we will be reading several Kilobytes of user-space memory (avoiding ROM and VGA memory of course), with interrupts disabled and without using the stack during the test (using only the CPU registers to avoid modifying any memory at all).

We will include the stack, the in-memory program image and the variables that change constantly at runtime, so we will always get a unique footprint in the first Megabyte. That makes it an optimal test, since we use our own unique program image and runtime values to test the other Megabyte.

The test will be best if we arrange it to only write to RAM that we know exists (because we are running from it) AND to check values and areas that we know are changing in our real application. In this way we use nothing but implicit, functional changes as the basis for testing the second, odd Megabyte, which is probably disabled, inaccessible or non-existent.

We can also start gathering randomness for a random seed from this, so it's a test that is better left as clean and safe as possible.

From there we add robustness to the integrity of our first Megabyte, whereas writing to the second, unsafe Megabyte serves no other useful purpose that we could build on, so writing to it explicitly for this test would be a poorer option.

*Writing* explicitly RAM that might not be accessible for any reason might have nonstandard and unpredictable effects. That doesn't happen when limiting *writes* only to absolutely safe and present RAM.

Code that depends on explicitly writing to potentially non-existent RAM is much more likely to be replaced by better-implemented code, and it is much dirtier, more confusing and less maintainable than code that just preserves the sanity of the resources that actually exist. Such code shields the system from the stray noise of absent resources by using the standard resources available in the current machine in the most stable, simple, complete, straightforward and clear way.
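
A minimal sketch of the read-only comparison proposed in this post might look like the following. It assumes the 512 bytes of the boot sector at 0000h:7C00h are the region to compare (the count can be raised to cover more memory), and the labels are hypothetical. Note that only a mismatch is conclusive: identical blocks can mean either that the A20 line is disabled or that both Megabytes happen to hold the same data.

```nasm
bits 16

; Read-only A20 probe: registers only, no stack use, no memory writes.
; Compares 0000h:7C00h (0x007C00) with FFFFh:7C10h (0x107C00).
; Clobbers AX, CX, SI, DI, DS, ES and leaves interrupts disabled.
a20_probe_readonly:
    cli                      ; nothing may change memory while we compare
    cld                      ; compare upwards
    xor  ax, ax
    mov  ds, ax              ; DS:SI -> 0000h:7C00h
    mov  si, 7C00h
    dec  ax                  ; AX = FFFFh
    mov  es, ax              ; ES:DI -> FFFFh:7C10h = 0x107C00
    mov  di, 7C10h
    mov  cx, 256             ; 256 words = 512 bytes (assumed size)
    repe cmpsw               ; stops early at the first differing word
    jne  a20_enabled         ; ZF=0: the blocks differ, A20 is enabled

a20_unknown:                 ; all words equal: result is inconclusive
    jmp  $                   ; placeholder: fall back to another test here

a20_enabled:
    jmp  $                   ; placeholder: continue booting here
```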

Understanding a Good Generic FAT12 Floppy Boot Sector (Updat

Posted: Mon May 23, 2016 2:32 pm
by ~
Booting DOS Games Directly

I have made this boot sector considerably more useful by making it possible to boot COM files directly. Several DOS COM programs boot and run, and several don't. Some run but don't respond to the keyboard.

It will be good to find out which existing DOS programs are the simplest and make little or no use of third-party services (BIOS and above), to the point where they can boot and run without an OS, then group them and learn from them.

Probably the authors of those games didn't want to buy a DOS license and just created the program so it could boot as a stand-alone gaming system or program, or under DOS.


_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Configuring Bochs to Run at the Speed of an 8088

To run the old games, it will most likely be necessary to slow down the emulator. In Bochs this can be done as follows (adjust the IPS value if the game is too fast or too slow to be playable):

Code: Select all

# set up IPS value and clock sync
cpu: ips=100000
clock: sync=slowdown




_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Boot Sector/Floppy Image Update

The new boot code is here:
LowEST_Basics/Volume_0001/BootKern__FAT12_BootSect16.asm



This is the first floppy image I have built from it, available at the following URL; it is much more interesting than the previous examples I have made:
LowEST_Basics_Volume_0001.img


It contains several COM programs that you can also assemble with NASM (the old games are included as binaries only, without source code).

All you have to do is copy the COM file you want to test and rename it to BOOTKERN.BIN in the root directory.




_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Included Programs

I plan to keep adding programs to this floppy image (or to others in the future) until there's no more room left to include them and copy them to the root directory for booting:

Bootable Games without DOS
- Cross Fire
- Pac Man
- Paratrooper
- Zaxxon

Bootable Demos without DOS
- 16-bit boot program example that prints a string to text memory ("Hello Z86!!!")
- 2D and 3D rotation of simple points (BIOS video mode 13h, 320x200x256)
- "Jans Flame", "Jan's Flame", "jansflame". Fire effect (BIOS video mode 13h, 320x200x256)

Currently Creating: Direct standard VGA programming demonstration for modes 3h, 4h, 12h, 13h, and Mode X (not finished, doesn't do anything yet).




_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
How to Boot COM Programs Without DOS

The programs themselves have to support it, but we must also jump to a segment:offset address compatible with DOS COM programs, which are assembled to run from offset 100h.

I just adjusted the segment:offset address. I load the program binary at 700h physical, so to run it from offset 100h like a common DOS COM program, I just needed to do a far jump to 60h:100h (60h*16 + 100h = 700h):

Code: Select all

;9. Jump to the 16-bit Real Mode bootup image
;   (it's intended to jump to 70h:0000h or 700h physical):
;;

;Now jump to the kernel image we loaded into
;address 0x500, 0x600 or 0x700 physical (just like DOS).
;
;Here we will jump to a segment:offset address compatible
;with DOS COM programs loaded at 100h.
;
;With this, we will be able to boot directly into
;many old games and demos in COM format
;assembled for DOS offset 100h (ORG 100h).
;60h:0100h = 60h*16 + 100h = 700h physical:
;;
 jmp _kern16seg_minus_100h:100h


Code: Select all

;13. Program variables here:
;;

;All of the variables below are 16-bit.
;
;They live outside of the boot code to make
;more room; only _FileBuffSegment is better off
;being defined and initialized inside it:
;;
_kern16seg             equ 0x70
_kern16seg_minus_100h  equ 0x60
_FileBuffSegment       dw _kern16seg

_RootDirSect       equ 8200h+0
_RootDirSectCount  equ 8200h+2
_CurrFileClust     equ 8200h+4
_ClustAreaSect     equ 8200h+8
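
For programs that expect more of the usual DOS COM entry state, a sketch like the following could set up the segment registers and stack before the far jump. This is an assumption about what might help, not what the boot sector currently does: DOS starts a COM program with CS=DS=ES=SS and SP near the top of the 64 KB segment, and the label boot_com is hypothetical.

```nasm
bits 16

_kern16seg_minus_100h  equ 0x60   ; same value as in the listing above

; Hypothetical entry stub: mimic the register state DOS gives a COM file.
boot_com:
    cli                           ; no interrupts while SS:SP is changing
    mov  ax, _kern16seg_minus_100h
    mov  ds, ax                   ; COM programs assume DS = ES = SS = CS
    mov  es, ax
    mov  ss, ax
    mov  sp, 0FFFEh               ; stack at the top of the 64 KB segment
    sti
    jmp  _kern16seg_minus_100h:100h  ; 60h:0100h = 700h physical
```

There is still no PSP at 60h:0000h, so programs that read the command line from it or terminate through INT 20h/INT 21h would not work with this alone.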



I guess that by digging into these details it will be much easier to identify the simplest 16-bit programs around.

And by the way, all of those games and demos obviously run without depending on any DOS services, and probably without some BIOS services either (using direct hardware access instead).

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Mon May 23, 2016 2:56 pm
by Octocontrabass
~ wrote:If we only write existent RAM we avoid stray values.
Stray values are what you get when you read unmapped addresses (i.e. nonexistent RAM). What did you think they were?
~ wrote:*Writing* explicitly RAM that might not be accessible for any reason might have nonstandard and unpredictable effects.
When you write to unmapped addresses, nothing happens. When you write to the second megabyte while the A20 line is disabled, it's exactly the same as if you wrote to the same offset of the first megabyte. These behaviors are standard and predictable.

All of your proposed tests for A20 are extremely complicated for something that can be done in a few simple steps:
  1. Compare a byte in the first megabyte with a byte at the same offset in the second megabyte. If they're different, the A20 line is already enabled and you're done.
  2. The two bytes you read were the same. Pick a value different from the one you read and write it to the same offset in the second megabyte.
  3. Read the byte from the first megabyte again. If it stayed the same, the A20 line is enabled and you're done.
  4. The byte in the first megabyte was affected by your write to the second megabyte. Enable the A20 line.
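
The four steps above can be sketched in NASM roughly as follows. The probe addresses (0000h:0500h and FFFFh:0510h) and the routine name are assumptions for illustration; any byte of free RAM in the first megabyte would do, and step 4 is left to the caller.

```nasm
bits 16

PROBE_LOW_OFF   equ 0500h    ; 0000h:0500h = 0x000500 (assumed free RAM)
PROBE_HIGH_OFF  equ 0510h    ; FFFFh:0510h = 0x100500 (same offset, +1 MB)

; Returns AX = 1 if A20 is enabled, AX = 0 if it is disabled.
check_a20:
    pushf
    push ds
    push es
    cli                      ; keep interrupt handlers away from the probe
    xor  ax, ax
    mov  ds, ax              ; DS = 0000h
    dec  ax
    mov  es, ax              ; ES = FFFFh

    ; Step 1: compare the two bytes.
    mov  al, [PROBE_LOW_OFF]
    cmp  al, [es:PROBE_HIGH_OFF]
    jne  .enabled

    ; Step 2: write a different value to the second megabyte.
    not  al
    mov  [es:PROBE_HIGH_OFF], al

    ; Step 3: did the byte in the first megabyte follow the write?
    cmp  al, [PROBE_LOW_OFF]     ; ZF=1: it changed, the addresses alias
    not  al                      ; NOT and MOV leave the flags untouched
    mov  [es:PROBE_HIGH_OFF], al ; restore the original value
    je   .disabled

.enabled:
    mov  ax, 1
    jmp  short .done
.disabled:
    xor  ax, ax              ; Step 4 (enabling A20) is up to the caller
.done:
    pop  es
    pop  ds
    popf                     ; restores the caller's interrupt flag
    ret
```

The restore in step 3 is safe in both cases: with A20 disabled it puts the original value back into the first megabyte, and with A20 enabled it only touches the second megabyte.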

Re: Understanding a Good Generic FAT12 Floppy Boot Sector

Posted: Mon May 23, 2016 3:02 pm
by ~
Octocontrabass wrote:
~ wrote:If we only write existent RAM we avoid stray values.
Stray values are what you get when you read unmapped addresses (i.e. nonexistent RAM). What did you think they were?
~ wrote:*Writing* explicitly RAM that might not be accessible for any reason might have nonstandard and unpredictable effects.
When you write to unmapped addresses, nothing happens. When you write to the second megabyte while the A20 line is disabled, it's exactly the same as if you wrote to the same offset of the first megabyte. These behaviors are standard and predictable.

All of your proposed tests for A20 are extremely complicated for something that can be done in a few simple steps:
  1. Compare a byte in the first megabyte with a byte at the same offset in the second megabyte. If they're different, the A20 line is already enabled and you're done.
  2. The two bytes you read were the same. Pick a value different from the one you read and write it to the same offset in the second megabyte.
  3. Read the byte from the first megabyte again. If it stayed the same, the A20 line is enabled and you're done.
  4. The byte in the first megabyte was affected by your write to the second megabyte. Enable the A20 line.
This test isn't complicated; it's exactly what is already written in the boot sector code here. That code is much shorter and better commented than the one in the Wiki, and still works smoothly.

To improve it, we could test the memory location of the variable that is expected to change the most in the program. That could be one of the last 4 DWORDs at the top of the stack, compared with the values at the same addresses in the other Megabyte. To make it simpler, we could just compare the second DWORD at the top of the stack with the same address 1 Megabyte above.

Probably the most reasonable thing to expect is that all "old" motherboards support the KBC method for enabling the A20 line, including some that boot with the A20 line already enabled.

Another reasonable expectation is that the newest machines, which no longer have PS/2 devices but only USB (and which can ship with both BIOS and UEFI), already have the A20 line enabled, so this test would skip the KBC method on them.

If this test ever fails (which only time will tell, and by then we will have written and done a lot), all we have to do is update the source code. In the meantime we keep a minimalist, easy-to-understand code base without anything that we don't really need or that we aren't 100% sure about in every single aspect (no additions made just out of fear).


So for machines prior to UEFI and USB-only designs, we can safely expect the KBC method to be the standard one, found everywhere since the 386. And for newer machines without ISA devices, we can safely expect that if the PS/2 KBC isn't emulated, the A20 line is already enabled. (That emulation should be fully standardized, or at least kept as a device that only enables the A20 line for the oldest x86 binaries in existence.)
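
For reference, the usual KBC (8042) sequence for enabling the A20 line looks roughly like this. It is a sketch of the commonly used method, not necessarily identical to the code in the boot sector, and the routine name is hypothetical. It has no timeouts, so on a machine with no real or emulated KBC the wait loops can spin forever, which is one more reason to run the A20 test first and skip this when the line is already enabled.

```nasm
bits 16

; Enables A20 through the keyboard controller (ports 60h/64h).
; Clobbers AX. Assumes a working 8042-compatible KBC is present.
enable_a20_kbc:
    pushf
    cli
    call .wait_input_empty
    mov  al, 0ADh            ; disable the keyboard interface
    out  64h, al
    call .wait_input_empty
    mov  al, 0D0h            ; command: read the output port
    out  64h, al
    call .wait_output_full
    in   al, 60h             ; AL = current output port value
    push ax
    call .wait_input_empty
    mov  al, 0D1h            ; command: write the output port
    out  64h, al
    call .wait_input_empty
    pop  ax
    or   al, 02h             ; bit 1 of the output port gates A20
    out  60h, al
    call .wait_input_empty
    mov  al, 0AEh            ; re-enable the keyboard interface
    out  64h, al
    call .wait_input_empty
    popf                     ; restores the caller's interrupt flag
    ret

.wait_input_empty:           ; status bit 1 set = KBC input buffer busy
    in   al, 64h
    test al, 02h
    jnz  .wait_input_empty
    ret

.wait_output_full:           ; status bit 0 set = data ready at port 60h
    in   al, 64h
    test al, 01h
    jz   .wait_output_full
    ret
```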

All of this backwards compatibility makes me think that there are developers somewhere who still use 16-bit code from the 80's and who write programs and tests on the raw hardware, reasoning that if they don't need much more to perform a task and can execute it effectively without an OS, why not just boot it and see the results without other unknown things running (closed-source software, for example)? If they use 32-bit or 64-bit code, they probably implement it directly on raw hardware as well.

So there's no worry. If we don't find the ideal detection method from the start, we will find it by trying countless times and collecting the fixes for runtime problems into a library of variants, while we build practical tests that use OS concepts for other levels and topics of investigation, on raw hardware, accumulating knowledge and experience both in OS topics and in the topics of the applications under investigation.
