NASM Data Skipping

Octacone · Post by **Octacone** » Sat May 26, 2018 2:57 pm

Hello, I need some help.
Since I realized it would be impossible to explain what I want, here is an image:

alexfru · Post by **alexfru** » Sat May 26, 2018 3:20 pm

Confirmed. I can't understand what you want.

Octocontrabass · Post by **Octocontrabass** » Sat May 26, 2018 3:32 pm

So, you're writing a FAT32 bootloader and you want NASM to automatically insert the JMP instruction so control flow can proceed from the first sector to the second sector? I don't think NASM can do that. I've never heard of any assembler being able to do something like that.

In my FAT32 code, I inserted the JMP manually. That way, the code responsible for loading the second sector can't accidentally end up in the second sector.

Brendan · Post by **Brendan** » Sat May 26, 2018 8:06 pm

Hi,

Octacone wrote:How to make NASM think there is no important data between these sectors while preserving that data.

Depending on where the "preserved data" comes from, and assuming you're using the "flat binary" output file format in NASM (which has support for custom defined sections); maybe something like (untested):

Code: Select all

;Set up sections

	SECTION .sector1 progbits start=0x0000 vstart=0x7C00
	SECTION .preservedArea progbits start=0x01BE vstart=0x7DBE
	SECTION .text progbits start=0x0240 vstart=0x7E40
	SECTION .data progbits follows=.text vfollows=.text
	SECTION .bss nobits follows=.data

;Copy the "preserved" data from the floppy disk into the corresponding section of the output file

    section .preservedArea
    incbin "/dev/fd0", 0x01BE, 0x0240-0x01BE

;Initialisation code (in first sector)

    section .sector1
    jmp START


START:
    xor ax,ax
    mov ds,ax
    mov es,ax
    cli
    mov ss,ax
    mov sp,0x7C00
    sti
    cld

    jmp $

;More stuff (starting after the preserved area in the second sector)

    section .text

The basic idea here is to copy the preserved data into the file that NASM generates, so that you can overwrite the original preserved data on the disk with data that is identical (e.g. "cp myFile.bin /dev/fd0").

A better idea would be to fill the area with anything you like (e.g. zeros) in NASM (without preserving the data); and then use "dd" or write a custom installer utility to only copy the needed pieces of the file NASM generated into the right places on the disk (and not copy/overwrite the preserved data that was already on the disk).

Cheers,

Brendan

iansjack · Post by **iansjack** » Sun May 27, 2018 1:38 am

If I understand your requirements, the actual jump is trivial (presuming you know the size of the reserved information). The difficulty is writing your code to the disk without affecting the reserved data already there. As you can only write a sector at a time, you have to, as Brendan says, read the original sectors from the disk, insert the code without changing the reserved information, then write the modified sectors back to the disk. Other than that, I can't see any problem. The actual assembler that you are using is unimportant.

I don't think Brendan's "dd" suggestion is viable since the new code and preserved data are mixed in the sectors.
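The read, merge, and write-back sequence described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `merge_preserving` and `install`, the 512-byte sector size, and the idea of passing the preserved areas as a list of byte ranges are all hypothetical and do not come from anyone's actual installer in this thread.

```python
from typing import Sequence, Tuple

SECTOR_SIZE = 512  # assumption: classic 512-byte sectors


def merge_preserving(original: bytes, new_image: bytes,
                     preserved: Sequence[Tuple[int, int]]) -> bytes:
    """Return new_image with every preserved [start, end) range copied from original."""
    if len(original) != len(new_image):
        raise ValueError("original and new image must have the same length")
    merged = bytearray(new_image)
    for start, end in preserved:
        if not 0 <= start <= end <= len(merged):
            raise ValueError(f"bad preserved range {start:#x}-{end:#x}")
        # Keep whatever is already on the disk in this range.
        merged[start:end] = original[start:end]
    return bytes(merged)


def install(device_path: str, image_path: str,
            preserved: Sequence[Tuple[int, int]]) -> None:
    """Read whole sectors, merge in the new code, then write whole sectors back."""
    with open(image_path, "rb") as f:
        new_image = f.read()
    if len(new_image) % SECTOR_SIZE:
        raise ValueError("image must be a whole number of sectors")
    with open(device_path, "r+b") as dev:
        original = dev.read(len(new_image))
        if len(original) != len(new_image):
            raise ValueError("device is shorter than the image")
        dev.seek(0)
        dev.write(merge_preserving(original, new_image, preserved))
```

A caller might pass something like `[(0x03, 0x5A)]` as `preserved` if the area to keep were a FAT32 BPB, but the correct ranges depend entirely on the disk layout in question.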

Octacone · Post by **Octacone** » Sun May 27, 2018 5:39 am

@alexfru I guess it will take some time to understand, then. Take a look at what the others said.

@Octocontrabass Yes that is what I'm doing. The problem is where to insert that jump? How do I know where those 420 bytes end (within the file)?

@Brendan Preserved data = FAT32 information, and yes, I am using the flat binary format. Your first idea seems complex for what I want to do. About your dd thing, I am doing so already. I can copy everything just fine, but I can't get execution to continue.

@iansjack Yes, I know its size. I don't have any problems with copying that stuff over, that was never questionable.

Hmm, maybe I didn't explain this enough:
Let's say I have file:

Code: Select all

Some_Random_Function:
mov ax, bx
mov bx, ax
mov cx, dx
ret

Some_Random_Function_2:
mov cx, dx
xchg bx, ax
shr ax, 2
ret

Some_Large_Function:
...a lot of code...

some_variable: db 00h
some_other: dd 00000000h
some_text: db "test 123", 0Ah, 00h
test_data: dq 0123456789ABCDEFh

Now, let's say Some_Random_Function + Some_Random_Function_2 + Some_Large_Function + some_variable take 420 bytes, but the entire file takes 790 bytes. Sector 1 will then contain the first 420 bytes and the second sector will contain everything that is left. But there is a gap between those sectors that the assembler doesn't know about, so it cannot correctly address the other variables I declared; instead, it will compute addresses as if the important data I was talking about were those variables. Better?

Octocontrabass · Post by **Octocontrabass** » Sun May 27, 2018 7:52 am

Octacone wrote:The problem is where to insert that jump? How do I know where those 420 bytes end (within the file)?

All of this would be much easier if you included placeholders for that "important data" instead of trying to completely leave it out of your binary. You'll receive errors from the assembler when your code outgrows the space you've left between the placeholders, so you'll know when it's time to move stuff across the break.

If you're insistent on not having padding in your final binary, you'll have to use sections to create a binary that can be chopped to fit the disk appropriately. Here's an example:

Code: Select all

bits 16

section firstjump start=0 vstart=0x7c00 align=1 valign=1

    jmp short start ; NASM's optimizer doesn't know this can be a short jump

section bpb nobits vstart=0x7c03 valign=1

; use this section to define the BPB so you can access it using labels

section firstsector start=3 vstart=0x7c5a align=1 valign=1

start:
    jmp sector2

section firstsig nobits vstart=0x7dfe valign=1

    resw 1 ; for the 0xaa55 signature in the first sector

section secondsector start=423 vstart=0x7e00 align=1 valign=1

sector2:
    hlt

This is an awful lot of work to avoid having placeholders for the BPB and such. Are you really sure you don't want to insert placeholders?

Also, I've just looked at your diagram again and noticed that you're trying to use the FSInfo sector (X+1), which is reserved. Your second sector of code must go elsewhere on the disk (older Windows uses X+2, newer Windows uses X+12).
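To make the "chopped to fit the disk" step concrete, here is a rough Python sketch of how the flat binary produced by the section example above might be split into pieces with their target disk offsets. The arithmetic follows the example's own numbers: 3 bytes before the BPB, code resuming at offset 0x5A, the signature at 0x1FE, hence 0x1FE - 0x5A = 420 bytes of first-sector code and `start=423` for the second sector. The function name `chop` and the `second_sector_index` parameter are hypothetical; which sector the second piece should actually go to is a separate decision.

```python
JUMP_LEN = 3                      # bytes in the file before the (omitted) BPB
BPB_END = 0x5A                    # disk offset where first-sector code resumes
SIG_OFFSET = 0x1FE                # disk offset of the 0xAA55 signature
CODE_LEN = SIG_OFFSET - BPB_END   # 420 bytes of first-sector code
SECTOR_SIZE = 512                 # assumption: 512-byte sectors


def chop(flat: bytes, second_sector_index: int):
    """Split the flat binary into (disk_offset, data) pieces.

    Offsets are relative to the start of the partition. The BPB and the
    signature are not in the file, so they are never overwritten.
    """
    second_start = JUMP_LEN + CODE_LEN  # 423, matching "start=423"
    tail = flat[second_start:]
    if len(tail) > SECTOR_SIZE:
        raise ValueError("second piece no longer fits in one sector")
    pieces = [
        (0, flat[:JUMP_LEN]),
        (BPB_END, flat[JUMP_LEN:second_start]),
        (second_sector_index * SECTOR_SIZE, tail),
    ]
    # Drop empty pieces (for example, when there is no second-sector code yet).
    return [(offset, data) for offset, data in pieces if data]
```

Each returned piece can then be written at its offset without touching the bytes in between, which is exactly the bookkeeping that placeholders would make unnecessary.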

linuxyne · Post by **linuxyne** » Sun May 27, 2018 8:10 am

There are two instances of the code - one that resides on the disk, and the one that runs from memory.

The code is linked as if there are no breaks in between - i.e. ImportantData is ignored while linking. The code is then saved on the disk in two pieces. At the time of running, the pieces are joined back together into a single binary.

That requires the first piece (420 bytes) to be able to read other pieces and load them into memory. It is also required that the first piece not address any variables which lie beyond its limits /before/ it has had the chance to load the other pieces at proper locations.

One can divide the code into two pieces - the first piece consists of variables and a small amount of code which loads the second piece. The second piece can access variables which reside in the first piece as well as in itself.

I am not sure of the reason we insist on a chain of code pieces, connected with jmps, but residing on the hard disk.

Octacone · Post by **Octacone** » Sun May 27, 2018 8:20 am

@Octocontrabass

Placeholders? I didn't say/hear anything about placeholders. What do you mean by that? Like reserving some free space that acts like padding? But again, how do I know where to put that space?
Wait wait wait... So you are saying that I am allowed to use X+2 and X+12? What!? This changes everything. If I could do so then none of this would exist, no problems.
That could potentially solve all my problems. Man, you are my savior.

@linuxyne

There is a separate loader that loads both of those sectors into memory, there is no need for them to load each other.
I don't like breaking code into pieces. Imagine having a function that has to jump to itself two times because it is not contiguous.

linuxyne · Post by **linuxyne** » Sun May 27, 2018 8:36 am

Octacone wrote:There is a separate loader that loads both of those sectors into memory, there is no need for them to load each other.
I don't like breaking code into pieces. Imagine having a function that has to jump to itself two times because it is not contiguous.

When you say that we do not like breaking code into pieces, I take it that we do not like to break code into pieces both at runtime and on-disk.

Suppose we break a single, contiguous flat binary, for the sake of storing on disk, such that each byte is stored on the start of a sector (i.e. an entire sector is utilized to store just a single byte). The loader can still piece together the binary and jump to its start address /once/.

Just because a function is broken into several pieces when stored on the disk, does not mean that the function has to run in the same piece-wise manner from the memory.

A function which has to jump to itself in the middle because it is not contiguous, only happens when the function is not contiguous /in memory/ or /at link-time/. But the image that you showed is about a function/binary which is broken only during its offline storage, and remains contiguous at link/run time.

I am trying to understand the situation, and this is what I came up with:

We insist that the link-time piece-representation of the binary must be the same as its store-time representation. That is, since we cannot avoid a gap in the storage of the binary, that gap must persist during the link-time (and therefore at run-time) too. This enables us to 'dumb' load the two sectors without the loader having to piece together the binary. Hence, we need to inform the linker to separate the two pieces with a gap of at least the size of the gap on the storage.

Octocontrabass · Post by **Octocontrabass** » Sun May 27, 2018 8:59 am

Octacone wrote:Placeholders? I didn't say/hear anything about placeholder. What do you mean by that. Like reserving some free space that acts like padding? But again how do I know where to put that space?

You know where to put that space because it's the parts you won't overwrite when you install your bootloader onto the disk.

For example, to specify seven bytes at offset 0x123 that your code must not overwrite, you might do something like this:

Code: Select all

times 0x123-($-$$) db 0 ; pad to offset 0x123, in case you don't use all of the space before it
times 7 db 0 ; padding that will be replaced with values taken from the disk during installation

Octacone wrote:Wait wait wait... So you are saying that I am allowed to use X+2 and X+12? What!? This changes everything. If I could do so then none of this would exist, no problems.
That could potentially solve all my problems. Man, you are my savior.

There are quite a few sectors you could potentially use, but those two are safest.

Octacone wrote:There is a separate loader that loads both of those sectors into memory, there is no need for them to load each other.

Is that separate loader in the MBR? What do you do if someone wants to install GRUB to the MBR instead?

Brendan · Post by **Brendan** » Mon May 28, 2018 1:35 am

Hi,

Octocontrabass wrote:
Octacone wrote:Wait wait wait... So you are saying that I am allowed to use X+2 and X+12? What!? This changes everything. If I could do so then none of this would exist, no problems.
That could potentially solve all my problems. Man, you are my savior.
There are quite a few sectors you could potentially use, but those two are safest.

Who says?

The BPB has a "number of reserved sectors" field. For FAT32, this field can be 1 (if there's no optional FSInfo sector), or it could be 2 (if there is an optional FSInfo sector), or it could be 3, or 4, or 123. Some software (maybe a few specific versions of Windows) might use the value 32 for whatever reason; but any sectors that were reserved by whoever created the partition may be used by whoever created the partition for whatever they feel like. There are no guarantees, and if there actually are reserved sectors that aren't used by whoever created the partition, then that's pure random luck and not something that can or should be relied on.
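For reference, the fields mentioned here can be read straight out of the boot sector. The sketch below assumes the standard FAT32 BPB layout (reserved sector count at byte offset 14, FSInfo sector number at offset 48, backup boot sector number at offset 50, all little-endian 16-bit values); the function names are hypothetical. Note that it only reports what the BPB says: a sector lying inside the reserved region is not thereby shown to be free.

```python
import struct


def fat32_reserved_info(boot_sector: bytes) -> dict:
    """Extract the reserved-region fields from a FAT32 boot sector."""
    if len(boot_sector) < 52:
        raise ValueError("need at least the first 52 bytes of the boot sector")
    (reserved,) = struct.unpack_from("<H", boot_sector, 14)           # BPB_RsvdSecCnt
    fs_info, backup_boot = struct.unpack_from("<HH", boot_sector, 48)  # BPB_FSInfo, BPB_BkBootSec
    return {
        "reserved_sectors": reserved,
        "fs_info_sector": fs_info,
        "backup_boot_sector": backup_boot,
    }


def inside_reserved_region(info: dict, sector_index: int) -> bool:
    """True if sector_index is a reserved sector other than the boot sector itself.

    This says nothing about whether some other tool already uses that sector.
    """
    return 0 < sector_index < info["reserved_sectors"]
```

With a reserved count of 32, for example, both X+2 and X+12 fall inside the reserved region; with a count of 2, neither does, so a check like this is at best a necessary condition before writing anything there.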

In theory; the only case where you can assume any of the reserved sectors can be used by your OS is if your OS created the partition; but in practice (because you're using a different operating system's file system) Windows will assume it owns the partition (when it doesn't) and "Windows crapware" will trash data that it has no right to touch. This includes "future Windows crapware" that doesn't exist yet (e.g. if you make a random and unfounded assumption that "X+12" seems to be sort of safe maybe, then any future Windows update and any future third-party tool designed for Windows can break your random and unfounded assumption on a whim).

The best solution is to refuse to boot from extremely insecure (and poorly designed) file systems that belong to Microsoft. For a simple example, you could create a tiny partition for your boot loader (and store the boot loader in consecutive sectors, with no file system at all); even if the boot loader happens to load files from a FAT file system in a separate partition.

Cheers,

Brendan

Octacone · Post by **Octacone** » Sat Jun 02, 2018 4:10 am

Just a quick update. I read all of your comments.
After a lot of tinkering around I decided that there was a third solution.
Basically I realized I could make my code much shorter by removing all the info messages.
Also, my code looks very bloated at the moment. My Assembly has greatly improved, which means it's time to rewrite it (this part only).
If I ever happen to be in the same situation, I will just put it at X+2. I don't need to worry about Windows since this OS is leaning towards embedded x86.
Thanks for helping me out.

Schol-R-LEA · Post by **Schol-R-LEA** » Sat Jun 02, 2018 7:52 am

If you don't mind me asking: if the goal is embedded systems, why use FAT at all? I think that this relates back to Brendan's point, as his whole argument was that much of the trouble came from using FAT32 in the first place.

I am not saying that you shouldn't use FAT (though I personally don't recommend it), but it might help if you could explain why you chose FAT over, say, SFS or ext2.

For that matter, just what sort of 'embedded applications' are you aiming at? I gather that this is still going to be on the standard PC platform (otherwise the whole issue would be different - even if the platform still uses an x86 CPU, an industrial controller or an SBC such as an UDOO X86 wouldn't necessarily be all that similar to a PC in other regards), which implies that it is a control unit for some remote device rather than a microcontroller-type unit. You'd be looking at similar use cases to, say, RDOS, or QNX, or Menuet, rather than, say, most eCos installations.

So, some additional information about your desired results might help here.

EDIT: Looking back over your previous posts (including some I had responded to myself, including one where I had previously mentioned SFS) I see that in the past you have argued that you needed FAT support in order to support data cards and Flash memory drives. However, needing to support FAT32 doesn't imply needing to boot from a FAT32 disk.

It is possible to format (for example) a microSD card in a different file system (or even some ad-hoc partition structure with just your OS in it), and boot from that, while still being able to read and write to FAT32 partitions (even on the same device - yes, such devices can be partitioned, though some older models of Flash drives require you to respect a hidden partition containing proprietary code, something that would usually preclude their use on anything other than a Windows PC anyway).

This might be particularly relevant regarding an embedded controller or SBC, though admittedly some such devices have their own limitations in this regard (e.g., the weird boot structure used by the RPi).

Octacone · Post by **Octacone** » Wed Jun 06, 2018 1:10 pm

@Schol-R-LEA

Well, it is cheap, easy to implement, widely supported and used, fulfills all of my requirements.
I can't specifically say what I am doing, because nobody has ever done that before and it would seem silly (trust me it is not). (hardware talk)
Actually, I am trying to create an OS that has two purposes: to be used as a desktop OS, and to be used in an embedded-type environment. I plan on creating a common core for both of them and then changing the outer appearance afterwards.
Yes, you are right about the SD card support. It is mandatory. However, it doesn't matter to me which filesystem I choose in the end.
You can put it like this: FAT32 -> Master Data-keeping Filesystem, Other Filesystem (maybe EXT2) -> Master OS Filesystem.
The thing is, I am still learning, reading the docs, I've only recently wrapped my head around FAT, so EXT2 would be a bit too much to chew at the moment.
