org directive in bootloader

ITchimp · Post by **ITchimp** » Thu May 14, 2020 3:08 am

org 0x7c00 loads the bootloader at 0x7C00 memory address, that is fine...

but what about the segment addressing scheme of real mode x86

bp, sp has to be paired with ss
eip has to be paired with ds

but based on experience so far, org 0x7c00 doesn't really set ss and ds to the proper
value... ss and ds are both 0x0000, and bp and sp seem to hold the value that I set, e.g. 0x2000,
which I printed out to verify; shouldn't they be 0x2000+0x7c00?

does that mean with org 0x7c00 directive, segment addressing mode is effectively
disabled?

the whole thing with those register values seems very inconsistent!!! can someone explain it...

iansjack · Post by **iansjack** » Thu May 14, 2020 3:45 am

ORG does not set the address that the program is loaded at.

ORG does not set any segment registers.

ORG is not an instruction to the assembler as such; it is information that you are supplying. The programmer is telling the assembler what address the program is to be loaded at so that it can make the appropriate adjustments to the addresses it generates. It is up to the programmer to ensure that the program is loaded at that address and that the segment registers are set to the appropriate values. Making assumptions about the values of the segment registers, without explicitly loading them, is one of the prime causes of misbehaviour of assembler programs.

Octocontrabass · Post by **Octocontrabass** » Thu May 14, 2020 4:03 am

ITchimp wrote:org 0x7c00 loads the bootloader at 0x7C00 memory address, that is fine...

No. The org statement tells the assembler that your bootloader will be loaded at offset 0x7c00 relative to the segment base address, so the assembler can translate labels into the correct addresses. It's up to you to ensure the segment base address and offset add up to the correct linear address.

The BIOS always loads the bootloader at linear address 0x7c00, so you need a segment base address of 0 if the offset is 0x7c00.

For example, if you wanted to use a segment base address of 0x1230, you would need "org 0x69d0" so that the base and offset add up to linear address 0x7c00. (This is just an example; I don't think anyone would ever want to do this.)

In real mode, the segment base address is the value you load in the segment register multiplied by 16 (0x10). That's equivalent to shifting left by four bits. So, if you want the segment base address to be 0x1230 like the above example, you would set the segment register to 0x123.
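
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper names `linear_address` and `org_for_segment` are made up for illustration, and the only fixed input is the 0x7c00 load address mentioned above.

```python
BOOT_LINEAR = 0x7C00  # linear address where the BIOS places the boot sector


def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: segment register value * 16 + offset."""
    return (segment << 4) + offset


def org_for_segment(segment: int) -> int:
    """Offset (the value to give org) so that segment:offset lands on BOOT_LINEAR."""
    return BOOT_LINEAR - (segment << 4)


# Three segment register values that all reach the same boot sector.
for seg in (0x0000, 0x0123, 0x07C0):
    org = org_for_segment(seg)
    print(f"segment 0x{seg:04X} -> org 0x{org:04X} -> linear 0x{linear_address(seg, org):05X}")
```

Every row ends at the same linear address; only the split between segment base and offset changes, and org has to match whichever split the code sets up.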

ITchimp wrote:but based on experience so far, org 0x7c00 doesn't really set ss and ds to the proper value...

It doesn't set SS or DS at all; you need code to set the segment registers.

ITchimp wrote:does that mean with org 0x7c00 directive, segment addressing mode is effectively disabled?

Setting the segment base address to 0 effectively disables segmentation. When the segment base address is 0, the offset will always be equal to the linear address. The "org 0x7c00" statement just tells your assembler how to turn labels into the correct addresses when the segment base address is 0.

sunnysideup · Post by **sunnysideup** » Thu May 14, 2020 8:28 am

Forget everything about segment registers, the BIOS, bootloader, etc.

1. The org statement only works when you want your assembler to output a raw binary file.
2. It affects LABELS in your assembly code. For example, in nasm, your labels are things like label1:, etc.
3. It adds the value that you have specified to org to your original label offset. For example, if you have org 50, a label corresponding to the first instruction will have a value of 50.
4. This works for all labels, including $, which stands for the address of the current instruction.
5. It has nothing to do with segment registers. You can do away with org by manually adding the required value, e.g. mov eax, [label + 50]. However, you need to know exactly what you are doing, and this won't work with something like jmp (label + 50). This is because these jumps are relative, and the assembler doesn't offset $.
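
To make points 3 to 5 concrete, here is a rough Python model of the bookkeeping. It is not how any particular assembler is implemented; `label_value` and `rel_jump_displacement` are hypothetical helpers, and the byte offsets 0x05 and 0x20 are arbitrary example positions.

```python
def label_value(org: int, offset_in_file: int) -> int:
    """Absolute value assigned to a label: origin + position in the output file."""
    return org + offset_in_file


def rel_jump_displacement(org: int, next_insn_offset: int, target_offset: int) -> int:
    """Displacement of a relative jump: target minus the address of the next instruction."""
    return label_value(org, target_offset) - label_value(org, next_insn_offset)


# A label on the very first instruction with org 50, as in point 3.
print(label_value(50, 0))

# A label 0x20 bytes into a boot sector assembled with org 0x7c00.
print(hex(label_value(0x7C00, 0x20)))

# The same relative jump assembled under two different origins.
print(rel_jump_displacement(0, 0x05, 0x20), rel_jump_displacement(0x7C00, 0x05, 0x20))
```

The last line prints the same displacement twice: the origin is added to both ends of a relative jump and cancels out, which is why adding it by hand to a jump target gives the wrong result.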

nexos · Post by **nexos** » Thu May 14, 2020 12:26 pm

An ORG directive simply tells the assembler where to base addresses from. It has nothing to do with segments. If you put org 0, all offset addresses are based from 0x0. Segments must be set explicitly. More about Real Mode's memory model can be found at http://www.brokenthorn.com/Resources/OSDev4.html.

neon · Post by **neon** » Thu May 14, 2020 3:23 pm

Hi,

Quick clarification... The org directive sets the segment origin (for the default segment if none is specified). NASM's flat binary (bin) format accepts a single org per program; to place a later section at a specific address, give that section a start= attribute rather than a second org. The origin is applied to all instructions with relocatable addresses when generating the final code. As noted above, the general rule of thumb is to think of it as label + org. Relative relocations also exist, but they don't change the basic rule of thumb (and aren't applicable here). For flat binaries, the assembler will insert padding and merge the sections:

Code:

bits 16
org 0x7c00
section .text
label1:
mov ax, label1 ; B8 00 7C
mov ax, label2 ; B8 00 7E
; <padding>
section .data start=0x7e00
label2: nop

It is critical not to confuse org (which applies to relocations) with the segment registers (which have nothing to do with org), because the code you write will vary depending on both:

Code:

org 0
mov ax, 0x7c0
mov ds, ax
mov ax, [label] ; 0x7c0:label + 0
; -or-
org 0x7c00
mov ax, 0
mov ds, ax
mov ax, [label] ; 0:label + 0x7c00

Schol-R-LEA · Post by **Schol-R-LEA** » Thu May 14, 2020 10:33 pm

To address another aspect of this which has been mostly ignored so far:

ITchimp wrote:bp, sp has to be paired with ss

By 'paired' I assume you mean that they are offsets of the Stack Segment (SS) register. This is a 'yes, but actually, no' sort of thing. Why do I say this? Because SP points to the top of the process stack (which in this case is simply the hardware stack, since no other processes exist), and in x86, hardware stacks grow downward. While SP is a stack offset, it has some special behaviors due to its intended use.

When SP is 0x0000, the stack is empty (assuming it starts at the top of its segment). When you PUSH a 16-bit value onto the stack (stack values are always the system word in length, so in real mode and 16-bit pmode they are two bytes, in 32-bit pmode they are four, etc.), it first decrements SP by 2, which wraps around to 0xFFFE, then copies the pushed value to the bytes at addresses 0xFFFE and 0xFFFF in little-endian order (i.e., with the low-order byte at the lower address; so if you PUSHed 0x1234, then 0x34 would be at 0xFFFE and 0x12 would be at 0xFFFF).
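
For anyone who wants to see the wrap-around, here is a toy Python model of that PUSH. It is a sketch only: `push16` is a made-up helper, and the bytearray stands in for the 64 KiB addressed through SS, indexed by offset.

```python
def push16(memory: bytearray, sp: int, value: int) -> int:
    """Model of a 16-bit PUSH: decrement SP by 2 (mod 64 KiB), then store little-endian."""
    sp = (sp - 2) & 0xFFFF
    memory[sp] = value & 0xFF                       # low-order byte at the lower address
    memory[(sp + 1) & 0xFFFF] = (value >> 8) & 0xFF  # high-order byte at the higher address
    return sp


stack_segment = bytearray(0x10000)  # one 64 KiB segment, indexed by offset from SS
sp = push16(stack_segment, 0x0000, 0x1234)
print(hex(sp), hex(stack_segment[0xFFFE]), hex(stack_segment[0xFFFF]))
```

Starting from SP = 0x0000, the new SP is 0xFFFE, with 0x34 stored at 0xFFFE and 0x12 at 0xFFFF, matching the 0x1234 example.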

As for BP, that is usually only used for function stack frames; the idea is that when you enter the function, you push the existing BP, then copy the current SP to BP, allowing you to then a) use BP with an index to access the arguments and local variables, and b) restore SP before returning from the function call. (The return itself changes SP once the jump back to the caller is completed, but bear with me on this; if you don't restore the stack, SP won't be pointing to the return address and the RET will jump to who-knows-where.) While BP is indeed an offset from SS, it does not get automatically set by anything.

ITchimp wrote:eip has to be paired with ds

No. First off, I assume you meant IP, not EIP, since this is real mode. More importantly, the Instruction Pointer is an offset from the Code Segment (CS) origin, not the Data Segment (DS) origin. While a bootloader will almost always set DS == CS, this is not necessarily going to be the case, and it is not an automatic one by any means.

While you can (and in this instance should) use a MOV to set SS or DS, you can only set CS with a FAR jump. This is why most boot loaders (after the initial NEAR JMP at the start) begin with something like this:

Code:

        JMP FAR 0x7C00:0000       ; this isn't really needed with any recent systems, but older boot loaders usually had it
        MOV DS, CS                ; copy CS -> DS
        MOV SS, CS                ; copy SS -> DS
        MOV AX, 0xFFFE            ; set value to load into SP
        MOV SP, AX                ; set the top of stack, advanced slightly as a precautionary step

... in which case, the data and stack are in the same segment as the code, with the data following just after the code, in what is sometimes referred to as the 'tiny memory model'.

In general, you will want the code and data segments to overlap, since it is likely that you will need to include some data as part of the boot sector image. Things are more flexible with the stack, which doesn't need to overlap this way but should still be explicitly set by the boot loader to ensure that the memory layout is the one you expect it to be. Since most boot-sector bootloaders don't touch the stack much if at all, tiny model is just a convenient way to handle everything all at once.

As I noted, the far jump I showed there is only really needed if you aren't certain about the state the BIOS has the registers in when it jumps to the boot loader. While this was a concern on older PCs (before, say, 2000 or so), newer systems which still have a Legacy BIOS boot option will invariably have the CS:IP set to 7C00:0000 on boot loader entry (while still newer systems may not have a Legacy BIOS at all, only the UEFI firmware, in which case this type of loader won't work at all). You'll still see it in some boot loaders, but at this point it is just a hold-over from the bad old days of the PC/XT clone market.

Eventually, you'll need to re-arrange the segment registers (or segment selectors, once you've moved to pmode or long mode), but for now this should be sufficient for the first stage boot loader.

I'm a bit tired at the moment, so if I've made any incorrect or misleading statements, feel free to correct me on them.

nullplan · Post by **nullplan** » Thu May 14, 2020 11:13 pm

Nice explanation, but you got some details wrong:

Schol-R-LEA wrote:While this was a concern on older PCs (before, say, 2000 or so), newer systems which still have a Legacy BIOS boot option will invariably have the CS:IP set to 7C00:0000 on boot loader entry

That's 0000:7C00. You mixed CS and IP. CS will be initialized to 0 and IP will be set to 7C00. Though if you initialize a far pointer in memory, the first word will be 7C00 and the second one will be 0. Yeah, little endian gets confusing sometimes.
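
The word order is easy to check in Python with the standard `struct` module. This is just an illustration; `far_pointer_bytes` is a made-up helper name.

```python
import struct


def far_pointer_bytes(segment: int, offset: int) -> bytes:
    """Memory image of a 16:16 far pointer: offset word first, then segment word."""
    return struct.pack("<HH", offset, segment)


raw = far_pointer_bytes(0x0000, 0x7C00)
print(" ".join(f"{b:02x}" for b in raw))   # bytes in increasing address order

# Reading it back the way the CPU would: first word is the offset, second the segment.
offset, segment = struct.unpack("<HH", raw)
print(f"{segment:04X}:{offset:04X}")
```

So the pointer is written 0000:7C00 in segment:offset notation, but in memory the 7C00 word comes first and each word is itself stored low byte first.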

Schol-R-LEA wrote:

Code:

        JMP FAR 0x7C00:0000       ; this isn't really needed with any recent systems, but older boot loaders usually had it
        MOV DS, CS                ; copy CS -> DS
        MOV SS, CS                ; copy SS -> DS
        MOV SP, 0xFFFE            ; set the top of stack, advanced slightly as a precautionary step

The first line (assuming that was the same mistake as above) creates an infinite loop if it is the first instruction of the boot loader, since it will always just jump to itself. The next two are not legal x86 instructions: MOV cannot copy one segment register directly into another. In any case, there is no need to read out CS, since you know the value beforehand. You just set it.

Since you know the linear address the boot sector is loaded to, and since short jumps and calls are relative, there is usually no need to even initialize CS. CS:IP will be set up to combine to linear address 0x7C00 and will be set such that the entire sector can run. I mean, at the least CS can be 0, which gives more than 32kB of memory left for IP to overflow, so even more than one sector can be executed without knowing the precise values of CS and IP. And for data references you can set origin and data segment correctly. Like this:

Code:

org 0x7C00
xor ax, ax
mov ds, ax
mov es, ax
lss sp, [stackptr]
[...]
stackptr: dd 0x00007c00

I would set SP at the start to 7C00, since the memory between 0x600 and 0x7c00 is unused, and this allows us to use the memory beyond the boot sector to load more stuff as necessary. It is unlikely that the stack would ever grow so large that it would write into the BDA. Of course, you can enforce this hope by setting SS to 0x0060 and SP to 0x7600. Also, using LSS to load a new stack avoids the need for CLI and STI around MOV instructions.
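
To spell out the numbers, here is a small Python sketch. The helper names `decode_lss_operand` and `linear` are invented for illustration; the values are the ones from the post.

```python
import struct


def decode_lss_operand(dword: int) -> tuple:
    """Split the dword read by a 16-bit LSS: low word -> SP, high word -> SS."""
    sp, ss = struct.unpack("<HH", struct.pack("<I", dword))
    return ss, sp


def linear(segment: int, offset: int) -> int:
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset


# The value stored at stackptr in the code above.
ss, sp = decode_lss_operand(0x00007C00)
print(f"SS:SP = {ss:04X}:{sp:04X}, top of stack at linear 0x{linear(ss, sp):X}")

# Alternative layout: same top of stack, but offset 0 of the stack segment is linear 0x600.
print(hex(linear(0x0060, 0x7600)), hex(linear(0x0060, 0x0000)))
```

Both layouts put the top of the stack at linear 0x7C00. With SS = 0x0060 the lowest address SP can reach before it wraps is linear 0x600, so the stack cannot grow down into the BDA.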

Schol-R-LEA wrote:... in which case, the data and stack are in the same segment as the code, with the data following just after the code, in what is sometimes referred to as the 'tiny memory model'.

Memory models are a term of art typically referring to ABIs. This one references the models used in DOS. However, in a boot sector, we are pre-DOS. So they don't really apply. Also, segments overlap massively, meaning that different segments don't necessarily mean different memory. I just advocated for putting SS into segment 0x0060 but still below the code and data, which are in segment 0x0000.

Octocontrabass · Post by **Octocontrabass** » Fri May 15, 2020 3:49 am

nullplan wrote:Also, using LSS to load a new stack avoids the need for CLI and STI around MOV instructions.

MOV to SS prevents interrupts between itself and the following instruction, so you can also avoid the CLI/STI if you arrange your instructions so that the MOV to SP immediately follows the MOV to SS, like so:

Code:

mov ss, ax
mov sp, 0x7c00

This might also save you a few bytes, which is important in a boot sector.

(And if you're worried about the old 8086/8088 erratum that incorrectly allowed interrupts immediately after a MOV to SS, you can't use the LSS instruction anyway, since that instruction was introduced with the 386.)

Schol-R-LEA · Post by **Schol-R-LEA** » Sat May 16, 2020 11:01 am

@Nullplan: I had a feeling I had gotten some things wrong, though I am now kicking myself for trying to go from memory rather than checking the details more carefully. Thank you for all of that.

Regarding the far jump, I meant to put a label at the second instruction in that set and jump to that, but given all of the other mistakes I made, this is the least of the problems that code example has. Sorry.

EDIT: Wow, I can't even address the correct person in my replies. That's worrying. I've fixed it now.

OSDev.org
