To address another aspect of this which has been mostly ignored so far:
ITchimp wrote:bp, sp has to be paired with ss
By 'paired' I assume you mean that they are offsets of the Stack Segment (SS) register. This is a 'yes, but actually, no' sort of thing. Why do I say this? Because SP points to the top of the process stack (which in this case is simply the hardware stack, since no other processes exist), and in x86, hardware stacks grow downward. While SP is a stack offset, it has some special behaviors due to its intended use.
When SP is 0x0000, then the stack is empty. When you PUSH a 16-bit value onto the stack (stack values are always the system word in length, so in real mode and 16-bit pmode they are two bytes, in 32-bit pmode they are four, etc.), it first decrements SP by two - which, wrapping around, becomes 0xFFFE - then copies the pushed value to the bytes at addresses 0xFFFE and 0xFFFF, in little-endian order (i.e., with the low-order byte at the lower address; so if you PUSHed 0x1234, then 0x34 would be at 0xFFFE and 0x12 would be at 0xFFFF).
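To make the byte order concrete, here is a small sketch in NASM syntax. It assumes real mode, that SS already holds a usable segment, and that SP is 0x0000; the register choices are only for illustration.

```nasm
        BITS 16
        ; Assumption: SS is already valid and SP = 0x0000 (empty stack).
        MOV  AX, 0x1234
        PUSH AX             ; SP wraps from 0x0000 to 0xFFFE, then the word is stored at SS:SP
        MOV  BP, SP         ; BP-based addressing uses SS by default
        MOV  BL, [BP]       ; BL = 0x34, the low-order byte at SS:0xFFFE
        MOV  BH, [BP+1]     ; BH = 0x12, the high-order byte at SS:0xFFFF
        POP  AX             ; AX = 0x1234 again, and SP is back to 0x0000
```

Reading the two bytes back separately is just a way to see where each half of the word ended up; normally you would simply POP the value.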
As for BP, that is usually only used for function stack frames; the idea is that when you enter the function, you push the existing BP, then copy the current SP to BP, allowing you to then a) use BP with an index to access the arguments and local variables, and b) restore SP before returning from the function call (which itself changes SP once the jump back to the caller is completed, but bear with me on this; if you don't restore the stack, it won't be pointing to the return address and the RET will jump to who-knows-where). While it is indeed an offset from SS, it does not get automatically set by anything.
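As a rough illustration of that frame setup, here is a sketch of a near function in NASM syntax. The function name add_two, the right-to-left argument order, and the caller-cleans-up convention are all assumptions made for the example, not something the hardware requires.

```nasm
        BITS 16
; Hypothetical near function add_two(a, b); result is returned in AX.
add_two:
        PUSH BP             ; save the caller's BP
        MOV  BP, SP         ; BP now marks this function's frame
        SUB  SP, 2          ; reserve one 16-bit local at [BP-2]
        MOV  AX, [BP+4]     ; a  ([BP] = saved BP, [BP+2] = return IP)
        ADD  AX, [BP+6]     ; b
        MOV  [BP-2], AX     ; store to the local, only to show the addressing
        MOV  AX, [BP-2]     ; load it back as the return value
        MOV  SP, BP         ; discard the locals; SP points at the saved BP again
        POP  BP             ; restore the caller's BP
        RET                 ; SP now points at the return address

caller:
        MOV  AX, 7
        PUSH AX             ; b, pushed first
        MOV  AX, 5
        PUSH AX             ; a, pushed last so it sits at the lower address
        CALL add_two        ; AX = 12 on return
        ADD  SP, 4          ; caller removes the two arguments
```

If the MOV SP, BP were left out, SP would still point at the local when POP BP runs, so BP would get the wrong value and the RET would pop the saved BP as its return address.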
ITchimp wrote:eip has to be paired with ds
No. First off, I assume you meant IP, not EIP, since this is real mode. More importantly, the Instruction Pointer is an offset from the Code Segment (CS) origin, not the Data Segment (DS) origin. While a bootloader will almost always set DS == CS, this is not necessarily going to be the case, and it is not an automatic one by any means.
While you can (and in this instance should) use a MOV to set SS or DS, you cannot MOV a value into CS; it only changes through a far control transfer, which in practice means a FAR jump. This is why most boot loaders (after the initial NEAR JMP at the start) begin with something like this:
Code:
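JMP 0x0000:start ; far jump to force CS to a known value (assumes ORG 0x7C00); this isn't really needed with any recent systems, but older boot loaders usually had it
start: ; label added so the far jump has a target
MOV AX, CS ; a segment register can't be copied directly to another one, so go through AX
MOV DS, AX ; copy CS -> DS
MOV SS, AX ; copy CS -> SS (interrupts are held off until after the next instruction)
MOV SP, 0xFFFE ; set the top of stack, advanced slightly as a precautionary step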
... in which case, the data and stack are in the same segment as the code, with the data following just after the code. This is what is sometimes referred to as the 'tiny memory model'.
In general, you will want the code and data segments to overlap, since it is likely that you will need to include some data as part of the boot sector image. Things are more flexible with the stack, which doesn't need to overlap this way but should still be explicitly set by the boot loader to ensure that the memory layout is the one you expect it to be. Since most boot-sector bootloaders don't touch the stack much if at all, tiny model is just a convenient way to handle everything all at once.
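For what it's worth, here is a minimal sketch of a complete boot sector laid out this way, in NASM syntax (assemble with nasm -f bin). It assumes the BIOS loads the sector at physical address 0x7C00 and that the INT 0x10 teletype service is available; the labels and the message are made up for the example.

```nasm
        BITS 16
        ORG  0x7C00             ; offsets are relative to segment 0

        JMP  0x0000:start       ; force CS = 0 so CS:IP agrees with the ORG
start:
        MOV  AX, CS
        MOV  DS, AX             ; data lives in the same segment as the code
        MOV  SS, AX
        MOV  SP, 0xFFFE         ; stack in the same segment, well above the code
        CLD                     ; make LODSB walk upward through the string
        MOV  SI, msg            ; DS:SI -> string stored inside the boot sector
        XOR  BX, BX             ; BH = display page 0 for the teletype call
.next:
        LODSB                   ; AL = [DS:SI], then SI = SI + 1
        TEST AL, AL
        JZ   .done              ; stop at the terminating zero byte
        MOV  AH, 0x0E           ; BIOS teletype output
        INT  0x10
        JMP  .next
.done:
        HLT
        JMP  .done              ; stay halted if an interrupt wakes the CPU

msg:    DB   'Hello', 0         ; data follows the code in the same image
        TIMES 510-($-$$) DB 0   ; pad the image to 510 bytes
        DW   0xAA55             ; boot signature in the last two bytes
```

The string is reachable through DS only because DS was set equal to CS; if the two differed, the offset of msg would point somewhere else entirely.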
As I noted, the far jump I showed there is only really needed if you aren't certain about the state the BIOS has the registers in when it jumps to the boot loader. The boot sector is always loaded at physical address 0x7C00, but that address can be expressed either as 0000:7C00 or as 07C0:0000, and some older BIOSes used the latter. While this was a concern on older PCs (before, say, 2000 or so), newer systems which still have a Legacy BIOS boot option will almost always have the CS:IP set to 0000:7C00 on boot loader entry (while still newer systems may not have a Legacy BIOS at all, only the UEFI firmware, in which case this type of loader won't work at all). You'll still see it in some boot loaders, but at this point it is just a hold-over from the bad old days of the PC/XT clone market.
Eventually, you'll need to re-arrange the segment registers (or segment selectors, once you've moved to pmode or long mode), but for now this should be sufficient for the first stage boot loader.
I'm a bit tired at the moment, so if I've made any incorrect or misleading statements, feel free to correct me on them.