ORG and near jumps in bootloader

Post by 4dr14n31t0r »

I have trouble understanding the ORG preprocessor directive.

As I understand it, a near jump is the same as a far jump that uses the current value of the cs register as its segment:

Code: Select all

jmp near 0x1234 --> jmp far cs:0x1234
I know that the value of the cs register is undefined when the bootloader starts. However, I checked its value in QEMU and it is always zero.
Whenever I do a far jump the result is exactly the same with or without

Code: Select all

org 0x7c00
But if

Code: Select all

jmp near 0x1234
is the same as

Code: Select all

jmp cs:0x1234
and cs is set to zero, then

Code: Select all

jmp near 0x1234
is the same as

Code: Select all

jmp 0x0000:0x1234
If the target of the far jump doesn't change with or without the org directive, and the cs register is always zero (so that jmp near 0x1234 is the same as jmp 0x0000:0x1234), why does the target change when I use the near jump? What exactly does ORG do?
As Paul R says here: https://stackoverflow.com/a/3407190/5744858
ORG is used to set the assembler location counter
What is that location counter and how does it change the operand of the jmp instruction?

Note that actually I am not jumping to 0x1234. That location is just an example.

Post by Brendan »

Hi,
4dr14n31t0r wrote: I have trouble understanding the ORG preprocessor directive.
OK, let's start with a very simple example (with data and no code at all):

Code: Select all

    org 0x0000

    dw myThing              ;Store the address of the "myThing" label in the output file

myThing:
In this case, the assembler thinks that the output file will be loaded at offset 0x0000 within the segment, so it determines that "myThing" is 2 bytes after that at offset 0x0002, so the file will contain the data 0x0002 (or 0x02, 0x00 as bytes because 80x86 is "little endian").

What if we change the ORG?

Code: Select all

    org 0x1234

    dw myThing              ;Store the address of the "myThing" label in the output file

myThing:
In this case, the assembler thinks that the output file will be loaded at offset 0x1234 within the segment, so it determines that "myThing" is 2 bytes after that at offset 0x1236. The file will contain the data 0x1236 (or 0x36, 0x12 as bytes because 80x86 is "little endian").

Let's add a move instruction:

Code: Select all

    org 0x1234

    dw myThing              ;Store the address of the "myThing" label in the output file

myThing:
    mov si,myThing
Here, it's similar to before - the assembler thinks that the output file will be loaded at offset 0x1234 within the segment, so the value 0x1236 is moved into SI.

Let's try some jumps:

Code: Select all

    org 0x1234

variable:
    dw myThing              ;Store the address of the "myThing" label in the output file

myThing:
    mov si,myThing
    jmp 0x0000:myThing      ;Absolute far jump
    jmp word [variable]     ;Absolute indirect jump
In this case the first jump instruction will become "jmp 0x0000:0x1236". The second jump will become "jmp word [0x1234]", and because the value at offset 0x1234 is 0x1236 it'll end up jumping to offset 0x1236.

Every single thing I've mentioned so far depends on what ORG says; and if you change ORG everything else will change.

Let's try some cases where ORG is irrelevant:

Code: Select all

    org 0x0000

variable:
    dw 0x1236

myThing:
    mov si,0x1236
    jmp 0x0000:0x1236      ;Absolute far jump
    jmp word [0x1234]      ;Absolute indirect jump
    jmp myThing
For most of these cases the assembler uses the value you told it to use and doesn't calculate the value itself, so the ORG makes no difference. Of course if you add or remove anything you'll have to calculate all of the values yourself by hand (and if you get one wrong it will create bugs), so it's a massive code maintenance nightmare.

The last instruction ("jmp myThing") is a relative jump. For this the assembler determines the address of the target "myThing" (which depends on ORG and will be wrong if ORG is wrong) and then subtracts the address of the byte after the instruction (which also depends on ORG and will be wrong if ORG is wrong); but this subtraction cancels out. Essentially it's like this:

Code: Select all

    (bytes_from_start_of_file_to_target + ORG) - (bytes_from_start_of_file_to_address_after_instruction + ORG)
Which is the same as this:

Code: Select all

    bytes_from_start_of_file_to_target - bytes_from_start_of_file_to_address_after_instruction
..which gives the same value regardless of ORG because the ORG cancels out.

However, for "jmp 0x1234" it'd be:

Code: Select all

    0x1234 - (bytes_from_start_of_file_to_address_after_instruction + ORG)
..which does depend on ORG because the ORG isn't cancelled out.
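
To make the arithmetic concrete, here is a small sketch. It assumes NASM producing a flat binary (nasm -f bin) in 16-bit mode; the label name and the choice of 0x7c00 are only illustrative. A near jump to a constant is encoded as E9 followed by a 16-bit displacement, so the instruction is 3 bytes long, and when it is the first thing in the file the address after it is ORG + 3.

```nasm
bits 16
org 0x7c00              ; assumption: the file is loaded at offset 0x7c00

start:                  ; hypothetical label, 0 bytes from the start of the file
    jmp near 0x1234     ; displacement = 0x1234 - (3 + 0x7c00) = 0x9631 (mod 0x10000)
                        ; so the bytes should be E9 31 96
```

With "org 0x0000" the same source line should give 0x1234 - (3 + 0x0000) = 0x1231, i.e. E9 31 12. The bytes in the file differ, which is why the near jump ends up somewhere else when ORG changes, while "jmp 0x0000:0x1234" always contains the same literal offset.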

Now...

If you tell the assembler that the output file will be loaded at offset 0x1234 within a segment (by using "ORG 0x1234") but the file is actually loaded at offset 0x0000 within a segment, then the assembler will get everything that depended on ORG wrong. For normal code (that uses labels to avoid a code maintenance nightmare) this means that all of your code will be broken when ORG is wrong (except for things like relative jumps which don't depend on ORG).


Cheers,

Brendan

Post by 4dr14n31t0r »

Conclusion: if $ is the address of the current instruction (its offset within the file plus the ORG value), then

Code: Select all

jmp near 0x1234
is the same as

Code: Select all

jmp short 0x1234 - $
The problem was that I believed that

Code: Select all

jmp near 0x1234
was the same as

Code: Select all

jmp cs:0x1234
Why do people say that near jumps are jumps within the same segment when the cs register is not even used?

Post by iansjack »

Because the segment register isn't used. So they can only be a jump to a location in the current segment.

(Well, of course it is used - it just isn't changed.)

Post by Schol-R-LEA »

Brendan wrote: The last instruction ("jmp myThing") is a relative jump. For this the assembler determines the address of the target "myThing" (which depends on ORG and will be wrong if ORG is wrong) and then subtracts the address of the byte after the instruction (which also depends on ORG and will be wrong if ORG is wrong); but this subtraction cancels out. Essentially it's like this:

Code: Select all

    (bytes_from_start_of_file_to_target + ORG) - (bytes_from_start_of_file_to_address_after_instruction + ORG)
Which is the same as this:

Code: Select all

    bytes_from_start_of_file_to_target - bytes_from_start_of_file_to_address_after_instruction
..which gives the same value regardless of ORG because the ORG cancels out.
I am afraid that it was my previous statements which were misleading to the OP. I thought that for an unspecified

Code: Select all

    jmp myThing
NASM would assemble it to something like FF <16-bit absolute offset>, but if I understand what you are saying correctly, it is assembling to E9 <16-bit relative offset>.

Now then, looking at what this opcode reference says about the JMP instruction, I see that FF takes a mod r/m <size> argument, and none of the opcodes which JMP can assemble to are non-indexed 16-bit absolute addresses. The only code that would assemble to FF would be of the form JMP [<register> + <displacement>], unless I am still mistaken.

Which means that JMP <label> would, as you stated, assemble to E9 <16-bit relative offset>.

I am not sure where my confusion arose, though I have some ideas (I think it came from incorrect recollection of things I had read back in the 1990s, though why I am so damnably befuddled is anyone's guess).

So, I apologize for my incorrect statements in the previous thread, 4dr14n31t0r. I clearly am still failing to do enough due diligence in my answers here.

Post by 4dr14n31t0r »

The key is that there is no direct absolute near jmp instruction:
http://x86.renejeschke.de/html/file_mod ... d_147.html
As you can see in that link, there are only two near jmp instructions that take a single constant operand:
E9 cw JMP rel16 Jump near, relative, displacement relative to next instruction.
E9 cd JMP rel32 Jump near, relative, displacement relative to next instruction.
However, both of them are relative jumps. If I want an absolute near jmp, it has to be indirect:
FF /4 JMP r/m16 Jump near, absolute indirect, address given in r/m16.
FF /4 JMP r/m32 Jump near, absolute indirect, address given in r/m32.
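
As a rough sketch of how these encodings look in source, assuming NASM in 16-bit mode with a flat binary loaded at 0x0000:0x7c00 and DS already set to 0 (all label names are hypothetical). The three forms are shown side by side for comparison only; as written, execution takes the first jump.

```nasm
bits 16
org 0x7c00                  ; assumption: loaded at 0x0000:0x7c00 with DS = 0

start:
    jmp near target         ; E9 rel16: relative to the next instruction

via_register:
    mov ax, target          ; absolute offset of target (depends on ORG)
    jmp ax                  ; FF E0 (FF /4): near, absolute, address in a register

via_memory:
    jmp word [pointer]      ; FF 26 disp16 (FF /4): near, absolute, address read from DS:pointer

target:
    hlt
    jmp target              ; stay halted

pointer:
    dw target               ; 16-bit absolute offset of target
```

The two FF /4 forms only land on target if ORG matches where the code really is, because the value loaded into ax and the value stored at pointer are absolute offsets calculated from ORG.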

Post by MichaelFarthing »

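This is fundamentally accurate, though it is actually possible to have a ModR/M byte that consists of a displacement only, i.e. [ void register ] + displacement.

The (16-bit) encoding would be FF 26 34 12: opcode FF, a ModR/M byte of 0x26 (mod = 00, reg = /4, r/m = 110, meaning a bare 16-bit displacement), then the displacement 0x1234. This form is still indirect: it jumps, within the current cs segment, to the offset stored in the word at [0x1234] rather than to 0x1234 itself.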

The way this would actually be written in the source code would depend on the assembler, and never to my knowledge having sought to do it I don't know how to. Instinctively, however, it might have to be written jmp []+0x1234

Post by MichaelPetch »

4dr14n31t0r wrote: I know that the value of the cs register is undefined when the bootloader starts. However, I checked its value in QEMU and it is always zero.
Just because CS appears to be 0 in one environment doesn't mean it will be in others. Back in the old days the El Torito specification suggested the default segment used to transfer control to a bootloader was 0x07c0 (and not 0x0000). In some versions of Bochs, if you boot from a floppy or hard drive the segment is 0x0000 and if you boot from a CD-ROM it is 0x07c0. There are real-world BIOSes (usually much older ones) where the segment may not be zero.

Although 0x07c0:0x0000 and 0x0000:0x7c00 point to the same physical address, there are situations where a bootloader can be written in such a way that the code may fail depending on the segment used. I wrote about such a situation in this Stack Overflow question/answer. Effectively, you can write your code to avoid the rarer situations where the actual value of CS matters, or you can have your bootloader do a FAR JMP to explicitly set CS. The worst thing you can do is copy the value of CS to DS, ES etc. I don't ever recommend doing this without a FAR JMP preceding it:

Code: Select all

mov ax, cs
mov ds, ax
mov es, ax 
If you do this then you propagate a potentially unwanted value from CS to the other segments (especially DS).
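
A minimal sketch of the far-jump approach, assuming NASM, a flat binary and "org 0x7c00" (so the offsets are calculated for segment 0x0000); the label names are hypothetical:

```nasm
bits 16
org 0x7c00                  ; offsets are calculated for segment 0x0000

start:
    jmp 0x0000:set_cs       ; far jump: CS becomes 0x0000 whatever the BIOS used
set_cs:
    mov ax, cs              ; CS is now known to be 0x0000
    mov ds, ax
    mov es, ax
```

After the far jump, CS matches what ORG assumes, so copying it to DS and ES no longer propagates an unknown value.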

Post by Octocontrabass »

MichaelFarthing wrote: The way this would actually be written in the source code would depend on the assembler, and never to my knowledge having sought to do it I don't know how to. Instinctively, however, it might have to be written jmp []+0x1234
In NASM syntax, it's the same as any other effective address on any other instruction:

Code: Select all

mov ax, [0x1234]
jmp [0x1234]
I would expect other assemblers to also accept their usual syntax for effective addresses.

Post by MichaelFarthing »

I was thinking of the Intel manual at the time, where for some reason the displacement is written outside the square brackets.