ljmp offset

asmboozer

ljmp offset

Post by asmboozer »

In Linux 2.4, the file bootsect.S has the statement below:

ljmp $INITSEG , $go

go:
movw ....

I don't know what the value of $go is.
After the jmp is executed, will cs:(e)ip be $INITSEG:$go?
If so, $go would have two different meanings:
one is the offset from the global start symbol,
the other is the offset in executing memory, which in this case
is the offset into the memory starting at ($INITSEG << 4).

Do they have the same value? Are they related?



What I want to ask is: what is the value of the offset $go?
Since the jmp instruction will cause
the program to execute at ($INITSEG << 4) + $go,
I don't understand the difference between the two offsets
(one is $go in the source code, the other is the offset
in memory).
Can anyone help me understand it?
Regards.
df

Re:ljmp offset

Post by df »

All it is, is a jmp instruction jumping to a fixed address.
It has nothing to do with the global start symbol.

LJMP is not an IP-relative opcode. It takes a fixed address.
-- Stu --
asmboozer

Re:ljmp offset

Post by asmboozer »

In this case, where will ljmp jump to?
What's the value of $go? 0? Or something else?

If ljmp jumps to ($INITSEG << 4) + $go, the register (e)ip would now have the same value. Then how does the code movw ... after the label go get executed?
asmboozer

Re:ljmp offset

Post by asmboozer »

Does anyone have a better answer for me? Thanks.
asmboozer

Re:ljmp offset

Post by asmboozer »

It's not the same as
jmp 07c0h:start

start:

which I can understand, since after POST the BIOS puts us at segment 0x7c0.

Whereas in bootsect.S, $INITSEG is not the same case.
asmboozer

Re:ljmp offset

Post by asmboozer »

Any 16-bit asm gurus here? I think you would already know the trick. Would you tell me?
Curufir

Re:ljmp offset

Post by Curufir »

E.g.

Code: Select all

ORG SOMEORG

JMP SOMESEG:SOMEOFFSET

SOMEOFFSET:
The value SOMEORG is the address where the program is assumed to start.

The value SOMEOFFSET is an offset from the start of the program. All labels are handled like this by the assembler.

Since the jump is long this is encoded as an absolute address by adding SOMEORG to SOMEOFFSET.

The logical address (cs:ip) after the jump will be SOMESEG:(SOMEOFFSET+SOMEORG).

The physical address after the jump will be ( ( SOMESEG * 0x10) + (SOMEOFFSET + SOMEORG) ).

Confusion sometimes arises because you have to handle segments manually. Most assemblers can only use ORG/Labels to determine an appropriate IP/EIP, not an appropriate CS.
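
The address arithmetic above can be sketched in a few lines of Python. This is only an illustration, not anything the assembler or the kernel does: the function names and the sample values (an offset of 0x0005, segments 0x0000 and 0x07C0) are assumptions made up for the example.

```python
def physical_address(segment, offset):
    """Real-mode translation: (segment * 0x10) + 16-bit offset."""
    return (segment << 4) + (offset & 0xFFFF)


def far_jump_target(someseg, someorg, someoffset):
    """Return the (CS, IP) pair a far jump to SOMESEG:label would load.

    The assembler encodes the label as SOMEORG + SOMEOFFSET, truncated
    to 16 bits because IP is a 16-bit register in real mode.
    """
    ip = (someorg + someoffset) & 0xFFFF
    return someseg, ip


# Hypothetical example: the same label, 5 bytes into the program,
# assembled once with ORG 0x7C00 and once with ORG 0x0.
cs_a, ip_a = far_jump_target(0x0000, 0x7C00, 0x0005)
cs_b, ip_b = far_jump_target(0x07C0, 0x0000, 0x0005)

print(hex(physical_address(cs_a, ip_a)))  # 0x7c05
print(hex(physical_address(cs_b, ip_b)))  # 0x7c05
```

Both logical addresses name the same physical byte, which is why the choice of CS has to agree with the ORG the code was assembled with.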

Hopefully something in there answers your question (I couldn't understand what you were asking so just dumped various information).
asmboozer

Re:ljmp offset

Post by asmboozer »

Thanks Curufir,
I know the example you listed.

I said:

It's not the same as
jmp 07c0h:start

start:

which I can understand, since after POST the BIOS puts us at segment 0x7c0.

Whereas in bootsect.S, $INITSEG is not the same case.

$INITSEG is not the ORG SOMEORG. I don't have bootsect.S with me now; tomorrow I will paste it here.
Curufir

Re:ljmp offset

Post by Curufir »

Ok. I think, maybe, that I see your problem.

The code is loaded into segment 0x7c0 and the BIOS starts execution at 0x7c0:0x0 (or an equivalent address such as 0x0000:0x7c00).

The first thing Linux's bootsect.S does is copy itself to segment INITSEG (usually 0x9000).

The long jump:

Code: Select all

ljmp $INITSEG , $go
Is then done to set CS. This is so that the processor uses the copied version, not the original.

This works just fine because the assembler assumes all labels occur within the same segment and Linux effectively uses an ORG of 0x0 (it doesn't actually specify it). When the jump changes the code segment to run in the copy, all the label addresses/offsets still work because that assumption (ORG 0x0) is still valid for the copied bootsect.S.
asmboozer

Re:ljmp offset

Post by asmboozer »

thanks for the information.


   movw   $BOOTSEG, %ax
   movw   %ax, %ds
   movw   $INITSEG, %ax
   movw   %ax, %es
   movw   $256, %cx
   subw   %si, %si
   subw   %di, %di
   cld
   rep
   movsw
   ljmp   $INITSEG, $go


When the bootsect.S binary is loaded into memory,
it begins at address 0x7c0:00, so $go has a value based on that address. Am I right on this point?

When ljmp $INITSEG, $go is executed, it will jump to $INITSEG:$go, with $go taking the value as above (but I don't know what value that is yet; can you tell me?).
Then it executes the machine instruction at $INITSEG:$go. Am I right on this point? If so, how can the machine instruction at $INITSEG:$go be the same as the instructions below the label go, since they are in different memory regions (one begins at 0x7c00, the other begins at $INITSEG << 4)?
df

Re:ljmp offset

Post by df »

Did you read the asm? Did you see the movsw? It copied itself.
-- Stu --
asmboozer

Re:ljmp offset

Post by asmboozer »

OK, I think I get it. It's because the code itself has been copied to $INITSEG, together with what you said below:

This works just fine because the assembler assumes all labels occur within the same segment and Linux effectively uses an ORG of 0x0 (It doesn't actually specify it). When the jump changes the code segment to run in the copy, all the label addresses/offsets still work because that assumption ( ORG 0x0 ) is still valid for the copied Bootsect.S.


So if the code itself had not been copied, I think ljmp $INITSEG, $go would be wrong. Am I right?


Later on, it calls ljmp $SETUPSEG, $0. That will work fine, because
execution is already in $INITSEG, and after load_setup, $SETUPSEG (0x9020, i.e. physical address 0x90200) is at ($INITSEG << 4) + 512.
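
The relation between the two segments can be checked with a little arithmetic. The sketch below is only an illustration: it assumes INITSEG is 0x9000 and SETUPSEG is 0x9020 (the usual values, but check your own bootsect.S), and the helper name is made up.

```python
INITSEG = 0x9000    # assumed: segment the boot sector copies itself to
SETUPSEG = 0x9020   # assumed: segment the setup code is loaded at
SECTOR_SIZE = 512   # the boot sector is one 512-byte sector


def physical_address(segment, offset):
    """Real-mode translation: (segment * 0x10) + 16-bit offset."""
    return (segment << 4) + (offset & 0xFFFF)


# First byte after the relocated boot sector.
end_of_bootsect_copy = physical_address(INITSEG, 0) + SECTOR_SIZE

# Target of 'ljmp $SETUPSEG, $0'.
setup_entry = physical_address(SETUPSEG, 0)

print(hex(end_of_bootsect_copy))  # 0x90200
print(hex(setup_entry))           # 0x90200
```

Note the parentheses: the shift has to be applied to the segment before 512 is added.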
asmboozer

Re:ljmp offset

Post by asmboozer »

df wrote: did you read the asm? did you see the movsw? it copied itself.

It's not that I didn't read it, I just forgot that fact. lol
Curufir

Re:ljmp offset

Post by Curufir »

When it's generating the code the assembler treats all labels as offsets from the start address, with the start address being defined by the ORG statement (Which, as mentioned, is assumed to be ORG 0x0 if not present in the code).

So if, for the sake of example, I have something like this:

Code: Select all

ORG 0x0

mov %DS, %AX                  # 2 bytes, offsets 0x0-0x1

ljmp   $SOMESEG, $mylabel     # 5 bytes, offsets 0x2-0x6

mylabel:                      # offset 0x7
Now I happen to know that 'mov %DS, %AX' is encoded as 2 bytes, and that a long jump in real-mode is encoded as 5 bytes. So the assembler will treat the label 'mylabel' as having a value of 0x7 because the instruction following the label 'mylabel' will have an offset of 0x7 bytes into the file and the start address is 0x0 (Because I used ORG 0x0).

What this means is that the instruction 'ljmp $SOMESEG, $mylabel' will load CS with SOMESEG, and IP with 0x7. The effect is to jump to the address SOMESEG:0x7.
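
The way the label gets its value can be mimicked in Python. This is a rough sketch, not how a real assembler is implemented: the instruction sizes are the ones quoted above, and the function name is hypothetical.

```python
# Encoded sizes in bytes, as stated in the post above.
instructions_before_label = [
    ("mov %ds, %ax", 2),
    ("ljmp $SOMESEG, $mylabel", 5),
]


def label_value(org, instructions):
    """A label's value is ORG plus the size of everything before it."""
    return org + sum(size for _, size in instructions)


mylabel = label_value(0x0, instructions_before_label)
print(hex(mylabel))  # 0x7

# The same label assembled with a different ORG gets a different value.
print(hex(label_value(0x100, instructions_before_label)))  # 0x107
```

The far jump then loads IP with that value as-is; nothing is added to it at run time.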

Now let us say that my little example is loaded to 0x07C0:0x0000 (As Bootsect is).

Let us assume for a moment that I set SOMESEG to be 0x07c0.

When the instruction 'ljmp $SOMESEG, $mylabel' is reached the processor will jump to 0x07c0:0x0007. This will be the address of the instruction following the label 'mylabel'.

Now let's say I have made a copy of my program at 0x9000:0x0000.

If I set SOMESEG to be 0x9000 then when the instruction 'ljmp $SOMESEG, $mylabel' is reached the processor will jump to 0x9000:0x0007. This will be the address of the instruction following the label 'mylabel' in the copy of my program.

I can switch between the original and copy as many times as I like using 'ljmp' to change the code segment and have everything work because the labels are just offsets within the segment and our assumption that the offset for the start of the program is 0x0 (ORG 0x0) is true for both the copy and the original.

What I couldn't do is make a copy of my program at 0x9000:0x0100 and try to do the same thing. Because a long jump is encoded as an absolute address (Not relative) my 'ljmp $SOMESEG, $mylabel' would still be jumping to 0x9000:0x0007 which would no longer be the instruction following 'mylabel' in the copy. In this case my assumption of ORG 0x0 is not true for the copy of the program. The start address of the copy would actually be 0x100 and I would have to adjust the code to compensate.
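
To make the failure concrete, and to show two possible ways of compensating, here is a small Python sketch. It is an illustration under assumptions, not the thread author's method: the label value 0x7 comes from the example above, and the second fix only works when the start offset is a multiple of 16.

```python
MYLABEL = 0x7          # label value when assembled with ORG 0x0
COPY_SEG = 0x9000
COPY_START = 0x0100    # the copy begins at 0x9000:0x0100


def physical_address(segment, offset):
    """Real-mode translation: (segment * 0x10) + 16-bit offset."""
    return (segment << 4) + (offset & 0xFFFF)


# Where the instruction after 'mylabel' really lives in the copy.
wanted = physical_address(COPY_SEG, COPY_START) + MYLABEL

# Unmodified jump: lands 0x100 bytes too early.
naive = physical_address(COPY_SEG, MYLABEL)

# Fix 1: add the start offset to the label, as if assembled with ORG 0x100.
fix_offset = physical_address(COPY_SEG, COPY_START + MYLABEL)

# Fix 2: fold the start offset into the segment instead (0x100 >> 4 = 0x10),
# so the ORG 0x0 label values stay valid. Needs a 16-byte-aligned start.
fix_segment = physical_address(COPY_SEG + (COPY_START >> 4), MYLABEL)

print(hex(wanted), hex(naive))            # 0x90107 0x90007
print(hex(fix_offset), hex(fix_segment))  # 0x90107 0x90107
```

The second form is the same trick as addressing the boot sector as 0x07c0:0x0000 rather than 0x0000:0x7c00.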

***

Back to bootsect.S

It should be pretty obvious now what's going on, but I'll run through it very slowly anyway to try and clear it up.

Code: Select all

   movw   $BOOTSEG, %ax
   movw   %ax, %ds
   movw   $INITSEG, %ax
   movw   %ax, %es
   movw   $256, %cx
   subw   %si, %si
   subw   %di, %di
   cld
   rep
   movsw
First thing to realise is that there is no ORG statement at the start of this code. Because there is no ORG statement the assembler assumes ORG 0x0.

DS is loaded with BOOTSEG (0x7c0)
ES is loaded with INITSEG (Usually 0x9000)
SI and DI are set to 0, direction flag is cleared.

BIOS loads the bootsector to 0x7c0:0x0000 and executes the first instruction. At this point CS:IP = 0x7c0:0x0000 (Or an equivalent address).

The bootsector then copies 512 bytes (CX = 256 words, 2 bytes per movsw) from BOOTSEG:0x0000 to INITSEG:0x0000. This has the effect of making a copy of itself which starts at INITSEG:0x0000.

Because the start offset of both the original and copy is 0x0 our ORG 0x0 assumption is correct for both the original and the copy. So the label offsets will be correct in both.
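
The copy loop can be simulated in Python to see why every offset lines up. This is a simplified model under assumptions: memory is a flat 1 MiB bytearray, the boot sector contents are placeholder bytes rather than real code, and the function names are made up.

```python
BOOTSEG = 0x07C0
INITSEG = 0x9000   # assumed usual value


def physical_address(segment, offset):
    """Real-mode translation: (segment * 0x10) + 16-bit offset."""
    return (segment << 4) + (offset & 0xFFFF)


def rep_movsw(memory, ds, es, si, di, cx):
    """Copy CX words from DS:SI to ES:DI with the direction flag clear."""
    while cx != 0:
        src = physical_address(ds, si)
        dst = physical_address(es, di)
        memory[dst:dst + 2] = memory[src:src + 2]
        si = (si + 2) & 0xFFFF   # movsw advances both index registers by 2
        di = (di + 2) & 0xFFFF
        cx -= 1
    return si, di, cx


memory = bytearray(1 << 20)  # 1 MiB of real-mode address space
# Placeholder bytes standing in for the 512-byte boot sector.
boot_sector = bytes((i * 7 + 3) & 0xFF for i in range(512))
memory[0x7C00:0x7C00 + 512] = boot_sector

si, di, cx = rep_movsw(memory, BOOTSEG, INITSEG, si=0, di=0, cx=256)
print(hex(si), hex(di), cx)  # 0x200 0x200 0

# Every offset inside the sector names the same byte in both images.
same = all(
    memory[physical_address(BOOTSEG, off)] == memory[physical_address(INITSEG, off)]
    for off in range(512)
)
print(same)  # True
```

Whatever value the assembler gave 'go', the byte at BOOTSEG:go and the byte at INITSEG:go are identical after the copy, so only CS needs to change.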

CS is still BOOTSEG at this point.

The code then makes a long jump to change CS into INITSEG.

Code: Select all

   ljmp   $INITSEG, $go
After the jump CS = $INITSEG and ip = $go and execution continues with the instruction following 'go' in the copy.

***

Hopefully that makes sense to you.
asmboozer

Re:ljmp offset

Post by asmboozer »

Curufir wrote: What I couldn't do is make a copy of my program at 0x9000:0x0100 and try to do the same thing. Because a long jump is encoded as an absolute address (Not relative) my 'ljmp $SOMESEG, $mylabel' would still be jumping to 0x9000:0x0007 which would no longer be the instruction following 'mylabel' in the copy. In this case my assumption of ORG 0x0 is not true for the copy of the program. The start address of the copy would actually be 0x100 and I would have to adjust the code to compensate.

***
Thanks first.
In this case, do we have workarounds? How do you compensate for it?