usefullness of seg:off pointers
This is a question for people who have written, are writing, or plan to write a real mode OS.
Where should I use near pointers, and where far?
Is it common for a caller to require the callee to handle more than 64 KB?
For example, int 13h uses far pointers. The entire IVT is based on far pointers, which is not surprising.
In protected mode they even added things like RPL to ensure secure segment usage.
That suggests I'm not supposed to use near pointers.
So what's the use of near pointers? They existed (call/jmp/retn) even on the 8086, when nobody thought of a flat address space.
That means they had, and still have, a valid use. What is it? Relative jumps, OK: loops. But relative calls? Indirect near jumps/calls?
Tell me when I am supposed to use far pointers, and when near. In real mode, of course.
Re: usefullness of seg:off pointers
Hi,
a5498828 wrote: Where should I use near pointers, and where far?
The simple answer is: use near pointers everywhere you can, and use far pointers when you have to. Even in real mode, loading a segment register adds some overhead.
The longer answer is... longer.
Because a segment is limited to 64 KiB, real mode OSs usually have several different "memory models" that affect segment usage. From memory:
- tiny - same segment used for all segment registers; fast (few segment register loads); "code+data+stack" is limited to a maximum of 64 KiB
- small - one segment for code, and one segment used for all data and stack segment registers; fast (few segment register loads); code is limited to 64 KiB and "data+stack" is limited to 64 KiB
- medium - multiple segments for code, and one segment used for all data and stack segment registers; slightly slower ("far call" needed a lot, but few data/stack segment register loads); code is limited to 640 KiB and "data+stack" is limited to 64 KiB
- compact - one segment for code, and multiple segments used for data and stack; even slower ("far pointers" needed for all data accesses); code is limited to 64 KiB and "data+stack" is limited to 640 KiB
- large - multiple segments for code and multiple segments for data and stack; even slower ("far call" and "far pointers"); "code+data+stack" is limited to 640 KiB
- overlays - similar to "large", but code is split up into modules/overlays and loaded into memory when needed; even slower; "data+stack" is limited to 640 KiB less the space reserved for "base code + modules/overlays", but there's no limit on the number of overlays, so code size is only really limited by disk size (or in practice, disk speed and performance requirements)
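To make the near/far distinction behind these memory models concrete, here is a rough C sketch of real-mode address arithmetic. The names (far_ptr, far_to_linear, near_to_linear, far_normalize) are made up for illustration and are not from any compiler's headers; the sketch also ignores A20 wrap-around at 1 MiB.

```c
#include <assert.h>
#include <stdint.h>

/* A far pointer as the CPU sees it in real mode: 16-bit segment plus 16-bit offset. */
struct far_ptr {
    uint16_t seg;
    uint16_t off;
};

/* Real-mode translation: linear = segment * 16 + offset (A20 wrap is ignored here). */
static uint32_t far_to_linear(struct far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* A near pointer is only the offset; the segment is whatever DS (or CS, SS) holds. */
static uint32_t near_to_linear(uint16_t current_seg, uint16_t near_off)
{
    return ((uint32_t)current_seg << 4) + near_off;
}

/* Move as much as possible into the segment part, so that the offset ends up below 16.
   Many seg:off pairs name the same byte; the normalised form is unique. */
static struct far_ptr far_normalize(struct far_ptr p)
{
    struct far_ptr n;
    n.seg = (uint16_t)(p.seg + (p.off >> 4));
    n.off = (uint16_t)(p.off & 0xF);
    assert(n.off < 16);
    return n;
}
```

For example, 1234:0010 and 1235:0000 both give linear address 0x12350, which is why far pointers need normalising before they can be compared, while near pointers within one segment can be compared directly.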
a5498828 wrote: Is it common for a caller to require the callee to handle more than 64 KB?
For things like the OS's API/s, the OS must use far pointers (to be able to support the more useful memory models, and other reasons), which mostly forces the caller to use far pointers when calling these API/s. For a program calling its own code, it depends on which memory model the program uses.
a5498828 wrote: So what's the use of near pointers? They existed (call/jmp/retn) even on the 8086, when nobody thought of a flat address space. That means they had, and still have, a valid use. What is it? Relative jumps, OK: loops. But relative calls? Indirect near jumps/calls?
Imagine a system with 100 separate pieces of code where each piece of code is less than 64 KiB. In this case you can use near pointers for all calls and jumps within the same piece of code, and only use slower far calls when something in one piece of code has to call something in another piece of code.
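As a rough illustration of why near and far calls are kept apart, the following C sketch models what each one leaves on the stack. The toy struct cpu and the helper names are invented for this example, and the stack is indexed in words rather than bytes to keep it short.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model of the registers involved in a call; sp counts 16-bit words. */
struct cpu {
    uint16_t cs, ip, sp;
    uint16_t stack[STACK_WORDS];
};

static void push16(struct cpu *c, uint16_t v)
{
    assert(c->sp > 0);
    c->stack[--c->sp] = v;
}

static uint16_t pop16(struct cpu *c)
{
    assert(c->sp < STACK_WORDS);
    return c->stack[c->sp++];
}

/* Near call: only the return IP is saved, CS is untouched (one word on the stack). */
static void call_near(struct cpu *c, uint16_t target, uint16_t return_ip)
{
    push16(c, return_ip);
    c->ip = target;
}

static void ret_near(struct cpu *c)
{
    c->ip = pop16(c);
}

/* Far call: CS is pushed first, then IP, and both registers are reloaded (two words). */
static void call_far(struct cpu *c, uint16_t seg, uint16_t off, uint16_t return_ip)
{
    push16(c, c->cs);
    push16(c, return_ip);
    c->cs = seg;
    c->ip = off;
}

/* Far return pops in the opposite order: IP first, then CS. */
static void ret_far(struct cpu *c)
{
    c->ip = pop16(c);
    c->cs = pop16(c);
}
```

Because a routine ending in a far return expects two words on the stack, it has to be reached with a far call, and a routine ending in a near return with a near call. That is why a function is normally declared as either near (internal to one piece of code) or far (callable from other segments), not both.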
A relative call is mostly just a near call with a shorter opcode (e.g. a 1-byte "distance from IP" stored in the instruction rather than a 2-byte "new value of IP" stored in the instruction).
Of course if you have to handle segmentation all over the place, and have to handle all the different memory models, and have to figure out what to do with free memory fragmentation and other problems, and have to write your own compiler/s because the only compilers that support real mode and segmentation also expect DOS; then you'd realise protected mode and paging is a lot easier in the long run (and usually faster and more efficient too). Note: I only mention this because a lot of beginners actually make the mistake of thinking "real mode" is easier, which is only true in the short term.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: usefullness of seg:off pointers
Thank you for the answer. Real mode is easier for me because the same argument (near vs far) carries over into protected and even long mode. And there you have things like privilege levels, gates, tasks, and other things. Paging is even more complicated. Real mode is just the most basic of it all, and when written right it will work on CPUs from the 1970s.
I just wonder how long code would take on the oldest 8086 if it runs in 1 second on a Core 2 Duo.
I think I will use near calls, as you suggested, for internal functions, and far for everything else.
Brendan wrote: A relative call is mostly just a near call with a shorter opcode (e.g. a 1-byte "distance from IP" stored in the instruction rather than a 2-byte "new value of IP" stored in the instruction).
There is a 1-byte relative call? I'm talking about x86, forgot to mention.
Re: usefullness of seg:off pointers
Hi,
a5498828 wrote: Thank you for the answer. Real mode is easier for me because the same argument (near vs far) carries over into protected and even long mode. And there you have things like privilege levels, gates, tasks, and other things. Paging is even more complicated. Real mode is just the most basic of it all, and when written right it will work on CPUs from the 1970s.
I know it seems like real mode is "easy", and in the beginning it is. With protected mode and paging the protection stuff could be ignored, segmentation can be ignored, all the gates (except "interrupt trap gates") aren't strictly necessary, you wouldn't need any TSS (until/unless you want to use protection), etc. You also get a clean way of hiding the underlying physical address space, which avoids things like memory fragmentation problems, working around "holes", working around "maximum memory" restrictions, etc. In the short term learning how to use paging is harder than not learning how to use paging; but it avoids a huge amount of hassle in the long run.
a5498828 wrote: I just wonder how long code would take on the oldest 8086 if it runs in 1 second on a Core 2 Duo.
If something takes 1 second on a modern OS running on a modern CPU, then it's likely that it'd take days on an 8086 (especially when you have to swap large amounts of code and data to/from disk just to get past the "lack of memory" problems). Even something as simple as decoding a 1024*768 bitmap would be a huge nightmare.
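To put a number on the bitmap case, here is a small C sketch that counts how many 64 KiB segments a decoded image needs. The pixel depth is not given above, so 3 bytes per pixel (24-bit colour) is purely an assumption, and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_BYTES   65536u     /* one real-mode segment */
#define REAL_MODE_BYTES 1048576u   /* the whole 1 MiB address space of an 8086 */

/* Size of an uncompressed bitmap in bytes (bytes_per_pixel is an assumed parameter). */
static uint32_t bitmap_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/* Number of 64 KiB segments needed to hold that many bytes, rounded up. */
static uint32_t segments_needed(uint32_t bytes)
{
    return (bytes + SEGMENT_BYTES - 1) / SEGMENT_BYTES;
}

/* Non-zero if the data could fit in the 1 MiB address space at all. */
static int fits_in_real_mode(uint32_t bytes)
{
    return bytes <= REAL_MODE_BYTES;
}
```

Under that assumption a 1024*768 image is 2,359,296 bytes, i.e. 36 full segments, which is more than the entire 1 MiB address space, so it would have to be decoded in pieces with the rest kept on disk. Even at 1 byte per pixel it takes 12 segments, so every pixel access needs far (or huge) pointer arithmetic.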
Brendan wrote: A relative call is mostly just a near call with a shorter opcode (e.g. a 1-byte "distance from IP" stored in the instruction rather than a 2-byte "new value of IP" stored in the instruction).
a5498828 wrote: There is a 1-byte relative call? I'm talking about x86, forgot to mention.
Heh - sorry. The "rel8" addressing is only for jumps (JMP, plus the conditional jumps and LOOP), not for CALL; a near relative CALL always carries a 2-byte displacement in 16-bit code.
Makes me wonder how many people write position independent real mode code..
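For what it's worth, here is a hedged C sketch of how the two relative forms are encoded in 16-bit code: JMP short is opcode EB with a signed 8-bit displacement, near CALL is opcode E8 with a 16-bit displacement, both measured from the end of the instruction. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* JMP short: EB + signed 8-bit displacement (2 bytes).
   Returns 1 on success, 0 if the target is too far for a signed byte. */
static int encode_jmp_short(uint16_t ip, uint16_t target, uint8_t out[2])
{
    uint16_t disp = (uint16_t)(target - ip - 2);   /* relative to the next instruction, modulo 64 KiB */
    assert(out);
    if (disp > 0x007F && disp < 0xFF80)
        return 0;                                   /* outside -128..+127 */
    out[0] = 0xEB;
    out[1] = (uint8_t)disp;
    return 1;
}

/* CALL near: E8 + 16-bit displacement (3 bytes). Any target in the same segment is reachable. */
static void encode_call_near(uint16_t ip, uint16_t target, uint8_t out[3])
{
    uint16_t disp = (uint16_t)(target - ip - 3);
    assert(out);
    out[0] = 0xE8;
    out[1] = (uint8_t)(disp & 0xFF);                /* little-endian displacement */
    out[2] = (uint8_t)(disp >> 8);
}

/* Where a relative branch lands: IP after the instruction plus the displacement, wrapping at 64 KiB. */
static uint16_t rel_target(uint16_t ip, uint16_t insn_len, uint16_t disp)
{
    return (uint16_t)(ip + insn_len + disp);
}
```

Since only the distance is stored, a CALL from 0100h to 0200h and a CALL from 8100h to 8200h produce the same three bytes (E8 FD 00), which is what makes relative near calls and jumps position independent within a segment.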
Cheers,
Brendan
Re: usefullness of seg:off pointers
Brendan wrote: Of course if you have to handle segmentation all over the place, and have to handle all the different memory models, and have to figure out what to do with free memory fragmentation and other problems, and have to write your own compiler/s because the only compilers that support real mode and segmentation also expect DOS;
Not true. The OpenWatcom compiler supports segmentation without the need for an underlying DOS system. It also supports the flat, small (and recently compact) memory models for 32-bit applications, with no assumption about an underlying DOS API. The compact memory model can also be optimized by always letting DS point to DGROUP, thus avoiding far pointers for local data.
Brendan wrote: I know it seems like real mode is "easy", and in the beginning it is. With protected mode and paging the protection stuff could be ignored, segmentation can be ignored, all the gates (except "interrupt trap gates") aren't strictly necessary, you wouldn't need any TSS (until/unless you want to use protection), etc. You also get a clean way of hiding the underlying physical address space, which avoids things like memory fragmentation problems, working around "holes", working around "maximum memory" restrictions, etc. In the short term learning how to use paging is harder than not learning how to use paging; but it avoids a huge amount of hassle in the long run.
The same thing can be said about using a flat memory model vs using a protected memory model with enforced segment isolation. It will take longer to code, and it might be slightly slower, but in the end it will contain fewer bugs because of the design choice. There is always a trade-off between speed, effort and protection.
Re: usefullness of seg:off pointers
Hello,
rdos wrote: The same thing can be said about using a flat memory model vs using a protected memory model with enforced segment isolation. It will take longer to code, and it might be slightly slower, but in the end it will contain fewer bugs because of the design choice. There is always a trade-off between speed, effort and protection.
Full ACK, and I think the speed loss from segmentation is really small, and you have a chance to gain additional speed through a better design (flexible memory sharing could be faster with segments).
Greetings
Erik
- Owen
Re: usefullness of seg:off pointers
rdos wrote: The same thing can be said about using a flat memory model vs using a protected memory model with enforced segment isolation. It will take longer to code, and it might be slightly slower, but in the end it will contain fewer bugs because of the design choice. There is always a trade-off between speed, effort and protection.
ErikVikinger wrote: Full ACK, and I think the speed loss from segmentation is really small, and you have a chance to gain additional speed through a better design (flexible memory sharing could be faster with segments).
Segment loads are vector path operations (i.e. done completely in order), so are themselves quite slow. Segments with non-zero bases combined with SIB addressing add an extra cycle to instructions (because the AGU can't do four additions in one). Relative branches all have a latency addition of 1 (branches have zero latency with segbase=0), far jumps have a minimum latency of 20 cycles IIRC, and are unpredicted, etc, etc...
Performance wise, segmentation is bad news.
(Source: AMD K8 & K10 optimization guides. Intel are much less helpful)
Re: usefullness of seg:off pointers
Owen wrote: Segment loads are vector path operations (i.e. done completely in order), so are themselves quite slow. Segments with non-zero bases combined with SIB addressing add an extra cycle to instructions (because the AGU can't do four additions in one). Relative branches all have a latency addition of 1 (branches have zero latency with segbase=0), far jumps have a minimum latency of 20 cycles IIRC, and are unpredicted, etc, etc... Performance wise, segmentation is bad news. (Source: AMD K8 & K10 optimization guides. Intel are much less helpful)
No wonder that AMD is losing the race against Intel. Providing an adder in hardware costs a few transistors, and such ICs were among the first digital ICs constructed.
Re: usefullness of seg:off pointers
Again, a market thing: If "no one" in the market still uses segmented memory (niche OSs notwithstanding), why bother to optimize for it?
I bet you a case of beer that Intel doesn't have those additional transistors, either.
Every good solution is obvious once you've found it.
Re: usefullness of seg:off pointers
But the real issue is that checking if the base is 0 takes a similar amount of time as doing an addition, so this is clearly a bad design. What they could mean is that 64-bit long mode (which forces segment bases to 0) is faster because it can skip adding the segment base, and not that a 32-bit OS that sets up non-zero segment bases will execute slower, but I don't know. Anyway, I provided a fix for this on CPUs that support the VME flag, so on those systems flat memory model applications will use a 0 base.
This is more politics than engineering. It seems that big companies like Microsoft can today get their favorite optimizations into hardware, while leaving competing designs in software. This is their way of boasting about how fast their code is, when in reality they have some really bloated code that should run slow as hell.
Re: usefullness of seg:off pointers
rdos wrote: This is their way of boasting about how fast their code is...
*cough* *sputter*
Microsoft? Fast code?
Have they ever claimed such a thing?
- Combuster
Re: usefullness of seg:off pointers
Owen wrote: Performance wise, segmentation is bad news.
Performance wise, TLB flushes and pagewalks are bad news. YMMV.
- Owen
Re: usefullness of seg:off pointers
rdos wrote: But the real issue is that checking if the base is 0 takes a similar amount of time as doing an addition
You can check if the base is 0 at segment register load time. You have to add for every address generation. Once the register is loaded, you distribute a bit around the sequencer saying whether SIB addresses need an extra segment base add cycle.
Re: usefullness of seg:off pointers
Owen wrote: You can check if the base is 0 at segment register load time. You have to add for every address generation. Once the register is loaded, you distribute a bit around the sequencer saying whether SIB addresses need an extra segment base add cycle.
Ever heard of parallel adding? There is no need to do those adds in sequence; rather, keep a block of as many adders as are necessary and perform them in parallel. Unused inputs are set to 0.
Re: usefullness of seg:off pointers
a5498828 wrote: I just wonder how long code would take on the oldest 8086 if it runs in 1 second on a Core 2 Duo.
In the early eighties the processor in a PC ran at 4.77 MHz. So if you assume that a two-core processor is the equivalent of a single core running at 4 GHz, something which would take the modern processor 1 second would have taken an 8086 about a quarter of an hour. Back in those days I used to hang a computer with:
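xor cx,cx       ; CX = 0, so the outer loop runs 65536 times
label:
push cx         ; save the outer counter
xor cx,cx       ; CX = 0 again for the inner loop
loop $          ; spin in place: 65536 passes until CX wraps back to 0
pop cx          ; restore the outer counter
loop label      ; 65536 * 65536 inner passes in total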
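The arithmetic behind the quarter-of-an-hour estimate and the length of that delay loop can be sketched in C as follows. This scales by clock frequency only, which is the assumption made above; it ignores that an 8086 needs several clocks per instruction, so it is better read as a lower bound. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Time on the slow machine if only the clock frequency differs. */
static double scale_by_clock(double fast_seconds, double fast_hz, double slow_hz)
{
    assert(slow_hz > 0.0);
    return fast_seconds * (fast_hz / slow_hz);
}

/* LOOP decrements CX first and repeats while CX != 0,
   so starting from CX = 0 it makes 65536 passes before falling through. */
static uint32_t loop_passes(uint16_t cx)
{
    return cx == 0 ? 65536u : cx;
}

/* Total passes of the inner "loop $" in the nested delay loop above. */
static uint64_t nested_delay_passes(void)
{
    return (uint64_t)loop_passes(0) * loop_passes(0);
}
```

With 4 GHz against 4.77 MHz, scale_by_clock(1.0, 4e9, 4.77e6) comes out at roughly 839 seconds, or about 14 minutes. The nested loop makes 65536 * 65536 = 4,294,967,296 inner passes, so it is a very long delay rather than a true hang.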