running 32-bit code in LM64

rdos
Member
Posts: 3264
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

devc1 wrote:I think that the only way for you now is to implement compatibility mode; it should not take a lot, right?
By the time this forum was created you could have implemented it LOL :)
There is no need to implement compatibility mode. The applications I have are all C++ based and can be recompiled for 64-bit mode. I can also make a gradual move to long mode if I wish to. I already have a driver that can load long mode device drivers, and I can (at least theoretically) run long mode applications based on GNU tools & newlib under the current design.
Octocontrabass
Member
Posts: 5488
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

rdos wrote:The most common problems are overwriting buffers, using objects after they are freed, and double frees, all of which can cause memory corruption in random areas. I don't want this in my OS kernel, so I'm absolutely not writing just another flat-memory model long mode kernel. There has to be some method to avoid this to make it an interesting project. For protected mode, the method is segmentation.
Fortunately for you, most people have been working on OSes that use flat address spaces, so you can look at methods other OSes use to combat these issues for inspiration in your hypothetical future 64-bit OS. Off the top of my head, I can think of runtime instrumentation and rewriting the code in a safer language, but maybe some other method will be more interesting for you.
rdos
Member
Posts: 3264
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

Octocontrabass wrote:
rdos wrote:The most common problems are overwriting buffers, using objects after they are freed, and double frees, all of which can cause memory corruption in random areas. I don't want this in my OS kernel, so I'm absolutely not writing just another flat-memory model long mode kernel. There has to be some method to avoid this to make it an interesting project. For protected mode, the method is segmentation.
Fortunately for you, most people have been working on OSes that use flat address spaces, so you can look at methods other OSes use to combat these issues for inspiration in your hypothetical future 64-bit OS. Off the top of my head, I can think of runtime instrumentation and rewriting the code in a safer language, but maybe some other method will be more interesting for you.
Having this during debug is hardly enough. It will need to be turned on all the time, which would slow down the code. It won't catch overwrites of static data either, nor overwrites of code.
Octocontrabass
Member
Posts: 5488
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

rdos wrote:Having this during debug is hardly enough. It will need to be turned-on all the time, which would slow down the code.
Right, that's why I suggested an option that will minimize the overhead from runtime safety checks.
rdos wrote:It won't catch overwrites of static data either, nor overwrites of code.
Intel released a successor to the 386 called the 486, and it adds a new bit to CR0 that you can set to enforce read-only pages at all privilege levels instead of only ring 3.
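For illustration, here is a minimal sketch of the bit in question. The constant matches CR0.WP (bit 16, introduced on the 486); the helper name is made up, and a real kernel would read and write CR0 itself with inline assembly rather than passing a value around:

```c
#include <stdint.h>

/* CR0.WP is bit 16 on the 486 and later. With it set, ring 0 writes to
   read-only pages fault just like ring 3 writes do, instead of being
   silently allowed. This helper only shows the bit manipulation. */
#define CR0_WP (1u << 16)

uint32_t cr0_with_wp(uint32_t cr0)
{
    return cr0 | CR0_WP;
}
```

With WP set and code/static data mapped read-only, the overwrites described above fault immediately instead of corrupting memory.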
kerravon
Member
Posts: 277
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

devc1 wrote:
kerravon wrote:
devc1 wrote:It's 2023 and you guys are still worried about 32 bit mode [-(

Use 32 bit registers in 64 bit mode, it's as easy as that.
Code written with such 32-bit overrides will end up doing 16-bit if run on an 80386. Not what I wish to do.
Recompile it in x64 architecture.
That's not my user requirement.

I want to be able to write my 80386 C code in 1986, compile it, lose the source code, and have my executable continue to run on every processor subsequent to the 80386, including the x64 in LM64.

No emulation. No recompilation.

Just a set of rules (road map) from AMD (in 1986) telling me what code my C compiler should generate to future-proof it.

Or if not AMD, then one of these professors/academics in some university could have seen the writing on the wall about a theoretical upgrade to 64 bit? ie we already had history of what happened from 8 to 16 bit, and 16 to 32. Does computer science not tell you what happens in the transition from 32 to 64?
kerravon
Member
Posts: 277
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote:
kerravon wrote:If you want to insist on your worldview, so be it - why isn't the MIPS (alone in the world supposedly)
MIPS is the only architecture I know of that doesn't require any mode changes to run all 32-bit software on a 64-bit CPU, but that doesn't mean there aren't others.
kerravon wrote:the right way to do things for all processors?
MIPS was probably just lucky.

On IBM mainframes, addresses were smaller than registers, and the unused upper bits were ignored in address calculations. On MIPS, addresses were the same size as registers, and the CPU's internal cache used every address bit even if external hardware ignored some upper address bits.
Apologies for the delay - other priorities, and I wanted to give this topic the attention it deserves.

Ok, so in the 1960s IBM created the S/360. It had 16 general purpose registers (GPRs). All of them were 32-bit in size. None of them were 24 bit. None of them were 31 bit.

So if a GPR was used to hold an address, it was a 32-bit register that was holding the address. Most op codes operated on 32-bit values. None at all operated on 31 bits. I can only think of two (ICM and STCM) that operate on 24 (ie 8, 16, 24 or 32) bits.

If you look at the code generated by the GCC 3.2.3 (original, from GNU) i370 target, you will find no assumption whatsoever that the 32-bit registers holding an address actually hold only 24 or 31 bits, or any other number.

How can this be considered anything other than 32-bit programming? Just because addresses happened to be masked (unconditionally in the 1960s and 1970s)? If someone puts something other than 0 in the masked bits, that is a cultural problem, not a technical deficit.
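A sketch of the address masking being described, assuming the usual S/370-family mode definitions (the helper name is mine, not IBM's):

```c
#include <stdint.h>

/* Effective address under the three address modes: AM24 (S/360, top
   byte ignored), AM31 (XA/ESA, top bit ignored), AM32/AM64 (no
   masking). A "properly written" program keeps the masked bits 0, so
   it computes the same address in all three modes. */
uint32_t effective_addr(uint32_t reg, int amode)
{
    switch (amode) {
    case 24: return reg & 0x00FFFFFFu; /* unconditional on early S/360 */
    case 31: return reg & 0x7FFFFFFFu;
    default: return reg;               /* the S360/67's AM32: all bits */
    }
}
```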

Then there is one exception. The S360/67 (ie from 1967). That was the first one that had an AMODE switch at all. And the AMODE was to switch to 32, not 31. ie no masking. ie a perfect 32-bit computer because there's nothing to quibble about (unless you want to quibble that it IPLs in 24-bit mode and the OS/loader/whatever needs to execute a few instructions to get into AM32).

But from then on, applications can be forced to execute (what I consider to be) correctly.

Similar to how I don't care about badly-behaved programs breaking from RM16 to PM16, I don't care about (what I consider to be) badly-behaved programs breaking during a switch from running on a machine with unconditional AM24 to a machine with an OS doing unconditional AM32.

In fact, from a technical perspective, you can simply throw every other 32-bit IBM machine in the bin and just look at the S360/67. IBM allowed indisputably 32-bit programs running on a S360/67 to run on a 64-bit z/Arch machine with no mode switch required. The ONLY consideration is the 4-8 GiB to 0-4 GiB virtual storage mapping that is required on all processors (it's computer science, not IBM's choice) to cope with negative indexes. I think I would have preferred to see this done in the hardware rather than relying on virtual storage. By default the machine (all 64-bit machines) should boot in a mode that does that mapping. The same applies to transitioning from 16-bit to 32-bit, or 64-bit to 128-bit, etc.

(this is just a suggestion that Babbage could maybe have made if he had thought about it enough. Or if not Babbage than someone else - before 1960 preferably).

So anyway - a properly-written program - and noting that ALL C code built with gcc i370 available since 1989:

C:\devel\gcc\gcc\config\i370>grep 198 i370.md
i370.md: ;; Copyright (C) 1989, 1993, 1994, 1995, 1997, 1998, 1999, 2000, 2002

C:\devel\gcc\gcc\config\i370>

is code that I consider to be "properly written": it is inherently 32-bit, and will ALSO run in an AM24 or AM31 environment. The code doesn't care if some address bits are being masked. The bit(s) should all be 0 anyway - at least if the code is written properly - as the i370-generated code is.

Let me stop here to keep this chain about IBM mainframes separate from the x86. I'll reply to the next bit next.
kerravon
Member
Posts: 277
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote: On x86, tons of opcodes were already assigned, and adding new 64-bit instructions without changing any opcodes would make all 64-bit instructions extremely long. On MIPS, every instruction is 32 bits, so there were plenty of free opcodes for new 64-bit instructions alongside the existing 32-bit instructions.
In hindsight, what should have been done at the time of the 8086?

They already had an issue changing from 8080 to 8086 - they forced everyone to recompile and potentially modify the source code because the op codes were all being reassigned. Should they have restricted themselves to say 128 of the 256 op codes, and then allow the 80386 to take 64 of the remaining 128, and then allow the x64 to take 32 of the remaining 64? ie allow room for expansion.
kerravon wrote:Specifically I think the MSDOS (actually PDOS/86) executables need to not have any functions that cross a 64k boundary.
That seems like an unnecessary limitation. Not all pointers have to be huge pointers.
A misunderstanding here.

I said functions, ie code. There is no "huge" for code, only for data. So I'm talking about medium or large (huge too, but it's not important) memory model programs here.

And I'm talking about the linker doing some work. If there is a.obj, b.obj and c.obj, all 30k in size, then what I believe needs to be done is for a.obj and b.obj to be contiguous, then 4k of NULs, then c.obj.
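The layout rule just described can be sketched as follows (a hypothetical linker helper; the 30k object sizes are from the example above):

```c
#include <stdint.h>

/* Place an object at the current layout cursor unless its code would
   cross a 64 KiB boundary; in that case pad with NULs up to the next
   boundary first, so no function ever straddles a segment. Returns the
   placement address of the object. */
uint32_t place_object(uint32_t cursor, uint32_t size)
{
    uint32_t first_tile = cursor / 0x10000u;
    uint32_t last_tile  = (cursor + size - 1) / 0x10000u;
    if (first_tile != last_tile)               /* would cross 64 KiB */
        cursor = (first_tile + 1) * 0x10000u;  /* emit NUL padding */
    return cursor;
}
```

With three 30 KiB objects, a.obj lands at 0 and b.obj at 30720; c.obj would start at 61440 and cross 65536, so it gets pushed to 65536 - the 4k of NULs in the example.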

I'm trying to minimize the changes required to the MZ executable format. Having some NULs doesn't change the format at all. However, I'm not sure if that is enough. That is my question. IBM/Microsoft came up with a "NE" format when they switched to the 80286. But they had a different goal to me. I'm trying to create a PM16 (or PM32 with D-bit set) OS designed to transparently run ("properly-written" - to be defined) RM16 MZ executables.

PDOS/286 will tile the entire 16 MiB (or more - it won't be tied to just the actual 80286) up to the availability of selectors (both GDT and LDT will be used). The selectors will be paired for code and data.
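As a sketch of the tiling: the index<<3 | TI<<2 | RPL selector layout below is the 286's, but the one-selector-per-64-KiB policy and the RPL choice are assumptions of mine, not PDOS/286's actual implementation:

```c
#include <stdint.h>

/* One data selector per 64 KiB tile of linear memory, allocated
   consecutively from the GDT, so consecutive selectors are exactly
   64 KiB apart - the protected-mode analogue of adding 0x1000 to a
   real-mode segment. */
uint16_t tile_data_selector(uint32_t linear)
{
    uint16_t index = (uint16_t)(linear >> 16); /* which 64 KiB tile */
    return (uint16_t)((index << 3) | 0x3);     /* TI=0 (GDT), RPL=3 */
}
```

Tiling 16 MiB this way takes 256 data selectors (plus the paired code selectors), comfortably inside the 8192 entries each of the GDT and LDT can hold.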

Note that medium/compact/large memory model C-generated code does not manipulate segments at all (not in normal C90-compliant code, anyway), so it is down to the OS (PDOS/286) to handle this properly as we transition from the microemacs RM16 executable only being able to edit files 640 KiB in size to suddenly being able to edit 16 MiB files. (obviously adjust for overhead).

So my question here is - can the MZ executable format cope with my purposes, or will I be forced to use NE, or at least an MZ+ because I need something more? If I need an MZ+, can you suggest a way to do that without disrupting existing use of MZ? I believe MZ was designed to allow extra fields to be added.

Now switching topics to huge memory model ...
Microsoft's implementation for the huge memory model in Windows involves a special variable set by the loader. When performing arithmetic on a huge pointer, programs use that variable to adjust the segment portion of the huge pointer. In real mode, the variable is 0x1000. In protected mode, it's something like 0x8 or 0x10.
By "loader" do you mean "linker"? Or both? I've seen that code (AHSHIFT/AHINCR) and don't know how it works. Does the linker set a special field in the executable to say "this needs to be zapped at runtime" and the default value is set to 0x1000 so that MSDOS works by default, but then a PM16 OS (OS/2 1.0) will inspect that field and zap it so that it works?
That doesn't follow Intel's rules of treating segments as completely opaque, but Microsoft got away with it because the 286 is older than Windows.
If I understand you correctly - Microsoft knew what the actual 80286 was, so they knew that they could use an algorithm that would work on at least the 8086 and 80286. But really they should have done a function call to manipulate a huge pointer, as Watcom does?

Note that Turbo C also does a function call, but you won't get that function call at all unless you explicitly use the "huge" keyword - simply using huge memory model doesn't make all the pointers huge (unlike both Microsoft and Watcom).

I even bought Borland C 5.0 a couple of weeks ago or whatever to prove that this was never fixed/changed.
kerravon wrote:I want exactly what you said above - follow Intel's rules and the definition of a selector suddenly changes, and your application doesn't care at all. Instead, your application says "whee - 16 MiB or 512 MiB instead of 640 KiB".
Huge pointer arithmetic is very simple if you follow Intel's rules: make the OS do it.

Unfortunately, I think that means you still need at least a little bit of conditional execution in your programs, since MS-DOS doesn't have a huge pointer arithmetic API.
I don't mind having some conditional execution for later versions of MSDOS, e.g. a theoretical MSDOS version 27.0 may have finally added an API to manipulate huge pointers. So in my MSDOS startup executable I can test for the existence of that API (rather than the version number), and then either call that API or assume 4-bit segment shifts of the 8086.

And because there won't be an MSDOS 27.0 (very unlikely, anyway), I can instead negotiate with the Freedos people to get that API added (or fork it, or simply make it PDOS/86-only).

Anything you would suggest with the benefit of hindsight? Something for the 1980s timeframe. I basically wish to recontest that. In fact I want to recontest the 1970s with CP/M and the 1960s mainframes.

BTW (switching topics, unrelated), my latest plan is to combine UC8086, UC386 and UCX64 into a single distribution with 3 sets of application (hexdump, pdas etc) executables, and my io.sys boot loader will detect what the CPU is and load PDOS16/32/64.SYS as appropriate (for legacy boot). On a UEFI-only system it would be bootx64.efi that gets executed, but both 64-bit versions of the PDOS operating system would execute the same (Win64) executables. The 32-bit version executes Win32 executables. And I'm thinking it should be possible to make PDOS/16 use the Win32/64 API too, including msvcrt.dll. But I am not going to have a real msvcrt.dll - it will instead be embedded in the OS (as it already is for UCX64).

This would be a deviation from normal MSDOS executables though - but hopefully I can support both (just as PDOS/386 supports two different APIs).
Octocontrabass
Member
Posts: 5488
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

kerravon wrote:I want to be able to write my 80386 C code in 1986, compile it, lose the source code, and have my executable continue to run on every processor subsequent to the 80386, including the x64 in LM64.
That's what compatibility mode is for. The same way 16-bit code must run in 16-bit mode under a 32-bit OS, 32-bit code must run in 32-bit compatibility mode under a 64-bit OS.
kerravon wrote:Just a set of rules (road map) from AMD (in 1986) telling me what code my C compiler should generate to future-proof it.
There are no rules. All code that would run in 32-bit protected mode in 1986 will run in 32-bit compatibility mode today.
kerravon wrote:Or if not AMD, then one of these professors/academics in some university could have seen the writing on the wall about a theoretical upgrade to 64 bit? ie we already had history of what happened from 8 to 16 bit, and 16 to 32. Does computer science not tell you what happens in the transition from 32 to 64?
On x86, the transition from 16-bit to 32-bit required a 32-bit OS to run 16-bit programs in 16-bit mode. It would have been very easy to predict that the transition from 32-bit to 64-bit would require a 64-bit OS to run 32-bit programs in some kind of 32-bit mode.
kerravon wrote:In hindsight, what should have been done at the time of the 8086?
Reserving opcodes for future expansion sounds nice, but it would have made 8086 programs bigger and slower, and future expansion wasn't much of a design goal when the i432 was supposed to replace the 8086. In hindsight, Intel should have given up on the i432 and figured out an MMU for the 8086.
kerravon wrote:And I'm talking about the linker doing some work. If there is a.obj, b.obj and c.obj, all 30k in size, then what I believe needs to be done is for a.obj and b.obj to be contiguous, then 4k of NULs, then c.obj.
That sounds extremely wasteful. I know 4kB isn't much on modern PCs, but it adds up quick on ancient PCs.
kerravon wrote:I'm trying to minimize the changes required to the MZ executable format.
I'm not sure any changes to the format are required. As long as there's a relocation entry for every segment reference in the binary, your loader can replace the existing segment values with protected mode segment selectors. I suppose you could add some kind of signature to differentiate your PDOS-compatible binaries from ones that don't follow the PDOS API, in case you want to avoid loading an incompatible program.

Since your loader knows exactly which segment bases the program expects, there's no need to add any padding to the binary.
kerravon wrote:PDOS/286 will tile the entire 16 MiB (or more - it won't be tied to just the actual 80286) up to the availability of selectors (both GDT and LDT will be used).
On the 286, you have more than enough selectors for some of them to be less than 64kB apart to fit whichever segment bases the program expects. On the 386, you can run a 32-bit program that isn't limited by 16-bit addressing.
kerravon wrote:By "loader" do you mean "linker"? Or both?
The variable is set by the loader, but I assume the linker has to set things up correctly so the loader can find the variable. I'm not an expert on 16-bit executable formats.
kerravon wrote:I've seen that code (AHSHIFT/AHINCR) and don't know how it works. Does the linker set a special field in the executable to say "this needs to be zapped at runtime" and the default value is set to 0x1000 so that MSDOS works by default, but then a PM16 OS (OS/2 1.0) will inspect that field and zap it so that it works?
I'm pretty sure it only works for NE executables.
kerravon wrote:If I understand you correctly - Microsoft knew what the actual 80286 was, so they knew that they could use an algorithm that would work on at least the 8086 and 80286. But really they should have done a function call to manipulate a huge pointer, as Watcom does?
Yes, but Microsoft also knew they could force Intel to maintain backwards compatibility.
kerravon wrote:I don't mind having some conditional execution for later versions of MSDOS, e.g. a theoretical MSDOS version 27.0 may have finally added an API to manipulate huge pointers. So in my MSDOS startup executable I can test for the existence of that API (rather than the version number), and then either call that API or assume 4-bit segment shifts of the 8086.
Microsoft never added a huge pointer API because they ended up supporting DPMI instead, and DPMI has a function you can call to get AHINCR. A huge pointer function call only makes sense if you're writing 8086 DOS software with no knowledge of the future 286.
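The AHINCR adjustment discussed above can be sketched like this (struct and function names are made up; 0x1000 and 0x8 are the real-mode and tiled-protected-mode increments mentioned in the thread):

```c
#include <stdint.h>

typedef struct { uint16_t seg; uint16_t off; } farptr;

/* Advance a huge pointer: any 64 KiB carried out of the offset becomes
   one AHINCR added to the segment part. In real mode AHINCR is 0x1000
   (segments are 16-byte paragraphs); with consecutive protected-mode
   selectors 64 KiB apart it is 0x8 (one descriptor table entry). */
farptr huge_add(farptr p, uint32_t delta, uint16_t ahincr)
{
    uint32_t off = (uint32_t)p.off + delta;
    p.seg = (uint16_t)(p.seg + (off >> 16) * ahincr);
    p.off = (uint16_t)off;
    return p;
}
```

The same arithmetic works in both modes; only the loader-supplied (or DPMI-supplied) AHINCR value changes, which is why zapping one variable is enough.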
rdos
Member
Posts: 3264
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

Octocontrabass wrote: There are no rules. All code that would run in 32-bit protected mode in 1986 will run in 32-bit compatibility mode today.
Only if it doesn't use callgates and only in userspace. That's part of the problem with the design. While the 16-bit to 32-bit transition was fully interoperable, this is not the case with compatibility mode.

Actually, the transition from real mode to V86 mode was far better and fully interoperable. To catch IO accesses in V86 mode, the IO permission bitmap was added. To emulate the interrupt flags, some instructions cause faults in V86 mode so they could be monitored and possibly emulated. It was possible to call real-mode code provided buffers resided in the first 1MB. The higher parts of the 32-bit registers were not clobbered by V86 mode.

Much of this doesn't work in compatibility mode. First, the IO permission map was discarded. Second, only 32-bit user mode code was supported, and long mode actually cannot call a function in compatibility mode without saving all registers and manipulating returns. Call gates don't work either.
Octocontrabass
Member
Posts: 5488
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

rdos wrote:Only if it doesn't use callgates
Call gates still work in compatibility mode.
rdos wrote:and only in userspace.
A 64-bit OS does not need any 32-bit code outside of userspace.
rdos wrote:First, the IO permission map was discarded.
The I/O permission bit map still exists in long mode.
rdos wrote:Second, only 32-bit user mode code was supported,
Compatibility mode supports 16-bit user mode code.
rdos wrote:and long mode actually cannot call a function in compatibility mode without saving all registers and manipulating with returns.
That's no big deal. If you really dislike it that much, don't let code in long mode call code in compatibility mode.
rdos wrote:Call gates don't work either.
Call gates still work.

I think you should read the Intel or AMD manuals.
nullplan
Member
Posts: 1760
Joined: Wed Aug 30, 2017 8:24 am

Re: running 32-bit code in LM64

Post by nullplan »

Octocontrabass wrote:Call gates still work in compatibility mode.
They also work in 64-bit mode. However, the direct far call instruction is no longer supported, so you will need to use the indirect one. As I recall, rdos was quite proud of his approach to system calls where he live patches the syscall number into the binary, and with indirect far calls this gets significantly more complicated. I actually once played with the idea of using call gates for system calls, and just the impossibility of backing up the instruction pointer for syscall restarts made me drop the idea. When the far call is executed, the OS only knows the address after the far call instruction, and finding out where it starts is as difficult as reading an x86 instruction stream backwards. Which is impossible in general.
Carpe diem!
Octocontrabass
Member
Posts: 5488
Joined: Mon Mar 25, 2013 7:01 pm

Re: running 32-bit code in LM64

Post by Octocontrabass »

nullplan wrote:As I recall, rdos was quite proud of his approach to system calls where he live patches the syscall number into the binary, and with indirect far calls this gets significantly more complicated.
More complicated, but not impossible. But segmentation doesn't work in 64-bit mode, so there's no real reason to use call gates instead of the SYSCALL instruction.
rdos
Member
Posts: 3264
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

Ok, callgates are supported, but they are not usable. First, they consume two GDT entries. Second, the target selector must be a 64-bit selector, and so it's not possible to link to a 32-bit server anyway.

I think I would rather patch the code to push a gate number and do a syscall. The syscall entry point then will need to inspect the call stack of the user process and do a jmp through a gate table.
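The kernel side of that scheme might look roughly like this (all names hypothetical; a real entry point would read the gate number off the user stack after SYSCALL before indirecting):

```c
#include <stdint.h>

typedef long (*gate_fn)(void);

/* Example gate: a kernel service reachable through the table. */
static long gate_hello(void)
{
    return 42;
}

/* Dispatch a syscall by gate number: bounds-check, then jump through
   the gate table. */
long syscall_dispatch(const gate_fn *gate_table, uint32_t n_gates,
                      uint32_t gate_number)
{
    if (gate_number >= n_gates)
        return -1;                     /* reject bad gate numbers */
    return gate_table[gate_number]();
}
```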
Last edited by rdos on Mon Sep 25, 2023 1:24 pm, edited 1 time in total.
rdos
Member
Posts: 3264
Joined: Wed Oct 01, 2008 1:55 pm

Re: running 32-bit code in LM64

Post by rdos »

nullplan wrote:
Octocontrabass wrote:Call gates still work in compatibility mode.
They also work in 64-bit mode. However, the direct far call instruction is no longer supported, so you will need to use the indirect one. As I recall, rdos was quite proud of his approach to system calls where he live patches the syscall number into the binary, and with indirect far calls this gets significantly more complicated. I actually once played with the idea of using call gates for system calls, and just the impossibility of backing up the instruction pointer for syscall restarts made me drop the idea. When the far call is executed, the OS only knows the address after the far call instruction, and finding out where it starts is as difficult as reading an x86 instruction stream backwards. Which is impossible in general.
I use a call to 0001:gate_number and 0002:gate_number. This will raise a protection fault, and the saved instruction pointer points to the start of the faulting instruction, so it can be restarted. Thus, no need to trace backwards.
kerravon
Member
Posts: 277
Joined: Fri Nov 17, 2006 5:26 am

Re: running 32-bit code in LM64

Post by kerravon »

Octocontrabass wrote:
kerravon wrote:I want to be able to write my 80386 C code in 1986, compile it, lose the source code, and have my executable continue to run on every processor subsequent to the 80386, including the x64 in LM64.
That's what compatibility mode is for. The same way 16-bit code must run in 16-bit mode under a 32-bit OS, 32-bit code must run in 32-bit compatibility mode under a 64-bit OS.
No. I don't want that. I want a simple 64-bit processor.

With regard to 16-bit code requiring a special mode - I don't mind that for code that uses segments - that's a very different model. But I do expect pure non-segmented 16-bit (ie tiny memory model) code to run in PM32. It would need to reside in the first 64k, and the next 64k would need to be mapped to the first 64k to handle negative indexes.
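The aliasing being proposed can be sketched as an address translation (function name is mine):

```c
#include <stdint.h>

/* The second 64 KiB of the address space aliases the first, so
   tiny-model pointer arithmetic that wraps in 16 bits (e.g. a negative
   index computed as a large unsigned offset) still reaches the byte it
   expects when evaluated with full 32-bit addressing. */
uint32_t tiny_model_phys(uint32_t linear)
{
    if (linear >= 0x10000u && linear < 0x20000u)
        return linear - 0x10000u;      /* alias onto the first 64 KiB */
    return linear;
}
```

With the alias in place, a base of 4 plus a 16-bit offset of 0xFFFF (intended as -1) lands at byte 3, ie base - 1, instead of 64 KiB too high.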
kerravon wrote:Just a set of rules (road map) from AMD (in 1986) telling me what code my C compiler should generate to future-proof it.
There are no rules. All code that would run in 32-bit protected mode in 1986 will run in 32-bit compatibility mode today.
Until CM32 is dropped. As V8086 was dropped. Luckily I never used V8086.

And even if you insist CM32 will never be dropped - *I* want to be able to drop it.
kerravon wrote:Or if not AMD, then one of these professors/academics in some university could have seen the writing on the wall about a theoretical upgrade to 64 bit? ie we already had history of what happened from 8 to 16 bit, and 16 to 32. Does computer science not tell you what happens in the transition from 32 to 64?
On x86, the transition from 16-bit to 32-bit required a 32-bit OS to run 16-bit programs in 16-bit mode. It would have been very easy to predict that the transition from 32-bit to 64-bit would require a 64-bit OS to run 32-bit programs in some kind of 32-bit mode.
MIPS didn't. The transition from S360/67 to z/Arch didn't.

Basically this is relying on hardware engineers to create a special mode because the software can't cope with registers and addressing increasing in size. So basically two CPUs in one. Because no-one could figure out how to do it with just one.
kerravon wrote:In hindsight, what should have been done at the time of the 8086?
Reserving opcodes for future expansion sounds nice, but it would have made 8086 programs bigger and slower, and future expansion wasn't much of a design goal when the i432 was supposed to replace the 8086. In hindsight, Intel should have given up on the i432 and figured out an MMU for the 8086.
Ok, what about reserving just two opcodes? One says "what follows is a new 32-bit instruction" and the other says "what follows is a new 64-bit instruction"? Or, let's say x'40' was chosen as the 32-bit opcode. Two x'40' in a row says 64-bit, 3 says 128-bit etc.
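The proposed escape-prefix encoding could be decoded like this (purely hypothetical; 0x40 is the arbitrarily chosen escape byte from the paragraph above, and each repetition doubles the operand width):

```c
#include <stddef.h>
#include <stdint.h>

/* Count leading escape bytes to determine the operand width of the
   instruction that follows: none = 16-bit, one = 32-bit, two = 64-bit,
   three = 128-bit, etc. Returns the width in bits and reports how many
   prefix bytes were consumed. */
int operand_bits(const uint8_t *insn, size_t len, size_t *prefixes)
{
    int bits = 16;                     /* base 8086 width */
    size_t n = 0;
    while (n < len && insn[n] == 0x40) {
        bits *= 2;
        n++;
    }
    *prefixes = n;
    return bits;
}
```

(Amusingly, AMD64's real REX prefixes occupy 0x40-0x4F, so the idea of spending that byte range on width extension is roughly what eventually happened, just without the forward planning.)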
kerravon wrote:And I'm talking about the linker doing some work. If there is a.obj, b.obj and c.obj, all 30k in size, then what I believe needs to be done is for a.obj and b.obj to be contiguous, then 4k of NULs, then c.obj.
That sounds extremely wasteful. I know 4kB isn't much on modern PCs, but it adds up quick on ancient PCs.
I'm not trying to optimize my 1980s software for the 8086. I'm trying to future-proof it. Regardless, if you have 90k of executable code, I don't consider 4k extra to be a lot anyway.
kerravon wrote:I'm trying to minimize the changes required to the MZ executable format.
I'm not sure any changes to the format are required. As long as there's a relocation entry for every segment reference in the binary, your loader can replace the existing segment values with protected mode segment selectors. I suppose you could add some kind of signature to differentiate your PDOS-compatible binaries from ones that don't follow the PDOS API, in case you want to avoid loading an incompatible program.

Since your loader knows exactly which segment bases the program expects, there's no need to add any padding to the binary.
I don't need any special marking for PDOS - happy for things to blow up if people mismatch them. I'm not sure what you mean by "segment bases the program expects". Regardless, I may be running on a Turbo 186 with 8-bit segment shifts instead of 4. A similar situation happens with PM16.

So, if code in a.obj calls code in b.obj, and the distance is only 50 bytes, then a near call will be made (I believe). But that distance is dependent on b.obj not moving. On PM16 I need to move b.obj because a new selector is used, and the new selector will translate into a higher distance from a.obj.

If PM16 had enough selectors to map 4 GiB (and there is nothing preventing a rival processor from doing exactly that - or using a 16-bit shift instead of the 8086's 4-bit or the Turbo 186's 8-bit), then each selector would be on a fresh 64k boundary.
kerravon wrote:PDOS/286 will tile the entire 16 MiB (or more - it won't be tied to just the actual 80286) up to the availability of selectors (both GDT and LDT will be used).
On the 286, you have more than enough selectors for some of them to be less than 64kB apart to fit whichever segment bases the program expects.
I don't want my program to dictate the distance between selectors, and I want my OS to tile the memory up to the maximum memory available, and have programs gracefully accept that.
On the 386, you can run a 32-bit program that isn't limited by 16-bit addressing.
I want to write microemacs for 16-bit MSDOS, and compile it with appropriate tools, lose the source code, then in 2024 run the executable in PM16 mode (I don't mind a mode switch here) on the latest AMD chip and be able to edit 4 GiB files.
kerravon wrote:I don't mind having some conditional execution for later versions of MSDOS, e.g. a theoretical MSDOS version 27.0 may have finally added an API to manipulate huge pointers. So in my MSDOS startup executable I can test for the existence of that API (rather than the version number), and then either call that API or assume 4-bit segment shifts of the 8086.
Microsoft never added a huge pointer API because they ended up supporting DPMI instead, and DPMI has a function you can call to get AHINCR. A huge pointer function call only makes sense if you're writing 8086 DOS software with no knowledge of the future 286.
I see. And the DPMI call is supported even on an 8086, so that's not an issue, right? And AHINCR is good enough to cover the Turbo 186 too, right? Sounds like I should be switching to this model. And support the appropriate DPMI call in PDOS/86 so that my apps built with Watcom C can use it. Actually - I wouldn't even need to make that call. I just need to start producing NE executables or whatever is required to get AHINCR zapped at load time.

That's probably it actually. Rather than add 4k of padding between a.obj and b.obj I probably need to switch to NE format. I don't know anything about it though. I potentially need just a subset of NE.

Or - can MZ be extended since I probably don't need a lot? Well - some sort of markup to avoid the 4k padding would be "a lot". Also the same consideration happens when trying to load DGROUP into ds. The current value that is loaded assumes a particular distance with 4-bit shift segments.

And switching to NE will solve another problem - I am interested in using the Win32 API (which is the same as Win64 - and I'm not sure about Win16) to have an msvcrt.dll on PDOS/86 so that I don't have to statically link the entire C library. So I will probably be breaking the compatibility anyway (although I can probably support both APIs anyway).

Note that the msvcrt.dll I use in PDOS (32 and 64) is built from PDPCLIB.