C Compiler for Unreal Mode


C Compiler for Unreal Mode

Post by alexfru »

I just implemented support for unreal mode in my Smaller C compiler.

Naturally, unreal .EXEs should be loadable by my Bootprog boot sector for FAT12/16/32.

c0du.asm will set up unreal mode for you, but A20 is your responsibility.

(And there's a DOS version of the C library for unreal mode too, if anyone cares.)

Techel

Post by Techel »

Always nice to see your project progressing!

Re: Techel

Post by alexfru »

Techel wrote: Always nice to see your project progressing!
Thanks! I'd like to hear more user reports, though. :)

Re: C Compiler for Unreal Mode

Post by BenLunt »

Hi Alex,

I have a few questions, and I know you know where I am coming from here :-). For the rest of you, I have created a modified version of Alex's compiler, with his permission and my gratitude, and added my own version of unreal mode to it. http://www.fysnet.net/nbc.htm

I then use this compiler to build my loader file from several .c files. It works quite well. However, and Alex this is the reason for my questions, I have an issue that comes up when I have code or data past the 1Meg mark and the BIOS gets called.

For example, if my ESP value is greater than 0xFFFF, some BIOSes will fail. I say some, because the BIOS should declare its own stack and restore my stack on return. However, some BIOSes use the "leave" instruction which messes up the EBP register (IIRC). Also, if I have code past the cs:0xFFFF mark and my Code Selector is BIG (uses EIP, not IP), the BIOS will/may cause a fault.

So, my question is, how do you handle the calling of the BIOS and still maintain an unreal mode environment? Can any selector be BIG and/or can any have a base/limit past the 1Meg mark?

For now, I have a wrapper that I call when I do any BIOS calls. It sets up a small stack within the 0xFFFF:FFFF limit, restoring my original stack (above 4Meg) on return. However, when a key is pressed, the BIOS's IRQ handler is called, *not* my wrapper. (Yes, I could create my own IRQ handlers, but where do you draw the line between a loader and an actual OS? The loader has to stop somewhere...)

I am just wondering if SmallerC takes this into account or is it up to the coder to do this? Does SmallerC just set up a 4gig flat address space for all selectors, leaving the rest to the coder? Does SmallerC make the selectors BIG or small (use 32-bit addressing or 16-bit addressing)?

Thanks,
Ben

Re: C Compiler for Unreal Mode

Post by alexfru »

BenLunt wrote:...
I have an issue that comes up when I have code or data past the 1Meg mark and the BIOS gets called.

For example, if my ESP value is greater than 0xFFFF, some BIOSes will fail. I say some, because the BIOS should declare its own stack and restore my stack on return.
I guess, before we continue, I should say explicitly that my implementation assumes that the program's code and stack reside below 1MB (640KB in most practical cases). CS:IP and SS:SP are set up and handled the way you'd normally do it in real mode; none of the code or stack segments is 32-bit in any way (location, size or attributes). All of that is to coexist with BIOS and DOS as amicably as possible.

Consider this C code:

Code:

int neg(int a)
{
  return -a;
}

extern void nada(int* p);

void zilch(void)
{
  int dummy;
  nada(&dummy);
}
In both huge and unreal modes you now get:

Code:

bits 16

section .text
	global	_neg
_neg:
	push	ebp
	movzx	ebp, sp
	mov	eax, [bp+8]
	neg	eax
	db	0x66
	leave
	retf

section .text
	global	_zilch
_zilch:
	push	ebp
	movzx	ebp, sp
	sub	sp, 4
	xor	eax, eax
	mov	ax, ss
	shl	eax, 4
	lea	eax, [ebp+eax-4]
	push	eax
	db	0x9A ; call far seg:sel
section .relot
	dd	L6 ; relocation
section .text
L6:
	dd	_nada ; what to relocate and transform into seg:sel
	sub	sp, -4
	db	0x66
	leave
	retf
ESP>SP should not be a problem for the code generated by the compiler. If, OTOH, some BIOS doesn't like ESP>SP, you shouldn't force it to. It's not something that my (version of the) compiler does to upset such a BIOS. If it's how you set up yours, you probably need to redesign it. Like I said, having the most conventional setup (code & stack below 1MB) is a good way to keep BIOS and DOS happy.
BenLunt wrote: However, some BIOSes use the "leave" instruction which messes up the EBP register (IIRC).
Corrupted (E)BP by BIOS is a problem. But it's not something the compiler proper should deal with, IMO. I mean, if you invoke int 0x10/0x13/whatever and it screws up (E)BP, while it's bad, it's controllable and you can preserve EBP manually and you know where, when and how to.

If, OTOH, some BIOS IRQ ISR does that completely unexpectedly to the code that it preempts, I'd say fnck that BIOS, it's gone too far. But if you really really want things to work with the screwy BIOS, you have a few options...
1. Hook those ISRs and save/restore the full EBP.
2. Enable and disable interrupts in a way that when they're enabled you don't care about EBP going corrupted. Enable them for periods of time. What is your code doing when it's doing nothing and just waiting for keyboard input?
3. Have an option in the compiler to use just BP, much like I now only use SP in 16-bit-ish modes (I'm referring to the movzx ebp, sp above):

Change

Code:

	xor	eax, eax
	mov	ax, ss
	shl	eax, 4
	lea	eax, [ebp+eax-4]
to

Code:

	xor	eax, eax
	mov	ax, ss
	shl	eax, 4
	movzx	esi, bp
	lea	eax, [esi+eax-4]
ESI and EDI can be used as very short-term (or should that be instantaneous like instant coffee or smth?) temporaries.
BenLunt wrote: Also, if I have code past the cs:0xFFFF mark and my Code Selector is BIG (uses EIP, not IP), the BIOS will/may cause a fault.
Ouch. I'm afraid you need to see a different specialist for that; I don't do that kind of unreal mode.
BenLunt wrote:...
So, my question is, how do you handle the calling of the BIOS and still maintain an unreal mode environment? Can any selector be BIG and/or can any have a base/limit past the 1Meg mark?

For now, I have a wrapper that I call when I do any BIOS calls. It sets up a small stack within the 0xFFFF:FFFF limit, restoring my original stack (above 4Meg) on return. However, when a key is pressed, the BIOS's IRQ handler is called, *not* my wrapper. (Yes, I could create my own IRQ handlers, but where do you draw the line between a loader and an actual OS? The loader has to stop somewhere...)

I am just wondering if SmallerC takes this into account or is it up to the coder to do this? Does SmallerC just set up a 4gig flat address space for all selectors, leaving the rest to the coder? Does SmallerC make the selectors BIG or small (use 32-bit addressing or 16-bit addressing)?
1. Like I said, code and stack below 1MB (~640KB really). Is that not enough memory? For a loader? AFAIR, you require 96MB for the OS, so, there's another 95 for all loadable data and 32-bit code.
2. No funny business with CS:EIP, SS:ESP. All orthodox 16-bit here.
3. DS, ES, FS and GS (last two not necessary) have 4GB limits. Generated code assumes DS=ES=0. The library is in charge of maintaining zeroes there.
4. I have an ISR for IRQ5/#GP, which I use to restore unreal mode in case it somehow gets disabled. It may be fragile if you run arbitrary code (via system()) and that code may disable unreal mode and/or change IRQ5/#GP vector and not restore it and stuff like that, but it is what it is.

I really think you should shuffle things around a bit and not stretch unreal mode too much. I mean 16-bit code or stack above 1MB.

I'm still not sure what exactly the problem is with corrupted (E)BP/LEAVE. I'd like more understanding here.

Re: C Compiler for Unreal Mode

Post by BenLunt »

alexfru wrote:I really think you should shuffle things around a bit and not stretch unreal mode too much. I mean 16-bit code or stack above 1MB.
I do need to do some revamping of the code. Everything works as expected at the moment, but a (somewhat) re-write may be in order so that I can get past the stack issue and the EIP > 0xFFFF issue.
alexfru wrote: I'm still not sure what exactly the problem is with corrupted (E)BP/LEAVE. I'd like more understanding here.
A recent poster in another thread led me to http://www.os2museum.com/wp/if-you-ente ... not-leave/ where it states:
Intel claims that in 16-bit code (i.e. default operand size being 16 bits) with a 32-bit stack, ENTER will move ESP to EBP. Since the LEAVE instruction moves EBP back to ESP, that makes sense. Sadly, while the LEAVE instruction really does move EBP to ESP (when running with a 32-bit stack), the ENTER instruction only updates BP.
VirtualBox's BIOS uses ENTER/LEAVE, while every other emulator and real machine I have does not.

This is the current issue I am having.

Thanks for the explanation of what SmallerC does. I was just wondering if it took advantage of unreal mode and more than 1Meg of RAM.

Thanks,
Ben

Re: C Compiler for Unreal Mode

Post by alexfru »

BenLunt wrote:
alexfru wrote: I'm still not sure what exactly the problem is with corrupted (E)BP/LEAVE. I'd like more understanding here.
A recent poster in another thread led me to http://www.os2museum.com/wp/if-you-ente ... not-leave/ where it states:
Intel claims that in 16-bit code (i.e. default operand size being 16 bits) with a 32-bit stack, ENTER will move ESP to EBP. Since the LEAVE instruction moves EBP back to ESP, that makes sense. Sadly, while the LEAVE instruction really does move EBP to ESP (when running with a 32-bit stack), the ENTER instruction only updates BP.
VirtualBox's BIOS uses ENTER/LEAVE, while every other emulator and real machine I have does not.

This is the current issue I am having.
OK, got it now. It's not just about how many bits are transferred between EBP and ESP with those two instructions and whether they increment/decrement SP or ESP. If you look at the pseudo code for ENTER and LEAVE, you can see that they push/pop a different number of bytes of EBP depending on whether the stack is 32-bit or 16-bit and therefore the stack contents and layouts differ. So, the big bit in the stack segment descriptor has more exposure in your setup than you can afford because you don't control the instructions used in BIOSes. You now have a strong argument against what you're currently doing in your loader.
BenLunt wrote: Thanks for the explanation of what SmallerC does. I was just wondering if it took advantage of unreal mode and more than 1Meg of RAM.
Just data (e.g. VESA LFB), not code or stack, sorry.