
Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Fri Jun 15, 2018 3:07 am
by simeonz
Brendan wrote:My approach is for the micro-kernel to ask a motherboard driver what to do during boot ("always ignore", "always kernel panic", "ask motherboard driver what to do with each NMI"); where if the motherboard driver selects the last option and an NMI happens the motherboard driver can do anything it likes (including nothing) before telling the kernel to ignore the NMI, or the motherboard driver can just tell the kernel to panic for that NMI (and can provide a more specific reason).
Ok. Thanks. For an enthusiast-level OS without third-party support, I suppose the options for what to do (log, panic, ignore) can be exposed to the user and set separately for each known NMI type in configuration somewhere, like what Linux does. The OS could also print a message on the screen and ask the user whether they are willing to take their chances and continue working.

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Fri Jun 15, 2018 8:55 am
by Korona
One (the only, if MCE is present?) important use case of NMIs is watchdog timers - using NMIs your system is able to report progress even if it has to block IRQs for extended periods of time. That is a very useful feature once you start debugging SMP lockups - it has saved me numerous long debugging sessions.

In this context, it would be interesting to know how CLI performs compared to MOV CR8. If the MOV is sufficiently fast to replace most uses of CLI in the kernel, this feature could be implemented, with less hassle, using a "soft NMI" mechanism similar to what Brendan suggested.
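For reference, the watchdog idea can be sketched on the host in a few lines of C. This is only a model under assumptions: the struct, the function names and the STALL_LIMIT threshold are all made up for illustration, and a real kernel would keep this state per CPU and call the check from its actual NMI handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU watchdog state; every name here is illustrative. */
struct watchdog {
    uint32_t heartbeat;     /* bumped by the ordinary (maskable) timer IRQ */
    uint32_t last_seen;     /* heartbeat value observed at the previous NMI */
    uint32_t stalled_nmis;  /* consecutive NMIs that saw no progress */
};

#define STALL_LIMIT 3u      /* assumed threshold, not taken from the thread */

/* Called from the normal timer interrupt; stops running while IRQs are blocked. */
static inline void timer_irq_tick(struct watchdog *wd)
{
    assert(wd != NULL);
    wd->heartbeat++;
}

/* Called from the NMI handler; returns true when the CPU looks locked up. */
static inline bool nmi_watchdog_check(struct watchdog *wd)
{
    assert(wd != NULL);
    if (wd->heartbeat != wd->last_seen) {
        /* The timer IRQ ran since the last NMI, so the CPU is making progress. */
        wd->last_seen = wd->heartbeat;
        wd->stalled_nmis = 0;
        return false;
    }
    return ++wd->stalled_nmis >= STALL_LIMIT;
}
```

Because the check runs from an NMI, it still fires while the timer IRQ is masked, which is what lets it report a lockup instead of hanging silently.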

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Fri Jun 15, 2018 9:15 am
by rwosdev
Any ideas why this is getting messed up? Some type of page fault is happening. Memory constants prefixed with SCM_ are globally mapped in the kernel's page directory and in every process's page directory.

All emulators crash and Bochs gives me this:

Code: Select all

00049502853i[CPU0  ] | EAX=83e58955  EBX=00004004  ECX=00000000  EDX=00000000
00049502853i[CPU0  ] | ESP=ffc01f88  EBP=00d92000  ESI=00000000  EDI=00000000
00049502853i[CPU0  ] | IOPL=0 id vip vif ac vm RF nt of df if tf sf ZF af PF cf
00049502853i[CPU0  ] | SEG sltr(index|ti|rpl)     base    limit G D
00049502853i[CPU0  ] |  CS:0008( 0001| 0|  0) 00000000 ffffffff 1 1
00049502853i[CPU0  ] |  DS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00049502853i[CPU0  ] |  SS:0010( 0002| 0|  0) 00000000 ffffffff 1 1
00049502853i[CPU0  ] |  ES:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00049502853i[CPU0  ] |  FS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00049502853i[CPU0  ] |  GS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00049502853i[CPU0  ] | EIP=ffc04068 (ffc04068)
00049502853i[CPU0  ] | CR0=0xe0000031 CR2=0x00507848
00049502853i[CPU0  ] | CR3=0x00d86000 CR4=0x00000000
00049502853i[CPU0  ] 0x00000000ffc04068>> iret  : CF
00049502853p[CPU0  ] >>PANIC<< exception(): 3rd (14) exception with no resolution

Code: Select all

; This code is globally mapped to the address SCM_TASK_START
align 0x1000
.startNewUserTask:
	mov ax, USER_DATASEG
	mov ds, ax
	mov es, ax
	mov fs, ax
	mov gs, ax
	
	; Copy stack to this task's kernel stack
	push KPROCESS_KSTACK
	push DWORD [ebx+KTask.owner]
	call DWORD [KernelGetInfoForProcess]
	mov edi, eax
	add edi, 0x1000 - StateInfo_size
	
	mov eax, esp
	push StateInfo_size
	push eax
	push edi
	call DWORD [_MemCopy]
	
	; Switch CR3 and switch stacks
	push KPROCESS_CR3
	push DWORD [ebx+KTask.owner]
	call DWORD [KernelGetInfoForProcess]
	mov edi, eax
	
	mov cr3, eax
	mov esp, SCM_KERNEL_STACK + 0x1000 - StateInfo_size
	mov eax, esp
	
	push USER_DATASEG
	push DWORD [eax+StateInfo.esp]
	push 0x200 ; No flags just interrupt enable
	push USER_CODESEG
	push DWORD [eax+StateInfo.eip]
	
	mov ebp, DWORD [eax+StateInfo.ebp]
	mov eax, DWORD [eax+StateInfo.eip]
	mov eax, DWORD [eax] ; Just to see value references correct data in Bochs, which it does
	
	xor ecx, ecx
	xor edx, edx
	xor esi, esi
	xor edi, edi
	
	iret
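For anyone following along, the five pushes before the iret build the frame the CPU pops when it returns to a less privileged ring. Below is a host-side sketch of that layout in C. The struct and helper names are hypothetical; only the field order and the 0x200 (IF) value come from the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_IF (1u << 9)  /* interrupt-enable flag: the 0x200 pushed above */

/* What IRET pops on a return to an outer ring, lowest address first.
 * The assembly pushes these in reverse order (SS first, EIP last). */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;   /* user-mode stack pointer */
    uint32_t ss;    /* user-mode stack segment */
};

/* Hypothetical helper: build a frame that enters ring 3 with IRQs enabled. */
static inline struct iret_frame make_user_frame(uint32_t entry, uint32_t user_esp,
                                                uint16_t user_cs, uint16_t user_ss)
{
    /* Both selectors must carry RPL 3 for a return to user mode. */
    assert((user_cs & 3u) == 3u && (user_ss & 3u) == 3u);
    struct iret_frame f = { entry, user_cs, EFLAGS_IF, user_esp, user_ss };
    return f;
}
```

A call such as make_user_frame(entry, user_esp, 0x1b, 0x23) would match the user selectors Bochs reports later in this thread (CS:001b, SS:0023).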

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sun Jun 17, 2018 6:52 am
by rwosdev
I rewrote the whole mapping system, but iret here now only seems to work when everything is mapped in.

I can't just map in the EXE, a user stack and all the globally mapped things, with non-present gaps between them; there has to be a 1:1 map from 0 to at least a bit past the end of the EXE image, or it doesn't work.

E.g. with the 0x6000-byte EXE loaded at 0xD7F000:

Code: Select all

	mov edx, DWORD [esi+PEOptionalHeader.ImageBase]
	add edx, DWORD [esi+PEOptionalHeader.SizeOfImage]
	add edx, 0x16000 ; Must be this or higher, or won't work. Why?
	push 1 ; User
	push edx ; Size
	push 0 ; Virt
	push 0 ; Phys
	push ebx ; Page Directory
	call MapAddressSpace
Why is that??? :?
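When chasing a question like this, it can help to work out exactly which page-directory and page-table entries a region needs. The C sketch below assumes classic 32-bit paging with 4 KiB pages and no PAE or PSE, which is consistent with CR4=0x00000000 in the Bochs dump. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Classic 32-bit paging: 10 bits of directory index, 10 bits of table index,
 * 12 bits of page offset. */
static inline uint32_t pd_index(uint32_t vaddr) { return vaddr >> 22; }
static inline uint32_t pt_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Number of 4 KiB pages touched by the byte range [start, start + size). */
static inline uint32_t pages_spanned(uint32_t start, uint32_t size)
{
    assert(size > 0);
    uint64_t first = start >> 12;
    /* 64-bit math so start + size cannot wrap around at 4 GiB. */
    uint64_t last = ((uint64_t)start + size - 1) >> 12;
    return (uint32_t)(last - first + 1);
}
```

For the image above, 0xD7F000 lands in page directory entry 3, page table entry 0x17F, and the 0x6000 bytes span 6 pages. Comparing that with what the debugger shows as present is one way to see which extra page the crash depends on.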

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sun Jun 17, 2018 8:35 am
by simeonz
rwosdev wrote:Why is that???
Are you setting the user stack pointer at the end of the stack block or at the start? I see that you are setting the kernel stack pointer correctly, but you may have messed it up for the user one.

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sun Jun 17, 2018 10:05 am
by rwosdev
I didn't include it in that code above but even when I link in the user stack (from the bottom) it still crashes.

So I map the kernel stack, user stack, interrupt handler stubs (which work) and the PE executable. Anything I'm missing?

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sun Jun 17, 2018 1:24 pm
by simeonz
rwosdev wrote:I didn't include it in that code above but even when I link in the user stack (from the bottom) it still crashes.

So I map the kernel stack, user stack, interrupt handler stubs (which work) and the PE executable. Anything I'm missing?
Well, I suppose, you have mapped the kernel code as well. Aside from that, nothing that I can think of at the moment.

But note that since you get a triple fault, the problem is not only in your user mapping. If it were, your page fault handler would have been called instead. But you get a double and then a triple fault, which means that your kernel mapping is problematic. You may consider keeping your entire kernel address space mapped for the time being, until you resolve the user mode issue.

In that regard, you could try to use the Bochs debugger. Check out the wiki here and see if it helps. There is an option for debugging triple faults there.

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sun Jun 17, 2018 2:36 pm
by rwosdev
I checked my mapping code. Doing this:

Code: Select all

	push 1
	push 0x1000
	push 0
	push 0
	push eax
	call MapAddressSpace
	
	push 1
	push 0xFFFFFFFF - 0x1000
	push 0x1000
	push 0x1000
	push eax
	call MapAddressSpace
Should have the same effect as mapping 0x0-0xFFFFFFFF directly, yet causes the user program to crash on launch, so as you suggested, there is at least a problem there. I'm going to write the mapping code in C instead of assembly and see if I can spot the actual issue. Usually the longer I spend on a problem the more pathetic it turns out to be :roll:
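One way to test that expectation away from the emulator is to model the mapper on the host and compare the results. The C sketch below is a toy, not the real MapAddressSpace: it uses a flat array with one entry per virtual page instead of a page directory plus page tables, sets only the present and user bits, and assumes the size is rounded up to whole pages. The argument order follows the pushes above (page directory, phys, virt, size, user).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   0x1000u
#define NUM_PAGES   0x100000u   /* 4 GiB / 4 KiB */
#define PTE_PRESENT 0x1u
#define PTE_USER    0x4u

/* Toy address space: one entry per virtual page (about 4 MiB, keep it static). */
struct toy_space { uint32_t pte[NUM_PAGES]; };

/* Hypothetical host-side model of MapAddressSpace(dir, phys, virt, size, user). */
static inline void toy_map(struct toy_space *s, uint32_t phys, uint32_t virt,
                           uint32_t size, int user)
{
    assert(phys % PAGE_SIZE == 0 && virt % PAGE_SIZE == 0);
    /* Round up in 64 bits: size + 0xFFF overflows 32 bits for sizes near 4 GiB. */
    uint64_t pages = ((uint64_t)size + PAGE_SIZE - 1) / PAGE_SIZE;
    uint32_t flags = PTE_PRESENT | (user ? PTE_USER : 0u);
    for (uint64_t i = 0; i < pages; i++) {
        uint64_t vpn = virt / PAGE_SIZE + i;
        if (vpn >= NUM_PAGES)
            break;  /* never walk past the top of the 4 GiB space */
        s->pte[vpn] = (uint32_t)(phys + i * PAGE_SIZE) | flags;
    }
}

/* True when two toy address spaces hold identical mappings. */
static inline int toy_same(const struct toy_space *a, const struct toy_space *b)
{
    return memcmp(a->pte, b->pte, sizeof a->pte) == 0;
}
```

In this model the single call and the two-call split produce identical tables. If the real mapper disagrees with it, 32-bit overflow in the rounding or in the end-address computation would be one place to look, though that is only a guess.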

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Mon Jun 18, 2018 4:52 am
by rwosdev
So I wrote it all in C to see where I was going wrong, made some changes and put them back in assembly and the mapping seems to work better now.

If I also identity map my GDT to every process, Bochs gives a different error:

Code: Select all

00059503701i[CPU0  ] | EAX=fbfbfbfb  EBX=00000000  ECX=00000000  EDX=00000000
00059503701i[CPU0  ] | ESP=00d9b000  EBP=00d9b000  ESI=00000000  EDI=00000000
00059503701i[CPU0  ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00059503701i[CPU0  ] | SEG sltr(index|ti|rpl)     base    limit G D
00059503701i[CPU0  ] |  CS:001b( 0003| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] |  DS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] |  SS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] |  ES:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] |  FS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] |  GS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059503701i[CPU0  ] | EIP=00d814b6 (00d814b6)
00059503701i[CPU0  ] | CR0=0xe0000031 CR2=0x00509040
00059503701i[CPU0  ] | CR3=0x00d8d000 CR4=0x00000000
00059503701i[CPU0  ] 0x0000000000d814b6>> jmp .-2 (0x00d814b6) : EBFE
00059503701p[CPU0  ] >>PANIC<< exception(): 3rd (14) exception with no resolution
The thread I'm trying to run is just jmp $, all the segments are now usermode and it makes no kernel accesses. Why am I still getting a fault?
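As a side note for readers of the dump, the "sltr(index|ti|rpl)" column is just the selector split into its three fields, so the ring level of each segment can be checked by hand. Here is a small C sketch of that split. The struct and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields packed into a 16-bit segment selector. */
struct selector_fields {
    unsigned index;  /* descriptor index within the GDT or LDT */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* requested privilege level, 0 to 3 */
};

/* Split a selector the same way the Bochs register dump displays it. */
static inline struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f = { sel >> 3, (sel >> 2) & 1u, sel & 3u };
    return f;
}

/* Inverse operation: pack the fields back into a selector value. */
static inline uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192u && ti <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

Decoding 0x001b gives index 3, GDT, RPL 3, and 0x0023 gives index 4, GDT, RPL 3, which agrees with the dump: every segment really is a ring 3 selector at the time of the fault.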

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Mon Jun 18, 2018 5:37 am
by simeonz
rwosdev wrote:If I also identity map my GDT to every process, Bochs gives a different error
Right. The GDT, IDT, and LDT must be mapped. The TSS must also be mapped.

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Mon Jun 18, 2018 7:11 am
by rwosdev
Okay, I've now mapped the IDT and TSS (don't have an LDT). Still crashes but CR2 is different on fault and the address is much closer to the executable.

Code: Select all

 EAX=00000000  EBX=00000000  ECX=00000000  EDX=00000000
00059813603i[CPU0  ] | ESP=00000000  EBP=00000000  ESI=00000000  EDI=00000000
00059813603i[CPU0  ] | IOPL=0 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00059813603i[CPU0  ] | SEG sltr(index|ti|rpl)     base    limit G D
00059813603i[CPU0  ] |  CS:001b( 0003| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] |  DS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] |  SS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] |  ES:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] |  FS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] |  GS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00059813603i[CPU0  ] | EIP=00000000 (00000000)
00059813603i[CPU0  ] | CR0=0xe0000031 CR2=0x00d80008
00059813603i[CPU0  ] | CR3=0x00d8f000 CR4=0x00000000
00059813603i[CPU0  ] 0x0000000000000000: (instruction unavailable) page not present
00059813603p[CPU0  ] >>PANIC<< exception(): 3rd (14) exception with no resolution
Anything else I'm missing? So far I've mapped interrupt stubs, user task start & resume, GDT, IDT, TSS, process's kernel stack and user stack

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Mon Jun 18, 2018 7:25 am
by Brendan
Hi,
rwosdev wrote:Okay, I've now mapped the IDT and TSS (don't have an LDT). Still crashes but CR2 is different on fault and the address is much closer to the executable.

Code: Select all

CR0=0xe0000031 CR2=0x00d80008
00059867040i[CPU0  ] | CR3=0x00d8f000 CR4=0x00000000
00059867040i[CPU0  ] 0x0000000000d834b6>> jmp .-2 (0x00d834b6) : EBFE
Anything else I'm missing? So far I've mapped interrupt stubs, user task start & resume, GDT, IDT, TSS, process's kernel stack and user stack
If you're doing a "jmp $" in user-space, then something must be interrupting it (an IRQ), and the page fault/s must be triggered by the CPU trying to access things it needs to start the interrupt handler/s. The things the CPU needs to start the interrupt handler/s are the TSS, the IDT, the GDT, then the kernel stack (pointed to by the "SS0:ESP0" fields of the TSS).

It shouldn't be too hard to put a breakpoint (e.g. the "xchg ebx,ebx" magic breakpoint) just before the "jmp $" and use the debugger to inspect the TSS, IDT, GDT (e.g. make sure they're mapped into the virtual address space properly, etc); and should be easy for you to figure out what the problem is.
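When inspecting the TSS in the debugger, it helps to know where SS0:ESP0 sit. The C sketch below lays out the 32-bit TSS with compile-time checks on the offsets. The field names and the helper are my own, and the stack address passed to it in any example is up to the caller; only the layout itself is fixed by the architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit Task State Segment. Selector fields occupy the low 16 bits of
 * their 32-bit slots; the upper halves are reserved. */
struct tss32 {
    uint32_t prev_task_link;
    uint32_t esp0;          /* stack pointer loaded on entry to ring 0 */
    uint32_t ss0;           /* stack segment loaded on entry to ring 0 */
    uint32_t esp1, ss1, esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt_selector;
    uint16_t trap;          /* bit 0 is the debug trap flag */
    uint16_t iomap_base;    /* offset of the I/O permission bitmap */
};

_Static_assert(offsetof(struct tss32, esp0) == 4, "ESP0 is at offset 4");
_Static_assert(offsetof(struct tss32, ss0) == 8, "SS0 is at offset 8");
_Static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");

/* Hypothetical helper: point the ring 0 stack fields at a kernel stack. */
static inline void tss_set_kernel_stack(struct tss32 *tss, uint16_t ss0, uint32_t esp0)
{
    assert(tss != NULL && (ss0 & 3u) == 0u);  /* SS0 must be a ring 0 selector */
    tss->ss0 = ss0;
    tss->esp0 = esp0;
    /* An I/O map base at or past the TSS limit means "no I/O bitmap". */
    tss->iomap_base = (uint16_t)sizeof(struct tss32);
}
```

So in a memory dump of the TSS, the second and third dwords are the ones to compare against the kernel stack that is mapped in the current address space.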

EDIT: I think you changed your post while I was typing... :oops:

From the latest Bochs log; you should be able to put a breakpoint just before the return to user-space and single step (while inspecting the stack) to determine why it ended up with "EIP=0x00000000".


Cheers,

Brendan

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Mon Jun 18, 2018 7:39 am
by rwosdev
Just got it! Issue was I wasn't mapping in the TSS per process, only once on kernel initialization.
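In case it helps anyone who hits the same thing, a check along these lines would catch a structure that is mapped in the kernel's address space but missing from a process's. It is a host-side sketch in C with a toy present-bitmap standing in for real page tables; the struct names, the functions and any region addresses fed to it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u
#define NUM_PAGES 0x100000u   /* 4 GiB / 4 KiB */

/* Toy address space: one "present" bit per virtual page. */
struct toy_space { uint8_t present[NUM_PAGES / 8]; };

/* A structure the CPU must reach to deliver an interrupt (GDT, IDT, TSS, ...). */
struct region { const char *name; uint32_t base; uint32_t size; };

/* Mark every page overlapping [base, base + size) as present. */
static inline void toy_mark(struct toy_space *s, uint32_t base, uint32_t size)
{
    assert(size > 0);
    uint64_t end = (uint64_t)base + size;
    for (uint64_t a = base & ~(PAGE_SIZE - 1); a < end; a += PAGE_SIZE)
        s->present[(a >> 12) / 8] |= (uint8_t)(1u << ((a >> 12) % 8));
}

static inline int toy_present(const struct toy_space *s, uint64_t vaddr)
{
    return (s->present[(vaddr >> 12) / 8] >> ((vaddr >> 12) % 8)) & 1;
}

/* Name of the first region with a non-present page, or NULL if all are mapped. */
static inline const char *first_unmapped(const struct toy_space *s,
                                         const struct region *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t end = (uint64_t)r[i].base + r[i].size;
        for (uint64_t a = r[i].base & ~(PAGE_SIZE - 1); a < end; a += PAGE_SIZE)
            if (!toy_present(s, a))
                return r[i].name;
    }
    return NULL;
}
```

Running the same region list against every process's address space right after it is created would flag a TSS that was only mapped once at kernel initialization.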

Many thanks simeonz and Brendan. Though I have a feeling something similar is going to come up again, hopefully not too badly

Re: Context Switch and Paging - Does IRET Care About Paging?

Posted: Sat Jun 23, 2018 3:45 pm
by rwosdev
I rewrote all this work in a cleaner fashion and improved interrupt stubs. I've got user mode programs loaded in a totally virtual address space now, and can relocate them anywhere. Tested rigorously, seems solid with no signs of the problem re-occurring.

I've dabbled with OS development and thought about the concepts on and off for about 10 years, but never made a serious effort until the past year. This to me is a real milestone. Thanks so much guys, esp. simeonz and Brendan, for your help!