Switching to virtual memory management

Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Switching to virtual memory management

Post by Brendan »

Hi,
psychobeagle12 wrote:Is it possible that I am going about this whole thing the wrong way? I feel like this is a fairly basic task of most systems and that I shouldn't bother continuing if I can't figure this part out!
To me, it seems like you're mostly going about it the right way - e.g. having minor bugs in the implementation (and not having large problems with the design).

For an example, how does the ".startup" section get loaded at 0x0007E000 by multi-boot?


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
psychobeagle12
Member
Posts: 41
Joined: Wed Oct 26, 2011 9:31 am

Post by psychobeagle12 »

Brendan wrote: To me, it seems like you're mostly going about it the right way - e.g. having minor bugs in the implementation (and not having large problems with the design).

For an example, how does the ".startup" section get loaded at 0x0007E000 by multi-boot?
I don't think I understand the question. GRUB parses the ELF executable and determines the load addresses for each section, correct? At least, again, that was my understanding. So, based on my readelf output, GRUB will correctly load .startup to 0x7e000, .text to 0x00100000, and .data, .bss to 0x00102000. Again, I think I am misunderstanding the question, or maybe the point of the question. I am sure that there is something quite important in your response, my mind just isn't putting it together...
My i386-based kernel: https://github.com/bmelikant/missy-kernel
Picking a name for my kernel was harder than picking my wife, so I just used her name until I decide!
eryjus
Member
Posts: 286
Joined: Fri Oct 21, 2011 9:47 pm
Libera.chat IRC: eryjus
Location: Tustin, CA USA

Post by eryjus »

Code:

mov ebx,0x10000 | PAGE_PRIVILEGE

CRAP!!! I totally missed this the first time through... and the second, and the third...

You're missing a 0.
Adam

The name is fitting: Century Hobby OS -- At this rate, it's gonna take me that long!
Read about my mistakes and missteps with this iteration: Journal

"Sometimes things just don't make sense until you figure them out." -- Phil Stahlheber
Post by psychobeagle12 »

eryjus wrote:

Code:

mov ebx,0x10000 | PAGE_PRIVILEGE

CRAP!!! I totally missed this the first time through... and the second, and the third...

You're missing a 0.
I WAS missing a zero lol. I corrected this several posts ago (it was a typo). See? Every time I think I have it beat, it still fails...
Post by eryjus »

Bummer! I thought that was it. This has become as much a quest for me as it is for you!

What is bothering me is the value in CR2. It does not reconcile with the value in EIP and the code at that address. For a page fault, CR2 holds the linear address whose access caused the fault, and EIP points at the faulting instruction. The value in eax/CR2 is well beyond what you have mapped in your page tables. I'm honestly debating whether you are getting a page fault or a GPF leading to the triple fault.

Someone smarter than me is going to have to chime in with a big shove in the right direction.
Adam

Post by psychobeagle12 »

That was one of the first problems I noticed as well, the completely odd value of CR2. I didn't think that the GPF following PF would change CR2. Or does it? I felt that the issue was PF->GPF->TF since there are two references in the bochs output to

Code:

interrupt(): vector must be within IDT table limits, IDT.limit = 0x0
Just seemed to make sense given the fact that paging had just been enabled.
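
The Bochs message fits that chain. As a rough sketch (assuming 8-byte protected-mode gate descriptors; the function names are hypothetical and only illustrate the limit check), a vector can be delivered only if its whole gate lies inside the IDT. With IDT.limit = 0 the page fault (vector 14) cannot be delivered, the resulting second exception escalates to a double fault (vector 8), and that cannot be delivered either, which would account for two occurrences of the message before the triple fault:

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of one protected-mode IDT gate descriptor, in bytes. */
#define IDT_GATE_SIZE 8u

/* A vector is deliverable only if the last byte of its gate
 * is at or below IDT.limit. */
static bool vector_within_idt(uint8_t vector, uint16_t idt_limit)
{
    uint32_t last_byte = (uint32_t)vector * IDT_GATE_SIZE + (IDT_GATE_SIZE - 1u);
    return last_byte <= idt_limit;
}

/* Hypothetical helper: true when neither the page fault (vector 14)
 * nor the double fault (vector 8) has a gate inside the IDT, so a
 * page fault would end in a triple fault. */
static bool page_fault_escalates_to_triple_fault(uint16_t idt_limit)
{
    return !vector_within_idt(14, idt_limit) && !vector_within_idt(8, idt_limit);
}
```

With `idt_limit` equal to 0 the second function returns true; with a full 256-entry IDT (limit 0x7FF) it returns false.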

Edit: I'm not the only one who, looking at this code, thinks that my way of setting up paging should work, right? Like I'm not on some wild goose chase to a solution to a deeper problem?
Post by Brendan »

Hi,
psychobeagle12 wrote:
Brendan wrote:For an example, how does the ".startup" section get loaded at 0x0007E000 by multi-boot?
I don't think I understand the question. GRUB parses the ELF executable and determines the load addresses for each section, correct? At least, again, that was my understanding. So, based on my readelf output, GRUB will correctly load .startup to 0x7e000, .text to 0x00100000, and .data, .bss to 0x00102000. Again, I think I am misunderstanding the question, or maybe the point of the question. I am sure that there is something quite important in your response, my mind just isn't putting it together...
Sadly the multi-boot specification itself says nothing about which load addresses are valid and which aren't; and doesn't explicitly say that you can't ask to be loaded at (e.g.) physical address 0x00000000 (and trash the BDA) or 0x0009C000 (and trash the EBDA) or 0x000C0000 (video ROM) or 0x000F0000 (BIOS ROM).

My understanding is that (for multi-boot), everything in the first 1 MiB of memory is not guaranteed to be usable (e.g. there's no guarantee that 0x0007E000 isn't in use by the boot loader itself), and the only safe load addresses are 0x00100000 or higher (but not too much higher as there's no guarantee that the computer has enough RAM either).
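
That rule of thumb can be written down as a small check. This is only a sketch of the conservative reading given above, not something the multi-boot specification defines: the function name and the `ram_end` parameter (first address past usable RAM, assumed to be known from elsewhere) are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define ONE_MIB 0x00100000u

/* Conservative rule of thumb: treat everything below 1 MiB as possibly
 * in use (BDA, EBDA, ROMs, the boot loader itself), and require the
 * image to fit in the RAM that exists at or above 1 MiB.
 * ram_end is the first address past usable RAM (assumed known). */
static bool load_range_is_safe(uint32_t load_addr, uint32_t size, uint32_t ram_end)
{
    if (load_addr < ONE_MIB)
        return false;               /* low memory: no guarantee it is free */
    if (load_addr >= ram_end)
        return false;               /* starts beyond installed RAM */
    return size <= ram_end - load_addr;
}
```

Under this check a 4 KiB section at 0x0007E000 is rejected, while one at 0x00100000 on a machine with 32 MiB of RAM is accepted.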


Cheers,

Brendan
Post by Brendan »

Hi,
psychobeagle12 wrote:That was one of the first problems I noticed as well, the completely odd value of CR2. I didn't think that the GPF following PF would change CR2. Or does it?
GPF doesn't change CR2.

Have you tried putting a magic breakpoint ("xchg ebx,ebx") just before the "jmp 0x08:paging_code" instruction, and then inspecting the page directory, page tables, contents of RAM at (virtual address) 0xC0101500, etc; before the crash occurs?


Cheers,

Brendan
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance
Post by Combuster »

CR2=CR0 is a rather typical symptom. Consider:

Code:

mov eax, cr0
or eax, cr0_pg
mov cr0, eax
(...)
EAX now contains the value of CR0. When paging kicks in and the mapping is off, the following code looks like empty memory instead:

Code:

db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
That disassembles to a long sequence of add [eax], al; add [eax], al; etc. We also know that eax equals CR0, making this a very specific symptom that there actually is a page table for that address (otherwise, CR2=EIP), but that your page table points at the wrong location.
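
To see why two zero bytes read as "add [eax], al", here is a deliberately tiny decoder sketch in C. It assumes 32-bit addressing and handles only opcode 0x00 (ADD r/m8, r8) with a plain register-indirect ModRM byte; the function name is hypothetical and this is nowhere near a full disassembler.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const REG8[8]  = {"al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"};
static const char *const REG32[8] = {"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};

/* Decode "00 /r" where ModRM has mod == 00 and names a plain [reg32]
 * operand. Returns 0 and writes the mnemonic on success, or -1 for any
 * form this sketch does not cover (SIB, disp32, other opcodes). */
static int decode_add_rm8_r8(const uint8_t bytes[2], char *out, size_t out_len)
{
    unsigned opcode = bytes[0];
    unsigned modrm  = bytes[1];
    unsigned mod    = modrm >> 6;         /* addressing mode */
    unsigned reg    = (modrm >> 3) & 7u;  /* source 8-bit register */
    unsigned rm     = modrm & 7u;         /* base register of the memory operand */

    if (opcode != 0x00u || mod != 0u || rm == 4u || rm == 5u)
        return -1;

    snprintf(out, out_len, "add [%s], %s", REG32[rm], REG8[reg]);
    return 0;
}
```

For the byte pair 00 00, every ModRM field is zero, which selects [eax] as the destination and al as the source.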
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Post by psychobeagle12 »

I was actually only paying attention to the bochs log. I'll add the magic breakpoint and run a debugging session, see what I can find out. So now, operating under the assumption (unverified until tested) that my page tables are off, is there anything incorrect in the code that jumps out as causing the inconsistency?

Edit:

Ok, I did a memory dump right after the crash (bochs breaks at triple-fault) and the page tables are all correct, as is the page directory. All of the mappings as far as I can tell are correct. Here is a dump of page table 0 (not the whole thing):

Code:

0x0007f000 <bogus+       0>:	0x00000003	0x00001003	0x00002003	0x00003003
0x0007f010 <bogus+      16>:	0x00004003	0x00005003	0x00006003	0x00007003
0x0007f020 <bogus+      32>:	0x00008003	0x00009003	0x0000a003	0x0000b003
0x0007f030 <bogus+      48>:	0x0000c003	0x0000d003	0x0000e003	0x0000f003
0x0007f040 <bogus+      64>:	0x00010003	0x00011003	0x00012003	0x00013003
0x0007f050 <bogus+      80>:	0x00014003	0x00015003	0x00016003	0x00017003
0x0007f060 <bogus+      96>:	0x00018003	0x00019003	0x0001a003	0x0001b003
0x0007f070 <bogus+     112>:	0x0001c003	0x0001d003	0x0001e003	0x0001f003
0x0007f080 <bogus+     128>:	0x00020003	0x00021003	0x00022003	0x00023003
0x0007f090 <bogus+     144>:	0x00024003	0x00025003	0x00026003	0x00027003
0x0007f0a0 <bogus+     160>:	0x00028003	0x00029003	0x0002a003	0x0002b003
0x0007f0b0 <bogus+     176>:	0x0002c003	0x0002d003	0x0002e003	0x0002f003
0x0007f0c0 <bogus+     192>:	0x00030003	0x00031003	0x00032003	0x00033003
0x0007f0d0 <bogus+     208>:	0x00034003	0x00035003	0x00036003	0x00037003
0x0007f0e0 <bogus+     224>:	0x00038003	0x00039003	0x0003a003	0x0003b003
0x0007f0f0 <bogus+     240>:	0x0003c003	0x0003d003	0x0003e003	0x0003f003
0x0007f100 <bogus+     256>:	0x00040003	0x00041003	0x00042003	0x00043003
0x0007f110 <bogus+     272>:	0x00044003	0x00045003	0x00046003	0x00047003
0x0007f120 <bogus+     288>:	0x00048003	0x00049003	0x0004a003	0x0004b003
0x0007f130 <bogus+     304>:	0x0004c003	0x0004d003	0x0004e003	0x0004f003
0x0007f140 <bogus+     320>:	0x00050003	0x00051003	0x00052003	0x00053003
0x0007f150 <bogus+     336>:	0x00054003	0x00055003	0x00056003	0x00057003
0x0007f160 <bogus+     352>:	0x00058003	0x00059003	0x0005a003	0x0005b003
0x0007f170 <bogus+     368>:	0x0005c003	0x0005d003	0x0005e003	0x0005f003
0x0007f180 <bogus+     384>:	0x00060003	0x00061003	0x00062003	0x00063003
0x0007f190 <bogus+     400>:	0x00064003	0x00065003	0x00066003	0x00067003
0x0007f1a0 <bogus+     416>:	0x00068003	0x00069003	0x0006a003	0x0006b003
0x0007f1b0 <bogus+     432>:	0x0006c003	0x0006d003	0x0006e003	0x0006f003
0x0007f1c0 <bogus+     448>:	0x00070003	0x00071003	0x00072003	0x00073003
0x0007f1d0 <bogus+     464>:	0x00074003	0x00075003	0x00076003	0x00077003
0x0007f1e0 <bogus+     480>:	0x00078003	0x00079003	0x0007a003	0x0007b003
0x0007f1f0 <bogus+     496>:	0x0007c003	0x0007d003	0x0007e003	0x0007f003
0x0007f200 <bogus+     512>:	0x00080003	0x00081003	0x00082003	0x00083003
0x0007f210 <bogus+     528>:	0x00084003	0x00085003	0x00086003	0x00087003
0x0007f220 <bogus+     544>:	0x00088003	0x00089003	0x0008a003	0x0008b003
0x0007f230 <bogus+     560>:	0x0008c003	0x0008d003	0x0008e003	0x0008f003
0x0007f240 <bogus+     576>:	0x00090003	0x00091003	0x00092003	0x00093003
0x0007f250 <bogus+     592>:	0x00094003	0x00095003	0x00096003	0x00097003
0x0007f260 <bogus+     608>:	0x00098003	0x00099003	0x0009a003	0x0009b003
0x0007f270 <bogus+     624>:	0x0009c003	0x0009d003	0x0009e003	0x0009f003
0x0007f280 <bogus+     640>:	0x000a0003	0x000a1003	0x000a2003	0x000a3003
0x0007f290 <bogus+     656>:	0x000a4003	0x000a5003	0x000a6003	0x000a7003
0x0007f2a0 <bogus+     672>:	0x000a8003	0x000a9003	0x000aa003	0x000ab003
0x0007f2b0 <bogus+     688>:	0x000ac003	0x000ad003	0x000ae003	0x000af003
0x0007f2c0 <bogus+     704>:	0x000b0003	0x000b1003	0x000b2003	0x000b3003
0x0007f2d0 <bogus+     720>:	0x000b4003	0x000b5003	0x000b6003	0x000b7003
0x0007f2e0 <bogus+     736>:	0x000b8003	0x000b9003	0x000ba003	0x000bb003
0x0007f2f0 <bogus+     752>:	0x000bc003	0x000bd003	0x000be003	0x000bf003
0x0007f300 <bogus+     768>:	0x000c0003	0x000c1003	0x000c2003	0x000c3003
0x0007f310 <bogus+     784>:	0x000c4003	0x000c5003	0x000c6003	0x000c7003
0x0007f320 <bogus+     800>:	0x000c8003	0x000c9003	0x000ca003	0x000cb003
0x0007f330 <bogus+     816>:	0x000cc003	0x000cd003	0x000ce003	0x000cf003
0x0007f340 <bogus+     832>:	0x000d0003	0x000d1003	0x000d2003	0x000d3003
0x0007f350 <bogus+     848>:	0x000d4003	0x000d5003	0x000d6003	0x000d7003
0x0007f360 <bogus+     864>:	0x000d8003	0x000d9003	0x000da003	0x000db003
0x0007f370 <bogus+     880>:	0x000dc003	0x000dd003	0x000de003	0x000df003
0x0007f380 <bogus+     896>:	0x000e0003	0x000e1003	0x000e2003	0x000e3003
0x0007f390 <bogus+     912>:	0x000e4003	0x000e5003	0x000e6003	0x000e7003
0x0007f3a0 <bogus+     928>:	0x000e8003	0x000e9003	0x000ea003	0x000eb003
0x0007f3b0 <bogus+     944>:	0x000ec003	0x000ed003	0x000ee003	0x000ef003
0x0007f3c0 <bogus+     960>:	0x000f0003	0x000f1003	0x000f2003	0x000f3003
0x0007f3d0 <bogus+     976>:	0x000f4003	0x000f5003	0x000f6003	0x000f7003
0x0007f3e0 <bogus+     992>:	0x000f8003	0x000f9003	0x000fa003	0x000fb003
0x0007f3f0 <bogus+    1008>:	0x000fc003	0x000fd003	0x000fe003	0x000ff003
0x0007f400 <bogus+    1024>:	0x00100023	0x00101003	0x00102003	0x00103003
0x0007f410 <bogus+    1040>:	0x00104003	0x00105003	0x00106003	0x00107003
0x0007f420 <bogus+    1056>:	0x00108003	0x00109003	0x0010a003	0x0010b003
0x0007f430 <bogus+    1072>:	0x0010c003	0x0010d003	0x0010e003	0x0010f003
0x0007f440 <bogus+    1088>:	0x00110003	0x00111003	0x00112003	0x00113003
0x0007f450 <bogus+    1104>:	0x00114003	0x00115003	0x00116003	0x00117003
0x0007f460 <bogus+    1120>:	0x00118003	0x00119003	0x0011a003	0x0011b003
0x0007f470 <bogus+    1136>:	0x0011c003	0x0011d003	0x0011e003	0x0011f003
0x0007f480 <bogus+    1152>:	0x00120003	0x00121003	0x00122003	0x00123003
0x0007f490 <bogus+    1168>:	0x00124003	0x00125003	0x00126003	0x00127003
0x0007f4a0 <bogus+    1184>:	0x00128003	0x00129003	0x0012a003	0x0012b003
0x0007f4b0 <bogus+    1200>:	0x0012c003	0x0012d003	0x0012e003	0x0012f003
0x0007f4c0 <bogus+    1216>:	0x00130003	0x00131003	0x00132003	0x00133003
0x0007f4d0 <bogus+    1232>:	0x00134003	0x00135003	0x00136003	0x00137003
0x0007f4e0 <bogus+    1248>:	0x00138003	0x00139003	0x0013a003	0x0013b003
0x0007f4f0 <bogus+    1264>:	0x0013c003	0x0013d003	0x0013e003	0x0013f003
0x0007f500 <bogus+    1280>:	0x00140003	0x00141003	0x00142003	0x00143003
0x0007f510 <bogus+    1296>:	0x00144003	0x00145003	0x00146003	0x00147003
0x0007f520 <bogus+    1312>:	0x00148003	0x00149003	0x0014a003	0x0014b003
0x0007f530 <bogus+    1328>:	0x0014c003	0x0014d003	0x0014e003	0x0014f003
0x0007f540 <bogus+    1344>:	0x00150003	0x00151003	0x00152003	0x00153003
0x0007f550 <bogus+    1360>:	0x00154003	0x00155003	0x00156003	0x00157003
0x0007f560 <bogus+    1376>:	0x00158003	0x00159003	0x0015a003	0x0015b003
0x0007f570 <bogus+    1392>:	0x0015c003	0x0015d003	0x0015e003	0x0015f003
0x0007f580 <bogus+    1408>:	0x00160003	0x00161003	0x00162003	0x00163003
0x0007f590 <bogus+    1424>:	0x00164003	0x00165003	0x00166003	0x00167003
0x0007f5a0 <bogus+    1440>:	0x00168003	0x00169003	0x0016a003	0x0016b003
0x0007f5b0 <bogus+    1456>:	0x0016c003	0x0016d003	0x0016e003	0x0016f003
0x0007f5c0 <bogus+    1472>:	0x00170003	0x00171003	0x00172003	0x00173003
0x0007f5d0 <bogus+    1488>:	0x00174003	0x00175003	0x00176003	0x00177003
0x0007f5e0 <bogus+    1504>:	0x00178003	0x00179003	0x0017a003	0x0017b003
0x0007f5f0 <bogus+    1520>:	0x0017c003	0x0017d003	0x0017e003	0x0017f003
0x0007f600 <bogus+    1536>:	0x00180003
and also a small portion of pt 768:

Code:

0x00080000 <bogus+       0>:	0x00100003	0x00101003	0x00102003	0x00103003
0x00080010 <bogus+      16>:	0x00104003	0x00105003	0x00106003	0x00107003
0x00080020 <bogus+      32>:	0x00108003	0x00109003	0x0010a003	0x0010b003
0x00080030 <bogus+      48>:	0x0010c003	0x0010d003	0x0010e003	0x0010f003
0x00080040 <bogus+      64>:	0x00110003	0x00111003	0x00112003	0x00113003
0x00080050 <bogus+      80>:	0x00114003	0x00115003	0x00116003	0x00117003
0x00080060 <bogus+      96>:	0x00118003	0x00119003	0x0011a003	0x0011b003
0x00080070 <bogus+     112>:	0x0011c003	0x0011d003	0x0011e003	0x0011f003
0x00080080 <bogus+     128>:	0x00120003	0x00121003	0x00122003	0x00123003
0x00080090 <bogus+     144>:	0x00124003	0x00125003	0x00126003	0x00127003
and the page directory entries (0 and 768):

Code:

0x0007e000 <bogus+       0>:	0x0007f023
0x0007ec00 <bogus+       0>:	0x00080023

Post by Brendan »

Hi,

Bit 5 of a page table entry is an "accessed" bit. When the CPU accesses the page, it sets the bit.

Code:

0x0007f400 <bogus+    1024>:	0x00100023	0x00101003	0x00102003	0x00103003
The first page table entry here indicates that you've accessed something in the identity mapped area, in the page at 0x00100000.
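
As an illustration of reading those entries, the following C sketch splits a 32-bit (non-PAE) page table entry into its frame address and the common flag bits. The struct and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of a 32-bit (non-PAE) page table entry. */
#define PTE_PRESENT   (1u << 0)
#define PTE_WRITABLE  (1u << 1)
#define PTE_USER      (1u << 2)
#define PTE_ACCESSED  (1u << 5)
#define PTE_DIRTY     (1u << 6)
#define PTE_FRAME     0xFFFFF000u   /* physical frame address, bits 31..12 */

struct pte_info {
    uint32_t frame;
    bool present, writable, user, accessed, dirty;
};

/* Split a raw entry into its frame address and flag bits. */
static struct pte_info decode_pte(uint32_t pte)
{
    struct pte_info info;
    info.frame    = pte & PTE_FRAME;
    info.present  = (pte & PTE_PRESENT)  != 0;
    info.writable = (pte & PTE_WRITABLE) != 0;
    info.user     = (pte & PTE_USER)     != 0;
    info.accessed = (pte & PTE_ACCESSED) != 0;
    info.dirty    = (pte & PTE_DIRTY)    != 0;
    return info;
}
```

Decoding 0x00100023 gives frame 0x00100000 with present, writable and accessed set; its neighbour 0x00101003 has the same permissions but the accessed bit clear.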

Page directory entries have a similar "accessed" bit.

Code:

0x0007e000 <bogus+       0>:   0x0007f023
0x0007ec00 <bogus+       0>:   0x00080023
Both of these were accessed. We already know something in the identity mapped area was accessed. What was accessed in kernel space? None of the page table entries shown have the "accessed" bit set, but I can only see 40 of them. This implies that whatever was accessed was not in the range from 0xC0000000 to 0xC0027FFF (40 pages of 4 KiB each), but would've been in the range 0xC0028000 to 0xC03FFFFF. The address 0xC0101510 is in that range.

If we assume the CPU did access 0xC0101510 then you'd have to wonder what is at that virtual address. Does it contain a "jmp $" instruction, or does it contain an "add [eax], al" instruction (because the virtual page is full of zeroes)? If it did contain a "jmp $" then there's no likely way that it could've caused an exception (the only possible way would be an NMI).

All of this suggests that Combuster is right - the CPU successfully accesses the page but it's full of zeros, so the CPU gets an "add [eax], al" instruction, the value in EAX is still the same value you loaded into CR0 (e.g. 0xE0000011), so the CPU tries to add 0x11 to the value at (virtual address) 0xE0000011 and this triggers the page fault (due to "page not present").
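
A quick way to check the "page not present" part is to look up the page directory slot for that address. The sketch below assumes the page directory from the dump (only entries 0 and 768 populated) and uses 0xE0000011 purely as the example EAX value from this post; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PD_ENTRIES  1024u
#define PDE_PRESENT 1u

/* Page directory index: top 10 bits of the virtual address. */
static uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* True when the page directory has no present entry covering vaddr,
 * so any access faults before a page table is even consulted. */
static bool faults_at_directory_level(const uint32_t pd[PD_ENTRIES], uint32_t vaddr)
{
    return (pd[pd_index(vaddr)] & PDE_PRESENT) == 0;
}

/* AL is the low byte of EAX: the value "add [eax], al" would add. */
static uint8_t low_byte(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}
```

With entries 0 and 768 set to 0x0007f023 and 0x00080023, the address 0xE0000011 lands in slot 896, which is empty, so the access faults and CR2 ends up holding the same value as EAX.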

If that is the case; the next question is why the page (at physical address 0x00100000) contains zeros. Was it loaded properly and then trashed by something; or was it not loaded properly?


Cheers,

Brendan
Post by psychobeagle12 »

Ah, I see now. This is becoming much clearer. I'll look into the page at 0xC0101510!

edit: If I am correct, the pte I am interested in is at 0x80404:

Code:

0x00080400 <bogus+    1024>:	0x00200003	0x00201003	0x00202023	0x00203003
The page was accessed by the processor! So my page most likely IS being trashed!! Ok, problem narrowed down. Now to figure out why my page is being trashed. One last time though, it ISN'T a problem with the mapping code, right? I am mapping everything correctly?

Edit: I just had a further AHA moment! My virtual address is mapping to 0x201000, and the code is loaded at physical address 0x101000!!! I'll fix it :) Thank you guys so much for helping me bounce this around!
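
The arithmetic behind that lookup can be sketched in C. This assumes plain 32-bit paging with 4 KiB pages and the page table 768 base of 0x00080000 seen in the dumps; the function names are made up for illustration.

```c
#include <stdint.h>

#define PTE_SIZE 4u   /* bytes per page table entry */

/* Top 10 bits select the page directory entry. */
static uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Middle 10 bits select the entry inside that page table. */
static uint32_t pt_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3FFu;
}

/* Physical address of the entry that maps vaddr, given the table base. */
static uint32_t pte_address(uint32_t pt_base, uint32_t vaddr)
{
    return pt_base + pt_index(vaddr) * PTE_SIZE;
}

/* Physical address that vaddr resolves to through the given entry. */
static uint32_t translate(uint32_t pte, uint32_t vaddr)
{
    return (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
}
```

For 0xC0101510 this gives directory index 768, table index 257 and an entry address of 0x00080404. Through the old entry 0x00201003 the address resolves to 0x00201510, one megabyte above where the code was actually loaded.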

Edit (edit): Made the following changes:

Code:

	mov eax,PAGE_TABLE_768_ADDR+0x400
	mov ebx,0x100000 | PAGE_PRIVILEGE
	mov ecx,PAGE_TABLE_ENTRIES-256
The kernel bootstrap is working now :D Again, a thousand thanks!
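
For reference, here is a C model of what the corrected setup appears to do. The fill loop itself is not shown in this thread, so this is a guess at its shape: it assumes PAGE_PRIVILEGE is 0x3 (present and writable, matching the flag bits in the dumps), PAGE_TABLE_ENTRIES is 1024, and each entry maps the next 4 KiB frame.

```c
#include <stdint.h>

#define PAGE_TABLE_ENTRIES 1024u
#define PAGE_SIZE          0x1000u
#define PAGE_PRIVILEGE     0x3u    /* assumed: present + writable */

/* Write `count` consecutive entries starting at `first_entry`, each one
 * mapping the next 4 KiB physical frame after `first_value`. */
static void fill_entries(uint32_t table[PAGE_TABLE_ENTRIES],
                         uint32_t first_entry, uint32_t first_value,
                         uint32_t count)
{
    uint32_t value = first_value;
    for (uint32_t i = 0; i < count && first_entry + i < PAGE_TABLE_ENTRIES; i++) {
        table[first_entry + i] = value;
        value += PAGE_SIZE;
    }
}
```

Starting at byte offset 0x400 means starting at entry 256 (0x400 / 4), so calling `fill_entries(table, 256, 0x100000 | PAGE_PRIVILEGE, PAGE_TABLE_ENTRIES - 256)` puts 0x00101003 in entry 257, and 0xC0101510 then resolves to physical 0x00101510.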