I can't figure out what to do after having activated paging.

Post by ImSAMazing »

Hi guys,

I'm sorry if this comes across as a noob question, but I've gotten stuck in a major way and just don't know what to do next...
First of all, I managed to set up Meaty Skeleton successfully, including writing a few functions for things like printing coloured messages and moving every line up by one when the end of the buffer is reached.
But now I want to implement proper paging. I've looked all over the wiki, but the information is very spread out: some pages barely contain any info while others are all over the place, so I've been having a hard time figuring it out. I followed this 'tutorial' on how to enable basic paging (https://wiki.osdev.org/Higher_Half_x86_Bare_Bones), and it works: replacing the relevant parts of my Meaty Skeleton code with the code there does not stop my OS from booting and writing to the screen. I even get a value for _kernel_end, which corresponds to 75 according to my itoa function.

But, now my question: what next? Paging is turned on, but where do I create my code to manage pages? To get a memory address to use from a page? What kind of code should that be? I've tried messing around a little with pointers like

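Code:

char buf[12];                       /* scratch buffer for itoa instead of a string literal */
int *ptr = (int *) 0xC0000000;      /* the address computed below */
*ptr = 3;
printf("%s", itoa(*ptr, buf, 10));
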
but when I mess around with memory addresses like that, the OS just crashes on boot (which I guess is understandable due to the R/W protection that comes with paging being turned on...). Btw, I calculated that address from the fact that VGA was pointed at the 1024th page in the tutorial I linked above, so _vgaaddress_ - 1023*0x1000 = 0xc0000000, which then should be the address of the first page...
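
For reference, the arithmetic above can be checked with a small sketch. It assumes 32-bit x86 paging with 4 KiB pages, as in the tutorial; the helper names `pd_index`, `pt_index` and `page_virt` are made up for illustration and are not part of Meaty Skeleton or the wiki code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* 4 KiB pages */

/* Bits 31..22 of a virtual address select the page directory entry. */
uint32_t pd_index(uint32_t virt) { return virt >> 22; }

/* Bits 21..12 select the entry inside that page table. */
uint32_t pt_index(uint32_t virt) { return (virt >> 12) & 0x3FFu; }

/* Virtual address of the page reached through directory entry pd
 * and page table entry pt. */
uint32_t page_virt(uint32_t pd, uint32_t pt) {
    assert(pd < 1024u && pt < 1024u);
    return (pd << 22) | (pt << 12);
}
```

With these helpers, `page_virt(768, 1023)` is 0xC03FF000 (the VGA mapping from the tutorial), and subtracting 1023 * PAGE_SIZE gives 0xC0000000, which is `page_virt(768, 0)`. So the calculation itself holds; whether that first page is actually marked present in the page table is a separate question.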

Again, I'm new to this whole memory management thing and I just can't get a firm enough understanding on my own of what I'm supposed to do now or if I'm going about it the wrong way...



Any help is appreciated guys, thanks!

Post by Ethin »

That may not be the first address of your page at all. Right now the VGA buffer (if you're using VGA) is still at 0xb8000. Until you map it to a page in memory (say, the address you wrote) you must still use 0xb8000. If you're in 320x200 VGA mode, your address would be 0xa0000.
You have two options:
1. Set up paging, but use physical RAM. This is undesirable because it circumvents paging entirely and makes it useless.
2. Map the pages you want to read/write to into memory with (say) a memory allocator. That's probably what you should write now anyway, since you have paging enabled.

Post by ImSAMazing »

Ethin wrote: That may not be the first address of your page at all. Right now the VGA buffer (if you're using VGA) is still at 0xb8000. Until you map it to a page in memory (say, the address you wrote) you must still use 0xb8000. If you're in 320x200 VGA mode, your address would be 0xa0000.
You have two options:
1. Set up paging, but use physical RAM. This is undesirable because it circumvents paging entirely and makes it useless.
2. Map the pages you want to read/write to into memory with (say) a memory allocator. That's probably what you should write now anyway, since you have paging enabled.
Thanks for responding! To answer your points
1. I did map it to that address, as explained in the tutorial. I changed my VGA code and it now prints to the new address at 0xC..., which works (it prints what it's supposed to on the VGA screen).
2. So I should be looking for 'how to write a memory allocator'? If you have any links, that'd be awesome.

To clarify, this is my current boot.s file:

Code:

# Declare constants for the multiboot header.
.set ALIGN,    1<<0             # align loaded modules on page boundaries
.set MEMINFO,  1<<1             # provide memory map
.set FLAGS,    ALIGN | MEMINFO  # this is the Multiboot 'flag' field
.set MAGIC,    0x1BADB002       # 'magic number' lets bootloader find the header
.set CHECKSUM, -(MAGIC + FLAGS) # checksum of above, to prove we are multiboot

# Declare a multiboot header that marks the program as a kernel.
.section .multiboot
.align 4
.long MAGIC
.long FLAGS
.long CHECKSUM

# Allocate the initial stack.
.section .bootstrap_stack, "aw", @nobits
stack_bottom:
.skip 16384 # 16 KiB
stack_top:

# Preallocate pages used for paging. Don't hard-code addresses and assume they
# are available, as the bootloader might have loaded its multiboot structures or
# modules there. This lets the bootloader know it must avoid the addresses.
.section .bss, "aw", @nobits
	.align 4096
boot_page_directory:
	.skip 4096
boot_page_table1:
	.skip 4096
# Further page tables may be required if the kernel grows beyond 3 MiB.

# The kernel entry point.
.section .text
.global _start
.type _start, @function
_start:
	# Physical address of boot_page_table1.
	# TODO: I recall seeing some assembly that used a macro to do the
	#       conversions to and from physical. Maybe this should be done in this
	#       code as well?
	movl $(boot_page_table1 - 0xC0000000), %edi
	# First address to map is address 0.
	# TODO: Start at the first kernel page instead. Alternatively map the first
	#       1 MiB as it can be generally useful, and there's no need to
	#       specially map the VGA buffer.
	movl $0, %esi
	# Map 1023 pages. The 1024th will be the VGA text buffer.
	movl $1023, %ecx

1:
	# Only map the kernel.
	cmpl $(_kernel_start - 0xC0000000), %esi
	jl 2f
	cmpl $(_kernel_end - 0xC0000000), %esi
	jge 3f

	# Map physical address as "present, writable". Note that this maps
	# .text and .rodata as writable. Mind security and map them as non-writable.
	movl %esi, %edx
	orl $0x003, %edx
	movl %edx, (%edi)

2:
	# Size of page is 4096 bytes.
	addl $4096, %esi
	# Size of entries in boot_page_table1 is 4 bytes.
	addl $4, %edi
	# Loop to the next entry if we haven't finished.
	loop 1b

3:
	# Map VGA video memory to 0xC03FF000 as "present, writable".
	movl $(0x000B8000 | 0x003), boot_page_table1 - 0xC0000000 + 1023 * 4

	# The page table is used at both page directory entry 0 (virtually from 0x0
	# to 0x3FFFFF) (thus identity mapping the kernel) and page directory entry
	# 768 (virtually from 0xC0000000 to 0xC03FFFFF) (thus mapping it in the
	# higher half). The kernel is identity mapped because enabling paging does
	# not change the next instruction, which continues to be physical. The CPU
	# would instead page fault if there was no identity mapping.

	# Map the page table to both virtual addresses 0x00000000 and 0xC0000000.
	movl $(boot_page_table1 - 0xC0000000 + 0x003), boot_page_directory - 0xC0000000 + 0
	movl $(boot_page_table1 - 0xC0000000 + 0x003), boot_page_directory - 0xC0000000 + 768 * 4

	# Set cr3 to the address of the boot_page_directory.
	movl $(boot_page_directory - 0xC0000000), %ecx
	movl %ecx, %cr3

	# Enable paging and the write-protect bit.
	movl %cr0, %ecx
	orl $0x80010000, %ecx
	movl %ecx, %cr0

	# Jump to higher half with an absolute jump. 
	lea 4f, %ecx
	jmp *%ecx

4:
	# At this point, paging is fully set up and enabled.

	# Unmap the identity mapping as it is now unnecessary. 
	movl $0, boot_page_directory + 0

	# Reload cr3 to force a TLB flush so the changes take effect.
	movl %cr3, %ecx
	movl %ecx, %cr3

	# Set up the stack.
	mov $stack_top, %esp

	# Enter the high-level kernel.
	call kernel_main

	# Infinite loop if the system has nothing more to do.
	cli
1:	hlt
	jmp 1b

So yeah, I just have no clue how to proceed from here. How do I write a memory allocator? Why does my OS crash if I try to access a specific address? (Aside from the VGA address: that one does not crash, which I guess makes sense because if the screen can view it, why not the code...) Paging is enabled (I think), so what now?

Again I followed this tutorial for the enabling of paging: https://wiki.osdev.org/Higher_Half_x86_Bare_Bones
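
One hedged way to see which addresses the boot.s above actually makes usable is to replay its mapping loop in C. The sketch below is a hosted simulation, not kernel code: `build_boot_table` is a hypothetical helper, the table is an ordinary array, and the kernel's physical start and end are passed in as parameters because their real values depend on the linker script.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define PTE_FLAGS 0x003u        /* present | writable, as in boot.s */
#define VGA_PHYS  0x000B8000u

/* Fills table[] the way the loop at labels 1:, 2: and 3: does. */
void build_boot_table(uint32_t table[1024],
                      uint32_t kernel_start_phys,
                      uint32_t kernel_end_phys) {
    assert(kernel_start_phys <= kernel_end_phys);
    /* Assumption: the .bss table starts out zero-filled. */
    memset(table, 0, 1024 * sizeof(uint32_t));

    uint32_t phys = 0;
    for (int i = 0; i < 1023; i++, phys += PAGE_SIZE) {
        if (phys < kernel_start_phys)
            continue;            /* jl 2f: below the kernel, skip */
        if (phys >= kernel_end_phys)
            break;               /* jge 3f: past the kernel, stop */
        table[i] = phys | PTE_FLAGS;
    }
    /* Entry 1023 is the VGA text buffer (virtual 0xC03FF000). */
    table[1023] = VGA_PHYS | PTE_FLAGS;
}
```

If the kernel is assumed to occupy physical 0x00100000 to 0x00104000, entries 256 to 259 and entry 1023 are the only ones marked present. Entry 0, which is what virtual 0xC0000000 goes through, stays zero, so touching that address page-faults even though the arithmetic that produced it is correct.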

Post by iansjack »

Ethin wrote:Until you map it to a page in memory (say, the address you wrote) you must still use 0xb8000.
That is incorrect. Until you map the address to a page in memory you can't use it at all. You can't just access the physical address when paging is enabled - you have to use a mapped logical address.
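
What "mapped" means here can be illustrated with a software version of the lookup the MMU performs for 32-bit paging with 4 KiB pages. This is only a sketch under stated assumptions: the types `page_table_t` and `page_directory_t` and the function `translate` are hypothetical, and the directory holds host pointers to tables instead of physical addresses so that it can run as a normal program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x001u

/* Hosted stand-ins for the hardware structures (assumption). */
typedef struct { uint32_t entries[1024]; } page_table_t;
typedef struct { page_table_t *tables[1024]; } page_directory_t;

/* Returns true and stores the physical address if virt is mapped;
 * returns false where real hardware would raise a page fault. */
bool translate(const page_directory_t *dir, uint32_t virt,
               uint32_t *phys_out) {
    assert(dir != NULL && phys_out != NULL);

    const page_table_t *table = dir->tables[virt >> 22];
    if (table == NULL)
        return false;                     /* no page table here */

    uint32_t pte = table->entries[(virt >> 12) & 0x3FFu];
    if (!(pte & PTE_PRESENT))
        return false;                     /* page not present */

    /* Frame address from the entry, offset from the low 12 bits. */
    *phys_out = (pte & 0xFFFFF000u) | (virt & 0xFFFu);
    return true;
}
```

With only the VGA page mapped at 0xC03FF000, a lookup of 0xC03FF004 yields 0x000B8004, while a lookup of 0x000B8000 itself fails: once paging is on, the physical address is not reachable unless some virtual address maps to it.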

Post by ImSAMazing »

iansjack wrote:
Ethin wrote:Until you map it to a page in memory (say, the address you wrote) you must still use 0xb8000.
That is incorrect. Until you map the address to a page in memory you can't use it at all. You can't just access the physical address when paging is enabled - you have to use a mapped logical address.
Hm, makes sense. So my code does map the VGA address correctly, but how would I map other addresses for my OS to use, without hardcoding them all into the boot assembly file?
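
A common shape for this is a small `map_page` routine called from C instead of more assembly. The following is a minimal hosted sketch, not the wiki's or anyone's actual implementation: the types and the function name are hypothetical, the directory stores host pointers rather than physical addresses, and `calloc` stands in for whatever physical frame allocator the kernel ends up with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define PTE_PRESENT  0x001u
#define PTE_WRITABLE 0x002u

/* Hosted stand-ins for the hardware structures (assumption). */
typedef struct { uint32_t entries[1024]; } page_table_t;
typedef struct { page_table_t *tables[1024]; } page_directory_t;

/* Maps one 4 KiB page. Returns false if the page is already mapped
 * or if no memory is available for a new page table. */
bool map_page(page_directory_t *dir, uint32_t virt, uint32_t phys,
              uint32_t flags) {
    /* Both addresses must be page aligned. */
    assert((virt & 0xFFFu) == 0 && (phys & 0xFFFu) == 0);

    uint32_t pdi = virt >> 22;
    uint32_t pti = (virt >> 12) & 0x3FFu;

    if (dir->tables[pdi] == NULL) {
        /* calloc is a placeholder for a zeroed frame from a frame allocator. */
        dir->tables[pdi] = calloc(1, sizeof(page_table_t));
        if (dir->tables[pdi] == NULL)
            return false;
    }

    uint32_t *pte = &dir->tables[pdi]->entries[pti];
    if (*pte & PTE_PRESENT)
        return false;                     /* refuse to overwrite */

    *pte = phys | (flags & 0xFFFu) | PTE_PRESENT;
    return true;
}
```

In a real kernel the directory entry would hold the physical address of the new table plus flags, the table would have to be reachable through some virtual address while it is being edited, and the stale TLB entry would need to be invalidated (for example with `invlpg`) after changing an existing mapping.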

Post by Ethin »

@iansjack: I stand corrected then. Thank you.

Post by ~ »

Just remember that the paging structures themselves (page directory, page tables) never need to be mapped for paging to work. You only need to map them while you are actually accessing them, through special-purpose "Data View" visor pages, and you can unmap them again afterwards. You should still mark those pages as used in the physical memory bitmap (you can find out what those addresses are, as they will now only be referenced in the paging structures themselves).

I've tested my code in Bochs and it works well in this way, with paging structures always at unmapped addresses.

I say this because it makes paging code much easier: you don't need to keep the paging tables mapped, only actual data pages (double mapping) when you run out of free mapped addresses.

You just look for free pages in a memory bitmap sized for the installed memory, map the page temporarily in a visor page that the kernel reserves only for this, set the entries, and unmap it so the visor page is free for other allocations.
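
The bitmap part of this can be sketched as follows. This is not the poster's code: the names `frame_alloc`, `frame_free` and `frame_mark_used` are hypothetical, and `MAX_FRAMES` is an arbitrary placeholder that a real kernel would derive from the memory map the bootloader provides.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE 4096u
#define MAX_FRAMES 1024u       /* placeholder: covers only 4 MiB */
#define NO_FRAME   UINT32_MAX  /* never a valid page-aligned address */

/* One bit per physical frame; a set bit means "in use". */
static uint8_t frame_bitmap[MAX_FRAMES / 8];

void frame_mark_used(uint32_t frame) {
    assert(frame < MAX_FRAMES);
    frame_bitmap[frame / 8] |= (uint8_t)(1u << (frame % 8));
}

void frame_free(uint32_t phys) {
    uint32_t frame = phys / FRAME_SIZE;
    assert(phys % FRAME_SIZE == 0 && frame < MAX_FRAMES);
    frame_bitmap[frame / 8] &= (uint8_t)~(1u << (frame % 8));
}

/* Returns the physical address of the first free frame and marks it
 * used, or NO_FRAME if every frame is taken. */
uint32_t frame_alloc(void) {
    for (uint32_t frame = 0; frame < MAX_FRAMES; frame++) {
        if (!(frame_bitmap[frame / 8] & (1u << (frame % 8)))) {
            frame_mark_used(frame);
            return frame * FRAME_SIZE;
        }
    }
    return NO_FRAME;
}
```

Before the first allocation, the frames holding the kernel image, the paging structures and any reserved regions would be passed to `frame_mark_used` so they are never handed out.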

I also keep groups of 4 free contiguous physical pages of 4K.
The intention is to make it impossible to run out of enough contiguous unfragmented pages when needed, as now I always will have at least enough room to create a new page directory/page table/data page/free visor page or hardware block of memory whenever needed, of 4 pages (16384 bytes).
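
Finding such a group in a bitmap comes down to scanning for a run of clear bits. The sketch below shows one simple way to do that; `find_free_run` is a hypothetical name, the bit layout (set bit means "in use") is an assumption, and it does not try to reproduce the reservation scheme described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_RUN SIZE_MAX

static bool bit_is_set(const uint8_t *bitmap, size_t i) {
    return (bitmap[i / 8] >> (i % 8)) & 1u;
}

/* Returns the index of the first frame of the first run of `count`
 * contiguous free frames, or NO_RUN if there is no such run. */
size_t find_free_run(const uint8_t *bitmap, size_t total_frames,
                     size_t count) {
    assert(bitmap != NULL && count > 0);

    size_t run = 0;  /* length of the current streak of free frames */
    for (size_t i = 0; i < total_frames; i++) {
        run = bit_is_set(bitmap, i) ? 0 : run + 1;
        if (run == count)
            return i + 1 - count;
    }
    return NO_RUN;
}
```

Calling it with `count` set to 4 looks for a 16384-byte block; the caller would then mark all four frames as used before handing the block out.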


If I get fragmented pages, I plan to return a type of struct, if I get unfragmented pages, I plan to return another type of struct, for the physical-level bitmap-based pages, so I can produce unfragmented virtual addresses too.

I need to study if I need a minimum of 4 virtual contiguous pages, or groups of 64KB (16 pages) for making unbearable fragmentation of the virtual space impossible too. For virtual addresses, probably 16 contiguous pages (64K blocks) would be the best to keep compatibility with all low-level CPU modes and decent minimum data sizes.



See my kernel. I've been working on this the whole year, and at the end of it I will post a separate library for implementing very simple paging with bitmaps, malloc, free, realloc, or at least the supporting code for making it possible (also to copy/share data between page directories).

Post by Schol-R-LEA »

Oops, I didn't notice you earlier, sorry. I try to get this out to all new members, to get everyone started on the same footing; while I normally send it by PM, every once in a while I'll post it to the forum (especially if I have just updated it, as I did today). I hope this helps.
----------------------------------

The first thing I want to say is this: if you aren't already using version control for all software projects you are working on, drop everything and start to do that now. Set up a VCS such as Git, Subversion, Mercurial, Bazaar, or what have you - which you use is less important than the fact that you need to use it. Similarly, setting up your repos on an offsite host such as Gitlab, Github, Sourceforge, CloudForge, or BitBucket should be the very first thing you do whenever you start a new project, no matter how large or small it is.

If nothing else, it makes it easy to share your code with us on the forum, as you can just post a link, rather than pasting oodles and oodles of code into a post.

Once you have that out of the way (if you didn't already), you can start to consider the OS specific issues.

If you haven't already, I would strongly advise you to read the introductory material in the wiki. After this, go through the material on the practical aspects of running an OS-dev project. I strongly suggest that you read through these pages in detail, along with the appropriate ones to follow, before doing any actual development. These pages should ensure that you have at least the basic groundwork for learning about OS dev covered.

This brings you to your first big decision: which platform, or platforms, to target. Common options include:
  • x86 - the CPU architecture of the stock PC desktops and laptops, and the system which remains the 'default' for OS dev on this group. However, it is notoriously quirky, especially regarding Memory Segmentation, and the sharp divisions between 16-bit Real Mode, 16-bit and 32-bit Protected Modes, and 64-bit Long Mode. Despite this, it is generally the go-to architecture for high-performance systems, business machines, and gaming, largely thanks to the immense amount of effort and talent applied by Intel and AMD into continuously pushing it further and further. It also has a great advantage in being paired with the highly standardized PC system architecture, which is implemented by a large number of different motherboard designers (whereas different ARM, MIPS, and RISC-V single-board computers can have radically different boot-up, memory, bus, and peripheral systems).
  • ARM - a RISC architecture widely used on mobile devices and for 'Internet of Things' and 'Maker' equipment, including the popular Raspberry Pi, Beagleboard, and Rock64 single board computers. The overwhelming majority of smartphones and tablets, including those by Apple and Samsung, use ARM processors, though most of these are locked down in various ways which would make them difficult to impossible to target for a third-party OS dev. While it is generally seen as easier to work with than x86, most notably in the much less severe differences between the 32-bit and 64-bit modes and the lack of memory segmentation, the wiki and other resources don't cover it nearly as well (though this is changing over time as it becomes more commonly targeted). Because the architecture lends itself to low-power, low-heat applications, most of the implementations are less powerful than the top x86 implementations, with designs focusing on its use in areas where that advantage is greatest, though recently Amazon has developed a high-performance implementation which they claim is comparable to Intel's in terms of CPU speed.
  • MIPS - another RISC design which is slightly older than ARM. It is one of the first RISC designs to come out, being part of the reason the idea caught on, and is even simpler than ARM in terms of programming, though a bit tedious when it comes to assembly programming. While it was widely used in workstations and game consoles in the 1990s, it has declined significantly due to mismanagement by the owners of the design, and at present is mostly seen in devices such as routers. There are a handful of System on Chip single-board computers that use it, such as the Creator Board and the Onion Omega2, and manufacturers in both China and Russia have licensed the ISA with the idea of breaking their dependence on Intel. The ISA was made open-source in 2018, which may lead to more widespread use. Finding good information on the instruction set is easy, as it is widely used in courses on assembly language and computer architecture and there are several emulators that run MIPS code, but finding usable information on the actual hardware systems using it is often difficult at best.
  • RISC-V - an up-and-coming open source hardware ISA, closely related to MIPS in overall design, but so far Not Ready For Prime Time. It is developing rapidly, however, and a handful of maker-grade SBCs using it have come on the market in 2018 and 2019, most notably the HiFive1. Also, the architecture is being adopted for a number of device-internal uses by companies such as Western Digital and nVidia, though not in a way that would be exposed to 3rd-party programmers for the most part. While it is poised to have a significant role in the future and is worth considering for a forward-looking project, it is too early to say what that impact will be and the situation is rapidly evolving.

You then need to decide which Language to use for the kernel. For most OS-Developers this means knowing and using C; while other languages can be used, it is important to know how to read C code, even if you don't use C, as most OS examples are written in it. You will also need to know at least some assembly language for the target platform, as there are always parts of the kernel and the device drivers which cannot be done in high-level languages.

You further need to choose the compiler, assembler, linker, build tool, and support utilities to use - what is called the 'toolchain' for your OS. For most platforms, there aren't many to choose from, and the obvious choice would be GCC and the Binutils toolchain due to their ubiquity. However, on the Intel x86 platform, it isn't as simple, as there are several other toolchains which are in widespread use for it, the most notable being the Microsoft one - a very familiar one to Windows programmers, but one which presents problems in OSDev. The biggest issue with Visual Studio, and with proprietary toolchains in general, is that using it rules out the possibility of your OS being "self-hosting" - that is to say, being able to develop your OS in the OS itself, something most OSdevs do want to eventually be able to do. The fact that Porting GCC to your OS is feasible, whereas porting proprietary x86 toolchains isn't, is a big factor in the use of Binutils and GCC, as is their deep connection to Linux and other Unix derivatives.

Regardless of the high-level language you use for OS dev (if any), you will still need to use assembly language, which means choosing an assembler. If you are using Binutils and GCC, the obvious choice would be GAS, but for x86 especially, there are other assemblers which many OSdevs prefer, such as Netwide Assembler (NASM) and Flat Assembler (FASM).

The important thing here is that assembly language syntax varies more among the x86 assemblers than it does for most other platforms, with the biggest difference being that between the Intel syntax used in the majority of x86 assemblers, and the AT&T syntax used in GAS. You can see an overview of the differences on the somewhat misnamed wiki page Opcode syntax. While it is possible to coax GAS to use the Intel syntax using the .intel_syntax noprefix directive, the opposite is generally not true for Intel-based assemblers such as NASM, and even with that directive, GAS is still quite different from other x86 assemblers in other regards.

It is still important to understand that the various Intel syntax assemblers - NASM, FASM, and YASM among others - have differences in how they handle indexing, in the directives they use, and in their support for features such as macros and defining data structures. While most of these follow the general syntax of Microsoft Assembler (MASM), they all diverge from it in various ways.

Once you know which platform you are targeting, and the toolchain you want to use, you need to understand them. You should read up on the core technologies for the platform, which for most people here means the PC architecture.

This leads to the next big decision: which Bootloader to use. There are a number of different standard bootloaders for x86, with the most prominent being GRUB. We strongly recommend against Rolling Your Own Bootloader, but it is an option as well.

You need to consider what kind of File System to use; there are several common ones to pick from when starting out in OS dev. We generally don't recommend designing your own, but as with boot loaders, it is a possibility as well.

While this is a lot of reading, it simply reflects the due diligence that any OS-devver needs to go through in order to get anywhere. OS development, even as a simple project, is not amenable to the Stack Overflow cut-and-paste model of software development; you really need to understand a fair amount of the concepts and principles before writing any code, and the examples given in tutorials and forum posts generally are exactly that. Copying an existing code snippet without at least a basic idea of what it is doing simply won't do. While learning itself is an iterative process - you learn one thing, try it out, see what worked and what didn't, read some more, etc. - in this case a basic foundation is needed at the start. Without a solid understanding of at least some of the core ideas before starting, you simply can't get very far in OS dev.

Hopefully, this won't scare you off; it isn't nearly as bad as it sounds. It just takes a lot of patience and a bit of effort, a little at a time.

You might also want to peek at the various OS developer archetypes, such as Lino Commando, James T. Klik, and Alta Lang, both for amusement and to see how different people approach OS-Dev (and how other OS-Devs poke fun at them for it). Just steer clear of becoming a Dr. Duct von Tape, for your own sake and that of the others here, please.

Post by eekee »

Schol-R-LEA wrote: This leads to the next big decision: which Bootloader to use. There are a number of different standard bootloaders for x86, with the most prominent being GRUB. We strongly recommend against Rolling Your Own Bootloader, but it is an option as well.
I don't understand this. I've started a new thread for it: viewtopic.php?f=1&t=33769

I can understand the admonition against designing your own filesystem. I'm told it's perhaps the worst task in programming; the most likely to generate misery from the smallest mistakes.