finarfin wrote: So in the 64-bit world, the move to the higher half can be done after I set everything up in the 32-bit world?
Yes, almost. Again, you need a tiny bit of 64-bit code in your 32-bit trampoline, because you cannot jump beyond the 4GB barrier from Long Compatibility mode. So you have to switch to Long 64-bit mode first.
finarfin wrote: That means that my GDT should have 4 entries? 2 for the 32-bit data and code segments and 2 for the 64-bit versions?
It is an option if you already need a GDT with 32-bit segments around. Then you can just add the 64-bit segments. The main kernel will need to load the final version after enabling paging, anyway.
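A minimal sketch of such a mixed GDT, written as plain C so the bit packing can be checked in isolation. The helper names (`gdt_entry`, `build_gdt`) and the slot order are assumptions made for illustration, not the layout used in the example discussed here. The only bit that differs between the two code descriptors is the flags nibble: 0xC (G and D set) for 32-bit code, 0xA (G and L set) for 64-bit code. The L bit only has meaning for code segments, so in this sketch the two data descriptors end up with identical encodings.

```c
#include <stdint.h>

/* Pack base, limit, access byte and flags nibble into one 8-byte descriptor. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base bits 0..23   */
    d |= (uint64_t)access << 40;                   /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit bits 16..19 */
    d |= (uint64_t)(flags & 0xFu) << 52;           /* G, D/B, L, AVL    */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base bits 24..31  */
    return d;
}

enum {
    ACCESS_CODE = 0x9A,  /* present, ring 0, code, readable */
    ACCESS_DATA = 0x92,  /* present, ring 0, data, writable */
    FLAGS_32    = 0xC,   /* G=1, D=1, L=0 */
    FLAGS_64    = 0xA    /* G=1, D=0, L=1 */
};

/* Hypothetical slot order: null, code32, data32, code64, data64. */
static void build_gdt(uint64_t gdt[5])
{
    gdt[0] = 0;
    gdt[1] = gdt_entry(0, 0xFFFFF, ACCESS_CODE, FLAGS_32);
    gdt[2] = gdt_entry(0, 0xFFFFF, ACCESS_DATA, FLAGS_32);
    gdt[3] = gdt_entry(0, 0xFFFFF, ACCESS_CODE, FLAGS_64);
    gdt[4] = gdt_entry(0, 0xFFFFF, ACCESS_DATA, FLAGS_32); /* same bits as slot 2 */
}
```

With this slot order the selectors would be 0x08, 0x10, 0x18 and 0x20; a different order changes the selector used in the far jump.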
finarfin wrote: I see that in your example you enable paging after having loaded the GDT and not before. Is that strictly necessary, or can I keep enabling paging just before loading the GDT?
It does not really matter. You need to load the GDT before doing the far jump. Any time before that is fine. I included the enabling of paging to show that it is the mode switch: that is the point at which the CPU enters Long mode.
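To make the "paging is the mode switch" point concrete, here is a small model of the condition, written as ordinary C rather than real control-register access. The function name `long_mode_active` is made up for this sketch; the bit positions (CR0.PG = bit 31, CR4.PAE = bit 5, EFER.LME = bit 8) are the architectural ones. Note that the GDT does not appear in the condition at all, which is why it can be loaded at any time before the far jump.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG   (1u << 31)  /* paging enable */
#define CR4_PAE  (1u << 5)   /* physical address extension */
#define EFER_LME (1u << 8)   /* long mode enable */

/* Simplified model: long mode becomes active once paging is turned on
 * while LME and PAE are already set. Takes register values as plain
 * integers; nothing here touches real hardware. */
static bool long_mode_active(uint32_t cr0, uint32_t cr4, uint32_t efer)
{
    return (efer & EFER_LME) && (cr4 & CR4_PAE) && (cr0 & CR0_PG);
}
```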
finarfin wrote: When you do the jmp 8:next, are the instructions after the label already in the higher half?
No, this is all in the lower half. Which is why that sequence ends with "jmp higher_half".
finarfin wrote: But then if I'm going to load the kernel at 0xffff800000000000, this means that the address can't fit in a 32-bit register, so I suppose that the linker script should be different, correct? Because, from what I understood from @velko's comment, one of my problems is that I am using variables in asm as if they were still placed with the old addressing (not subtracting 0xC0000000), and in the 32-bit world I can't subtract a 64-bit address (I suppose).
I strongly suggest you reconsider that load address. Using it will require you to use the "large" code model, which will generate more complicated code. Moreover, you will have to write your assembly for that code model as well, which I would not recommend to a beginner. For instance, you cannot just call external functions. If you just used -2GB (or 0xFFFF'FFFF'8000'0000) as load address, you could use the "kernel" code model, which is much closer to the "small" code model you might already be used to. And the tooling will be better tested, since "kernel" is the code model used by tons of other kernels, including Linux.
Besides that rant, sure, you can get your assembler to do 64-bit subtraction in 32-bit mode, as long as the result fits.
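As a rough illustration of why the load address decides the code model: the "kernel" model (`-mcmodel=kernel` in GCC and Clang) assumes everything lives in the top 2 GiB of the address space, so that every symbol address fits a sign-extended 32-bit immediate. The helper below is a made-up name for this sketch and simply checks that range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest address of the top 2 GiB, i.e. -2GB as an unsigned 64-bit value. */
#define TOP_2GIB_BASE 0xFFFFFFFF80000000ULL

/* True if addr lies in the range the "kernel" code model expects. */
static bool in_top_2gib(uint64_t addr)
{
    return addr >= TOP_2GIB_BASE;
}
```

Under this check, 0xFFFFFFFF80000000 qualifies, while 0xFFFF800000000000 does not.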
finarfin wrote: I think this can be a good example: https://github.com/charpointer/celesteo ... asm/boot.s?
Acceptable, although it maps the kernel differently from you.
finarfin wrote: There is only one thing not totally clear to me. All the initialization is done in 32-bit, OK, got it, but when initializing the data structures I see subtractions like: p4_table - 0xFFFFFFFF80000000, which afaik is a 64-bit value. Is that possible??
The result of that calculation will be below 4G, and therefore fit. So no "relocation truncated to fit".
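The same arithmetic, sketched in C to show why the result fits. The assembler and linker do this at build time; the function below (`virt_to_phys32`, a name invented here) only mirrors the calculation, and the sample address in the usage note is an assumed value, not the real location of p4_table.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual base of the kernel, as in the expression quoted above. */
#define KERNEL_VMA 0xFFFFFFFF80000000ULL

/* Subtract the virtual base in 64-bit arithmetic, then check that the
 * difference fits in 32 bits. The assert plays the role of the linker's
 * "relocation truncated to fit" error. */
static uint32_t virt_to_phys32(uint64_t virt)
{
    uint64_t phys = virt - KERNEL_VMA;
    assert(phys <= UINT32_MAX);
    return (uint32_t)phys;
}
```

For a symbol linked at, say, 0xFFFFFFFF80100000, the result is 0x100000, which a 32-bit register holds without trouble.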
finarfin wrote: Btw, reading that example, it looks like I only need one code and one data segment in the GDT, is that correct?
Apparently. I used my own 32-bit segments, just in case, but maybe that was overly cautious.
finarfin wrote: Is there a big difference?
Not overly so. In linker scripts, 4K and 0x1000 are different ways to write the same thing. The symbols he defines are set to physical addresses rather than virtual ones, whatever that gets him, and the KEEP() only kneecaps any chance of "--gc-sections" doing its job. Not that there is a big reason to use that switch on a kernel, where you control every last section that enters it.
finarfin wrote: (And I see that in some cases labels like _sbss, section_data_end are added, and in other projects not. Is there a reason for them? Or maybe they just depend on the developer's choices?)
The choice of name rarely matters. And even having those labels is only needed if your kernel gets loaded by a boot loader that fails to zero out the BSS section, which a competent one would not do. In that case you have to do it yourself, and that is very easy with those symbols.
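A minimal sketch of that do-it-yourself BSS clearing. In a real kernel the two bounds would come from linker-script symbols, for example `extern uint8_t _sbss[], _ebss[];` (`_sbss` appears in the question; `_ebss` is an assumed counterpart). Here the function takes the bounds as parameters so it can be tried on any buffer.

```c
#include <stdint.h>

/* Zero every byte in [start, end). In a kernel this would be called as
 * zero_range(_sbss, _ebss) before any code relies on zero-initialized
 * globals; those symbol names are assumptions, see the note above. */
static void zero_range(uint8_t *start, uint8_t *end)
{
    for (uint8_t *p = start; p < end; ++p)
        *p = 0;
}
```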
Some people like to use "_end" as the symbol at the end of their kernel, and use it to initialize their allocators. Which I personally dislike, because you don't know how much space after _end is unused before you hit something worth preserving. Or just the end of memory. Never mind MMIO, there's a chance of a memory hole somewhere in there. Beyond that, the world is your oyster. You only define the symbols you need.
That said, these linearly mapped kernels can just use linker symbols as stand-ins for physical addresses. I cannot do that in my kernel, since it will be loaded at an unpredictable address. Or maybe not even at just one address; it may be in pieces distributed all over memory. Virtual memory makes it all seem contiguous in VM space, and the only reason I have to care about the physical addresses is that I must not overwrite them when allocating physical memory.
finarfin wrote: Ah, one last thing: what is the purpose of the newly allocated stack, why is it needed? And what about the size, is there any convention?
The stack must be reloaded after enabling paging because the interpretation of the stack pointer has changed, and you don't want to have the stack remain in the lower half. For stack size, the only advice is "enough". Most start with 8K and see how far they get. I'm using 12K and a 4K guard page (allocated at run time from the stage 1 kernel), so the whole package has a nice round 16K size (4 pages).
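The 12K-plus-guard arrangement can be sketched as simple address arithmetic. The struct and function names are invented for this example, and it assumes the guard page sits at the low end of the region, which is where an overflow of a downward-growing x86 stack would land.

```c
#include <stdint.h>

enum { PAGE_SIZE = 4096, GUARD_PAGES = 1, STACK_PAGES = 3 };

struct stack_layout {
    uint64_t guard_base;    /* unmapped page that catches overflows */
    uint64_t stack_bottom;  /* lowest usable stack address */
    uint64_t stack_top;     /* initial stack pointer value */
};

/* Lay out one guard page followed by three stack pages, starting at a
 * page-aligned region_base (the caller is assumed to provide alignment). */
static struct stack_layout layout_stack(uint64_t region_base)
{
    struct stack_layout l;
    l.guard_base   = region_base;
    l.stack_bottom = region_base + (uint64_t)GUARD_PAGES * PAGE_SIZE;
    l.stack_top    = l.stack_bottom + (uint64_t)STACK_PAGES * PAGE_SIZE;
    return l;
}
```

The whole region spans 16K, with 12K of it usable as stack.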