Fixed: Higher Half in C?

Octacone · Post by **Octacone** » Mon Jul 09, 2018 8:46 am

Octocontrabass wrote: The stack still contains many addresses that still refer to the original identity-mapped portion of memory, including return addresses, so your EIP does not stay in the higher half for long.

If you want to run C(++) code before you've switched to the higher half mapping, you have maybe two choices.

The sensible choice is to have a separate "startup" section linked to execute at its load address, instead of in the higher half. This sort of design is easy to port to other architectures (e.g. 64-bit) later on, but the "startup" section and the higher-half section can't reference each other's symbols. It works something like this:
1. Your bootloader loads the kernel to 1MB.
2. Your kernel's assembly entry point gets executed.
3. It does some stuff and then jumps to C++ code in the startup section.
4. The C++ code in the startup section creates the initial page tables.
5. The C++ code returns to the assembly entry point code.
6. The assembly entry point code enables paging and jumps to the main higher-half C++ code.
7. The identity mapping gets removed.

The ridiculous choice, suggested in that screenshot you posted, is to use segmentation to wrap addresses around the 4GB mark so that virtual address 0xC0000000 maps to physical address 0x00100000 in your segments. This sort of design is specific to 32-bit x86, and it's strange enough that some CPUs might misbehave. It works something like this:
1. Your bootloader loads the kernel to 1MB.
2. Your kernel's assembly entry point gets executed.
3. It prepares a GDT with ridiculous descriptors, and loads the data and stack segments.
4. It performs a far jump to your C++ code, effectively using segmentation to put you in the higher half. (Virtual addresses are in the higher half, but linear addresses are not.)
5. Paging is initialized and enabled.
6. Some external assembly function is used to reload the segments, including CS, with flat segments. (Both virtual and linear addresses are now in the higher half.)
7. The identity mapping gets removed.
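
The wrap-around arithmetic behind the segmentation approach can be sketched in C. This is only a model of the 32-bit addition (segment base plus offset); the function names and the two constants are illustrative assumptions, not something taken from the posts.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants (assumptions): link address and load address. */
#define KERNEL_VIRT_BASE 0xC0000000u
#define KERNEL_LOAD_ADDR 0x00100000u

/* Linear address produced by segmentation: base + offset, truncated to
 * 32 bits, which is how the addition wraps around the 4 GB mark. */
static uint32_t segment_translate(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* Segment base that makes `virt` translate to `linear`.
 * Unsigned 32-bit subtraction wraps modulo 2^32 as well. */
static uint32_t segment_base_for(uint32_t virt, uint32_t linear)
{
    uint32_t base = linear - virt;
    assert(segment_translate(base, virt) == linear); /* sanity check */
    return base;
}
```

With these constants, segment_base_for(KERNEL_VIRT_BASE, KERNEL_LOAD_ADDR) gives 0x40100000, the kind of "ridiculous" base the descriptors in step 3 would carry.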

simeonz wrote:Linux does something similar to what you describe. There are some details that need to be handled differently:

1. My bootloader loads my kernel to 1 MB.
2. My kernel gets executed.
3. It does some stuff and then calls the lower-half C++ code.
4. Other systems get initialized.
5. Paging gets enabled, the kernel gets simultaneously mapped to both 1 MB and 3 GB.
6. You return to the assembly code.
7. You jump into the higher-half C++ code.
8. Identity mapping gets removed.

You need to compile the "higher-half" and "lower-half" parts separately, into different sets of object files, and place their sections in different output ranges during your final link. They cannot call each other. If you do have common library code, you must either compile it twice, producing non-conflicting symbol names and using weakref aliases in your header files, or build an intermediate relocatable file for each set of object files, linking each against a static library and using symbol hiding (probably the more elegant option).
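
The "mapped to both 1 MB and 3 GB" step can be sketched as follows. This is a minimal illustration under stated assumptions: it uses classic 32-bit paging with 4 KiB pages rather than the PAE tables used later in the thread, it maps the first 4 MiB with a constant 0xC0000000 offset, and every name in it is hypothetical. Physical addresses are passed in as numbers so that the sketch also runs in a hosted test.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    0x1000u
#define ENTRIES      1024u
#define PTE_PRESENT  0x1u
#define PTE_WRITABLE 0x2u
#define HIGHER_BASE  0xC0000000u

/* Fill one page table covering physical 0..4 MiB and hook it into the
 * page directory twice: at slot 0 (identity) and at slot 768 (3 GB). */
static void map_low_and_high(uint32_t pd[ENTRIES], uint32_t pt[ENTRIES],
                             uint32_t pt_phys)
{
    assert((pt_phys & (PAGE_SIZE - 1)) == 0); /* tables are page aligned */
    for (uint32_t i = 0; i < ENTRIES; i++) {
        pd[i] = 0;
        pt[i] = (i * PAGE_SIZE) | PTE_PRESENT | PTE_WRITABLE;
    }
    pd[0] = pt_phys | PTE_PRESENT | PTE_WRITABLE;
    pd[HIGHER_BASE >> 22] = pt_phys | PTE_PRESENT | PTE_WRITABLE;
}

/* Drop the identity mapping; a real kernel must also flush the TLB. */
static void unmap_identity(uint32_t pd[ENTRIES])
{
    pd[0] = 0;
}

/* Software page walk over the single table, used only to check the
 * mapping. Returns 0 when the address is not mapped. */
static uint32_t translate(const uint32_t pd[ENTRIES],
                          const uint32_t pt[ENTRIES], uint32_t virt)
{
    if (!(pd[virt >> 22] & PTE_PRESENT))
        return 0;
    uint32_t pte = pt[(virt >> 12) & 0x3FFu];
    if (!(pte & PTE_PRESENT))
        return 0;
    return (pte & ~(PAGE_SIZE - 1)) | (virt & (PAGE_SIZE - 1));
}
```

Because both directory slots point at the same table, 0x00100000 and 0xC0100000 resolve to the same physical page until the identity slot is cleared.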

Octacone wrote:Could I fix anything by making my code position independent?
If you want to share the code between the lower-half and upper-half mappings without separating the object files, you do indeed have to make it position-independent. You have to use the -fpie option when compiling and "-shared -Bstatic -Bsymbolic -pie" when linking (maybe check this post). Then you need to offset the GOT entries before jumping to the higher-half mapping.

Linux does this, but in a rather convoluted way. There is stub code that decompresses the actual kernel image. The stub is itself relocatable, and the decompressed kernel image is relocated separately. Still, I am going to illustrate how it is done for the stub code, because it is simpler. In the stub's linker script, two markers, _got and _egot, are placed around the .got.plt and .got sections (see here). After the stub is moved to its final location (which in your case corresponds to the point where the upper-half mapping is already established, but before jumping into it), the stub iterates over the range between _got and _egot and fixes the entries by adding a rebasing offset (see here).

This cannot be considered proper ELF relocation handling, but it works if your output contains only R_386_RELATIVE relocations targeting the GOT. Assuming that you are not importing symbols (which is why you use -pie and not -pic), there should be either no relocations or only relocations of this type. In particular, there shouldn't be any R_386_GLOB_DAT or R_386_JMP_SLOT relocations. You will want to assert that in your build script.
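
The GOT fix-up described above can be sketched in C. This is a simplified illustration, not the Linux stub's actual code: the function name is hypothetical, the two pointers stand in for the _got and _egot markers, and it is only valid under the stated condition that every slot is the target of an R_386_RELATIVE relocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add `delta` (new base minus link-time base) to every 32-bit slot in
 * the half-open range [got, egot). Returns the number of slots fixed. */
static size_t rebase_got(uint32_t *got, uint32_t *egot, uint32_t delta)
{
    assert(got <= egot);
    size_t fixed = 0;
    for (uint32_t *slot = got; slot < egot; slot++) {
        *slot += delta; /* wraps modulo 2^32, like the address space */
        fixed++;
    }
    return fixed;
}
```

For a kernel linked at 1 MB and rebased to 3 GB, delta would be 0xC0000000 - 0x00100000 = 0xBFF00000.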

That is brilliant thinking. I completely forgot about the values on the stack, not just its location.
I am really glad that it can be done.
So here is a question for both of you: how do I actually do something like this? I mean the double linking: how can my code be both "lower" and "higher" at the same time?
As far as I can tell, you two have different approaches:
1. One set of object files, no shared code, different code sections (how?).
2. Two sets of object files, no shared code (without using the GOT and complex linking), somehow linked together?
The first idea seems more appealing to me. What I am interested in is the technical aspect of this.
How can an executable file (ELF in this case, but it doesn't matter) have two different internal locations (1 MB offsets and 3 GB offsets) at the same time?
Is my way of thinking right?
Kernel.asm:

Code:

section .lower_half_text
Pre_Kernel_Main_Low:
...Zero BSS...
...Set up stack pointer... (to offset it by 3 GB or not?)
...Call paging initialization code, ID map the kernel + 3 GB map it...
...return to here...
...call Higher_Half_Main...

section .higher_half_text
Pre_Kernel_Main_High:
...push bootinfo pointer... (my custom multiboot header equivalent, physical address, do I need to offset this by 3 GB?)
...call Kernel_Main... (and remove ID mapping in it)

Linker.ld:

Code:

ENTRY(Pre_Kernel_Main_Low)
OUTPUT_FORMAT("elf32-i386")
SECTIONS 
{
    . = 0x100000;
    .lower_half_text :
    {
        *(.lower_half_text)
    }
    .higher_half_text : AT(ADDR(.higher_half_text) - 0xC0000000)
    {
       *(.higher_half_text)
    }
    /* Do I need to touch other sections? */
    .rodata :
    {
        start_constructors = .;
        *(SORT(.ctors*))
        end_constructors = .;
        *(.rodata)
    }
    .data :
    {
        *(.data)
    }
    .bss :
    {
        bss_start = .;
        *(.bss)
        *(COMMON)
        bss_end = .;
    }
    kernel_end = .;
}

simeonz · Post by **simeonz** » Mon Jul 09, 2018 3:14 pm

Regarding the linker script, assuming two object files, lower.o and higher.o, I would try something like this:

Code:

SECTIONS
{
    . = 0x100000;
    .l_text :
    {
        lower.o(.text)
    }
    .l_rodata :
    {
        lower.o(.rodata)
    }
    .l_data :
    {
        lower.o(.data)
    }
    .l_bss :
    {
        l_bss_start = .;
        lower.o(.bss)
        lower.o(COMMON)
        l_bss_end = .;
    }
    higher_pstart = ABSOLUTE(ALIGN(CONSTANT (MAXPAGESIZE)));
    higher_vstart = 0xC0000000;
    . = higher_vstart;
    .h_text : AT(higher_pstart) ALIGN(CONSTANT (MAXPAGESIZE))
    {
       higher.o(.text)
    }
    .h_rodata :
    {
        start_constructors = .;
        higher.o(SORT(.ctors*))
        end_constructors = .;
        higher.o(.rodata)
    }
    .h_data : ALIGN(CONSTANT (MAXPAGESIZE))
    {
        higher.o(.data)
    }
    .h_bss :
    {
        h_bss_start = .;
        higher.o(.bss)
        higher.o(COMMON)
        h_bss_end = .;
    }
    higher_vend = ABSOLUTE(.);
    /DISCARD/ : { *(*) }
}

..or like this:

Code:

MEMORY
{
  lower (rwx)  : ORIGIN = 0x100000,   LENGTH = 0x100000
  higher (rwx) : ORIGIN = 0xC0000000, LENGTH = 0x1000000
}

SECTIONS
{
    .l_text :
    {
        lower.o(.text)
    } > lower
    .l_rodata :
    {
        lower.o(.rodata)
    } > lower
    .l_data :
    {
        lower.o(.data)
    } > lower
    .l_bss :
    {
        l_bss_start = .;
        lower.o(.bss)
        lower.o(COMMON)
        l_bss_end = .;
    } > lower
    higher_pstart = ABSOLUTE(ALIGN(CONSTANT (MAXPAGESIZE)));
    higher_vstart = ABSOLUTE(ORIGIN(higher));
    .h_text : AT(higher_pstart) ALIGN(CONSTANT (MAXPAGESIZE))
    {
       higher.o(.text)
    } > higher
    .h_rodata :
    {
        start_constructors = .;
        higher.o(SORT(.ctors*))
        end_constructors = .;
        higher.o(.rodata)
    } > higher
    .h_data : ALIGN(CONSTANT (MAXPAGESIZE))
    {
        higher.o(.data)
    } > higher
    .h_bss :
    {
        h_bss_start = .;
        higher.o(.bss)
        higher.o(COMMON)
        h_bss_end = .;
    } > higher
    higher_vend = ABSOLUTE(.);
}

If you don't have two object files, but one object file with separate sections, or two groups of object files, the difference will be in the section wildcards.

Regarding the Kernel.asm action plan, I would only replace "call Higher_Half_Main" with "jmp Higher_Half_Main". Not that it would matter much, since you will probably change the stack location soon after. After switching to the higher half, the physical memory between 0x100000 and the higher_pstart marker (if you use my script) can be considered unallocated (i.e. free).
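
To make the "two locations at once" idea concrete: each higher-half section has a virtual address (VMA) it is linked to run at and a load address (LMA) where its bytes are placed. The sketch below converts between the two, assuming the whole higher half keeps one constant offset. The struct and function names are hypothetical; the three fields stand in for the higher_vstart, higher_pstart and higher_vend symbols from the scripts above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the linker-script symbols; a kernel would
 * take the addresses of extern symbols instead of filling a struct. */
typedef struct {
    uint32_t vstart; /* higher_vstart: first higher-half virtual address */
    uint32_t pstart; /* higher_pstart: where that byte is loaded */
    uint32_t vend;   /* higher_vend: one past the last virtual address */
} higher_layout_t;

/* Load (physical) address of a higher-half virtual address. */
static uint32_t higher_virt_to_phys(const higher_layout_t *l, uint32_t virt)
{
    assert(virt >= l->vstart && virt < l->vend);
    return virt - l->vstart + l->pstart;
}

/* Higher-half virtual address of a physical address inside the image. */
static uint32_t higher_phys_to_virt(const higher_layout_t *l, uint32_t phys)
{
    assert(phys >= l->pstart && phys - l->pstart < l->vend - l->vstart);
    return phys - l->pstart + l->vstart;
}
```

The paging setup in the lower half would use the first function to decide which physical pages to map at 0xC0000000 and above.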

Edit: I used "/DISCARD/ : { *(*) }" in the first script. That is not the most elegant solution, as you may imagine. It is more proper to list specific sections in the "/DISCARD/" block and use "--orphan-handling=warn" or "--orphan-handling=error". Here are some sections that you may encounter. .eh_frame_hdr and .eh_frame store frame record information for the unwinding engine, but unless you are using C++ or unwinding through CFI records rather than the frame pointer (see -fno-omit-frame-pointer), you won't need those and can discard them. .gcc_except_table is for C++ catch statements and destructor code at automatic scope, so it is exception related. .tdata and .tbss are for thread-local storage. .preinit_array, .init_array, and .fini_array are initialization hooks, which serve as replacements for .init, .fini, .ctors, and .dtors. .got and .plt are data and code proxies for PIC builds. The .rel.* sections are immaterial at this stage, unless you are building a shared object (i.e., an ET_DYN executable) or importing dynamic symbols.

Octacone · Post by **Octacone** » Tue Jul 10, 2018 8:36 am

simeonz

This is getting really complicated...
My linker script:

Code:

ENTRY(Pre_Kernel_Main_Lower_Half)
OUTPUT_FORMAT("elf32-i386")
SECTIONS 
{
    . = 0x100000;
    .text_lower_half :
    {
        Objects/Kernel_Assembly.o(.text_lower_half)
        Objects/Higher_Half.o(.text)
    }
    .rodata_lower_half :
    {
        Objects/Higher_Half.o(.rodata)
    }
    .data_lower_half :
    {
        Objects/Higher_Half.o(.data)
    }
    .bss_lower_half :
    {
        bss_start_lower_half = .;
        *(.bss)
        *(COMMON)
        bss_end_lower_half = .;
    }
    higher_physical_start = ABSOLUTE(.);
    . = 0xC0000000;
    .text_higher_half : AT(higher_physical_start)
    {
        Objects/Kernel_Assembly.o(.text_higher_half)
        *(.text)
    }
    .rodata_higher_half :
    {
        start_constructors = .;
        *(SORT(.ctors*))
        end_constructors = .;
        *(.rodata)
    }
    .data_higher_half :
    {
        *(.data)
    }
    .bss_higher_half :
    {
        bss_start_higher_half = .;
        *(.bss)
        *(COMMON)
        bss_end_higher_half = .;
    }
    kernel_end = ABSOLUTE(.);
}

Kernel.asm

Code:

bits 32

section .text_lower_half
extern bss_start_lower_half
extern bss_end_lower_half
global Pre_Kernel_Main_Lower_Half
global Load_Page_Directory_Pointer_Table_Lower_Half
extern Initialize_Higher_Half
Pre_Kernel_Main_Lower_Half:
.Zero_BSS_Lower_Half:
    mov eax, bss_start_lower_half
    mov ecx, bss_end_lower_half
    sub ecx, eax
.Loop:
    cmp ecx, 0
    je .Done
    dec ecx                     ; step to the last byte not yet cleared
    mov byte [eax + ecx], 0     ; clears bytes size-1 down to 0, never one past the end
    jmp .Loop
.Done:
    mov esp, stack_top
    add dword [bootinfo_pointer], ebx
    call Initialize_Higher_Half
Load_Page_Directory_Pointer_Table_Lower_Half:
    mov eax, [esp + 4]
    mov cr3, eax
    mov eax, cr4
    or eax, 00000000000000000000000000100000b
    mov cr4, eax
    mov eax, cr0
    or eax, 10000000000000000000000000000000b
    mov cr0, eax
    jmp Pre_Kernel_Main_Higher_Half

section .text_higher_half
extern bss_start_higher_half
extern bss_end_higher_half
extern Kernel_Main
Pre_Kernel_Main_Higher_Half:
.Zero_BSS_Higher_Half:
    mov eax, bss_start_higher_half
    mov ecx, bss_end_higher_half
    sub ecx, eax
.Loop:
    cmp ecx, 0
    je .Done
    dec ecx                     ; step to the last byte not yet cleared
    mov byte [eax + ecx], 0     ; clears bytes size-1 down to 0, never one past the end
    jmp .Loop
.Done:
    mov esp, stack_top
    add esp, 0xC0000000
    push esp
    mov ebx, dword [bootinfo_pointer]
    push ebx
    call Kernel_Main
    jmp $
    
section .bss
stack_bottom:
    resb 4096
stack_top:

section .data
bootinfo_pointer: dd 0xC0000000 ;My multiboot header equivalent

It just triple faults. Bochs says nothing, it just locks up at inc dword ptr ds:[eax]. I can see my page tables just fine. (Note: it is not a page fault, it is something else.)
EIP points to a higher-half address. ESP doesn't.
I bet my linker script is screwed.

Note: I don't want two sets of object files.
This is how I compile it all together:

Code:

Linker = .../Opt/Cross/bin/i686-elf-g++
Linker_Flags = -T Sources/Linker.ld
Objects = Objects/Kernel_Assembly.o Objects/Kernel_C_Plus_Plus.o Objects/Higher_Half.o...
Final_Object = Objects/Kernel.elf
$(Linker) $(Linker_Flags) -o $(Final_Object) $(Objects) -ffreestanding -O2 -nostdlib -lgcc

simeonz · Post by **simeonz** » Tue Jul 10, 2018 12:04 pm

Octacone wrote: It just triple faults. Bochs says nothing, it just locks up at inc dword ptr ds:[eax]. I can see my page tables just fine. (Note: it is not a page fault, it is something else.)
EIP points to a higher-half address. ESP doesn't.
I bet my linker script is screwed.

The first thing that strikes me is that you have Higher_Half.o in the lower half.

I will assume it is the initialization code (Initialize_Higher_Half). Also, you don't align higher_physical_start on a page boundary, which may be causing problems. You use wildcards for .bss and COMMON in your bss_lower_half block, so all bss and common data get included there. Another problem is that you don't have the data from Kernel_Assembly.o in data_lower_half, so it will be included in data_higher_half. Anyway, I think you should probably have something like this:

Code:

ENTRY(Pre_Kernel_Main_Lower_Half)
OUTPUT_FORMAT("elf32-i386")
SECTIONS
{
    . = 0x100000;
    .text_lower_half :
    {
        Objects/Kernel_Assembly.o(.text_lower_half)
        Objects/Higher_Half.o(.text)
    }
    .rodata_lower_half :
    {
        Objects/Higher_Half.o(.rodata)
    }
    .data_lower_half :
    {
        Objects/Kernel_Assembly.o(.data)
        Objects/Higher_Half.o(.data)
    }
    .bss_lower_half :
    {
        bss_start_lower_half = .;
        Objects/Kernel_Assembly.o(.bss)
        Objects/Higher_Half.o(.bss)
        Objects/Higher_Half.o(COMMON)
        bss_end_lower_half = .;
    }
    higher_physical_start = ABSOLUTE(ALIGN(0x1000));
    . = 0xC0000000;
    .text_higher_half : AT(higher_physical_start)
    {
        Objects/Kernel_Assembly.o(.text_higher_half)
        *(.text)
    }
    .rodata_higher_half :
    {
        start_constructors = .;
        *(SORT(.ctors*))
        end_constructors = .;
        *(.rodata)
    }
    .data_higher_half :
    {
        *(.data)
    }
    .bss_higher_half :
    {
        bss_start_higher_half = .;
        *(.bss)
        *(COMMON)
        bss_end_higher_half = .;
    }
    kernel_end = ABSOLUTE(.);
}
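
The reason for ALIGN(0x1000) on higher_physical_start is that paging maps whole 4 KiB pages, so the page-aligned virtual address 0xC0000000 can only be mapped onto a page-aligned physical address. A minimal sketch of the same rounding in C, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Round `addr` up to the next multiple of `align` (a power of two),
 * which is what ALIGN(0x1000) does to the location counter. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* True when `addr` can be the start of a page mapping. */
static int is_page_aligned(uint32_t addr)
{
    return (addr & (PAGE_SIZE - 1)) == 0;
}
```

For example, a lower half that ends at 0x001014A5 would push the higher half's load address up to 0x00102000.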

Octacone wrote:Note: I don't want two sets of object files.

No, don't worry. What you have is perfect.

You do have two sets of object files, except for the launch code. It's ideal.

Octacone · Post by **Octacone** » Wed Jul 11, 2018 5:27 am

simeonz wrote:Observation

Yeah I know.

My naming conventions are very interesting. Your assumption was right. It is easy to get confused, which I do all the time.
I fixed all the mistakes and my kernel grew to 8 MB and cannot be loaded anymore.
It takes like 10 minutes for Bochs to load it and Qemu well doesn't.
Once it is finally loaded, lower half works just fine but then when a higher half jump happens everything locks up and EIP equals 0xC0000001 which is weird sort of.
My OS is a true definition of undefined behavior.

simeonz · Post by **simeonz** » Wed Jul 11, 2018 6:31 am

Octacone wrote:
simeonz wrote:Observation
Yeah I know. My naming conventions are very interesting. Your assumption was right. It is easy to get confused, which I do all the time.
I fixed all the mistakes and my kernel grew to 8 MB and cannot be loaded anymore.
It takes like 10 minutes for Bochs to load it and Qemu well doesn't.
Once it is finally loaded, lower half works just fine but then when a higher half jump happens everything locks up and EIP equals 0xC0000001 which is weird sort of.
My OS is a true definition of undefined behavior.

Could you provide the output of readelf -l, readelf -S, and I would be interested in the value of higher_physical_start, which you could extract from readelf -s. The location of Pre_Kernel_Main_Higher_Half is also interesting, although I am not sure that it will show unless you make it global in the assembly. Or you could try to load the elf with gdb and take the address of Pre_Kernel_Main_Higher_Half.

The point is: let's first check if everything went where it is supposed to go. You could also try to grep objdump -d, to verify that the assembly file is positioned at the appropriate addresses. Lastly, you could check your higher half mapping by connecting gdb to qemu.

Octacone · Post by **Octacone** » Wed Jul 11, 2018 7:45 am

simeonz wrote:Could you provide the output of readelf -l, readelf -S, and I would be interested in the value of higher_physical_start, which you could extract from readelf -s. The location of Pre_Kernel_Main_Higher_Half is also interesting, although I am not sure that it will show unless you make it global in the assembly. Or you could try to load the elf with gdb and take the address of Pre_Kernel_Main_Higher_Half.

The point is: let's first check if everything went where it is supposed to go. You could also try to grep objdump -d, to verify that the assembly file is positioned at the appropriate addresses. Lastly, you could check your higher half mapping by connecting gdb to qemu.

readelf -l: Looks good

Code:

Elf file type is EXEC (Executable file)
Entry point 0x100000
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00100000 0x00100000 0x004a5 0x806000 RWE 0x1000
  LOAD           0x002000 0xc0000000 0x00906000 0x80fcf8 0x80fcf8 RWE 0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text_lower_half .data_lower_half .bss_lower_half 
   01     .text_higher_half .text.startup .rodata.str1.1 .eh_frame .rodata.cst4 .rodata.str1.4 .rodata_higher_half .data_higher_half .bss_higher_half

readelf -s: blank, no data shown

readelf -S: Looks good

Code:

There are 14 section headers, starting at offset 0x811dc4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text_lower_half  PROGBITS        00100000 001000 0004a1 00  AX  0   0 16
  [ 2] .data_lower_half  PROGBITS        001004a1 0014a1 000004 00  WA  0   0  1
  [ 3] .bss_lower_half   NOBITS          00101000 0014a5 805000 00  WA  0   0 4096
  [ 4] .text_higher_half PROGBITS        c0000000 002000 004899 00  AX  0   0 16
  [ 5] .text.startup     PROGBITS        c00048a0 0068a0 000011 00  AX  0   0 16
  [ 6] .rodata.str1.1    PROGBITS        c00048b1 0068b1 0003f5 01 AMS  0   0  1
  [ 7] .eh_frame         PROGBITS        c0004ca8 006ca8 002380 00   A  0   0  4
  [ 8] .rodata.cst4      PROGBITS        c0007028 009028 000008 04  AM  0   0  4
  [ 9] .rodata.str1.4    PROGBITS        c0007030 009030 000068 01 AMS  0   0  4
  [10] .rodata_higher_ha PROGBITS        c0007098 009098 0000bc 00  WA  0   0  4
  [11] .data_higher_half PROGBITS        c0007160 009160 000080 00  WA  0   0 32
  [12] .bss_higher_half  PROGBITS        c0008000 00a000 807cf8 00  WA  0   0 4096
  [13] .shstrtab         STRTAB          00000000 811cf8 0000c9 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  p (processor specific)

grep objdump -d: Looks good, all addresses are OK.

Hold your thoughts, I think I might be onto something thanks to you and readelf.

simeonz · Post by **simeonz** » Wed Jul 11, 2018 9:18 am

It does look good indeed. There are some concerns about the higher kernel's bss section being stored in the file (PROGBITS vs NOBITS), whereas it should have been just logically allocated in the memory layout. Also, I think that some large allocation, probably an array of about 8 MB, is duplicated in the lower and higher space. It may be because it is tentatively defined and becomes a common symbol. That is, if the linker is incapable (for some unknown reason) of tracking the inclusion of the COMMON pseudo-sections in the way it does for the ordinary sections, the COMMON symbols from Objects/Higher_Half.o may get included in two places.

Pertinent information would be readelf -S and "size --common" on the input files. Also, it may be worth experimenting with the script by making Objects/Kernel_C_Plus_Plus.o explicit in the second portion (where the rules right now use wildcard for the filename).
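
The numbers in the readelf -l output can be cross-checked with a little arithmetic. The sketch below copies the two LOAD rows from that output; the struct and function names are hypothetical helpers for the check, not part of any tool.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t offset, vaddr, paddr, filesz, memsz;
} load_segment_t;

/* Values copied from the readelf -l output quoted in the thread. */
static const load_segment_t lower_seg =
    { 0x001000u, 0x00100000u, 0x00100000u, 0x0004a5u, 0x806000u };
static const load_segment_t higher_seg =
    { 0x002000u, 0xc0000000u, 0x00906000u, 0x80fcf8u, 0x80fcf8u };

/* First physical address past the segment once it is loaded. */
static uint32_t phys_end(const load_segment_t *s)
{
    return s->paddr + s->memsz;
}

/* Bytes that are not stored in the file and must be zeroed by the
 * loader (the NOBITS part of the segment). */
static uint32_t zero_fill(const load_segment_t *s)
{
    assert(s->filesz <= s->memsz);
    return s->memsz - s->filesz;
}
```

The lower segment ends at 0x00906000, exactly where the higher segment is loaded, so the layout is consistent. The higher segment has zero_fill equal to 0, which is another way of seeing that its 0x807cf8-byte bss is stored in the file as zeros.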

eryjus · Post by **eryjus** » Wed Jul 11, 2018 11:58 am

Octacone wrote:

Code:

...
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00100000 0x00100000 0x004a5 0x806000 RWE 0x1000
  LOAD           0x002000 0xc0000000 0x00906000 0x80fcf8 0x80fcf8 RWE 0x1000

If you don't identify it, please post your linker script. The second Program Header file size is huge. Something in the script is trying to pad the sections in order to (somewhat) align them in physical memory. The results look very familiar and I think I have suffered the same problems.

Octacone · Post by **Octacone** » Wed Jul 11, 2018 12:51 pm

Here it is:

Code:

ENTRY(Pre_Kernel_Main_Lower_Half)
OUTPUT_FORMAT("elf32-i386")
SECTIONS
{
    . = 0x100000;
    .text_lower_half :
    {
        Objects/Kernel_Assembly.o(.text_lower_half)
        Objects/Higher_Half.o(.text)
    }
    .rodata_lower_half :
    {
        Objects/Higher_Half.o(.rodata)
    }
    .data_lower_half :
    {
        Objects/Kernel_Assembly.o(.data_lower_half)
        Objects/Higher_Half.o(.data)
    }
    .bss_lower_half :
    {
        bss_start_lower_half = .;
        Objects/Higher_Half.o(.bss)
        Objects/Higher_Half.o(COMMON)
        bss_end_lower_half = .;
    }
    higher_physical_start = ABSOLUTE(ALIGN(0x1000));
    . = 0xC0000000;
    .text_higher_half : AT(higher_physical_start)
    {
        Objects/Kernel_Assembly.o(.text_higher_half)
        *(.text)
    }
    .rodata_higher_half :
    {
        start_constructors = .;
        *(SORT(.ctors*))
        end_constructors = .;
        *(.rodata)
    }
    .data_higher_half :
    {
        *(.data)
    }
    .bss_higher_half :
    {
        bss_start_higher_half = .;
        Objects/Kernel_Assembly.o(.bss_higher_half)
        *(.bss)
        *(COMMON)
        bss_end_higher_half = .;
    }
    kernel_end = ABSOLUTE(.);
}

Btw 8 MB sounds reasonable because I use PAE which means a lot of space wasted on page tables and what not + a duplicate of that for the lower half.

simeonz · Post by **simeonz** » Thu Jul 12, 2018 4:15 am

Octacone wrote:Btw 8 MB sounds reasonable because I use PAE which means a lot of space wasted on page tables and what not + a duplicate of that for the lower half.

Is it one array or two arrays (8mb x 2)? I am concerned whether the common symbols are included once or twice. Also, try to align the location pointer at the end of the file and see if that works to remove the excessive file size. I will aggregate several change propositions into one:

Code:

ENTRY(Pre_Kernel_Main_Lower_Half)
OUTPUT_FORMAT("elf32-i386")
SECTIONS
{
    . = 0x100000;
    .text_lower_half :
    {
        Objects/Kernel_Assembly.o(.text_lower_half)
        Objects/Higher_Half.o(.text)
    }
    .rodata_lower_half :
    {
        Objects/Higher_Half.o(.rodata)
    }
    .data_lower_half :
    {
        Objects/Kernel_Assembly.o(.data_lower_half)
        Objects/Higher_Half.o(.data)
    }
    .bss_lower_half :
    {
        bss_start_lower_half = .;
        Objects/Higher_Half.o(.bss)
        Objects/Higher_Half.o(COMMON)
        bss_end_lower_half = .;
    }
    higher_physical_start = ABSOLUTE(ALIGN(0x1000));
    . = 0xC0000000;
    .text_higher_half : AT(higher_physical_start)
    {
        Objects/Kernel_Assembly.o(.text_higher_half)
        Objects/Kernel_C_Plus_Plus.o(.text)
    }
    .rodata_higher_half :
    {
        start_constructors = .;
        Objects/Kernel_C_Plus_Plus.o(SORT(.ctors*))
        end_constructors = .;
        Objects/Kernel_C_Plus_Plus.o(.rodata)
    }
    .data_higher_half :
    {
        Objects/Kernel_C_Plus_Plus.o(.data)
    }
    .bss_higher_half :
    {
        bss_start_higher_half = .;
        Objects/Kernel_Assembly.o(.bss_higher_half)
        Objects/Kernel_C_Plus_Plus.o(.bss)
        Objects/Kernel_C_Plus_Plus.o(COMMON)
        bss_end_higher_half = .;
    }
    kernel_end = ABSOLUTE(.);
    . = ALIGN(0x1000);
}

Octacone · Post by **Octacone** » Thu Jul 12, 2018 6:04 am

simeonz wrote:Is it one array or two arrays (8mb x 2)? I am concerned whether the common symbols are included once or twice. Also, try to align the location pointer at the end of the file and see if that works to remove the excessive file size. I will aggregate several change propositions into one:
Code:
linker script...

It is basically this times two: (one for the lower and one for the higher half)

Code:

typedef struct page_t
{
    uint64_t present : 1;
    uint64_t write_enabled : 1;
    uint64_t user_page : 1;
    uint64_t write_through : 1;
    uint64_t cache_disabled : 1;
    uint64_t accessed : 1;
    uint64_t dirty : 1;
    uint64_t pat_supported : 1; 
    uint64_t global : 1;
    uint64_t ignored : 3;
    uint64_t physical_address : 40;
    uint64_t reserved : 11;
    uint64_t execute_disable : 1;
}__attribute__((packed)) page_t;

typedef struct page_table_t
{
    page_t pages[512];
}__attribute__((packed)) __attribute__((aligned(0x1000))) page_table_t;

typedef struct page_directory_entry_t
{
    uint64_t present : 1;
    uint64_t write_enabled : 1;
    uint64_t user_page_table : 1;
    uint64_t write_through : 1;
    uint64_t cache_disabled : 1;
    uint64_t accessed : 1;
    uint64_t ignored_0 : 1;
    uint64_t page_size : 1;
    uint64_t ignored_1 : 4;
    uint64_t page_table_address : 40;
    uint64_t reserved : 11;
    uint64_t execute_disable : 1;
}__attribute__((packed)) page_directory_entry_t;

typedef struct page_directory_t
{
    page_directory_entry_t page_tables[512];
}__attribute__((packed)) __attribute__((aligned(0x1000))) page_directory_t;

typedef struct page_directory_pointer_table_entry_t
{
    uint64_t present : 1;
    uint64_t reserved : 2;
    uint64_t write_through : 1;
    uint64_t cache_disabled : 1;
    uint64_t reserved_0 : 4;
    uint64_t ignored : 3;
    uint64_t page_directory_address : 40;
    uint64_t reserved_1 : 12;
}__attribute__((packed)) page_directory_pointer_table_entry_t;

typedef struct page_directory_pointer_table_t
{
    page_directory_pointer_table_entry_t page_directories[4];
}__attribute__((packed)) __attribute__((aligned(0x20))) page_directory_pointer_table_t;

page_directory_pointer_table_t page_directory_pointer_table;
page_directory_t page_directory[4];
page_table_t page_table[2048];
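
As a rough check of where the 8 MB comes from, the sketch below computes the size of one set of these tables and shows how a 32-bit virtual address splits into PAE indices. It reduces each entry to a raw 64-bit word, so the type and function names are illustrative assumptions rather than the definitions above, and it relies on the same GCC-style aligned attribute.

```c
#include <assert.h>
#include <stdint.h>

#define PAE_ENTRIES   512u
#define PAE_PAGE_SIZE 0x1000u

/* Same shape as page_table_t and page_directory_t, entries as raw words. */
typedef struct {
    uint64_t entries[PAE_ENTRIES];
} __attribute__((aligned(0x1000))) pae_table_t;

static_assert(sizeof(pae_table_t) == PAE_PAGE_SIZE,
              "one PAE table fills exactly one page");

/* Memory taken by `count` tables, in bytes. */
static uint64_t tables_bytes(uint32_t count)
{
    return (uint64_t)count * sizeof(pae_table_t);
}

/* Address space covered by `count` fully populated page tables. */
static uint64_t tables_coverage(uint32_t count)
{
    return (uint64_t)count * PAE_ENTRIES * PAE_PAGE_SIZE;
}

/* PAE split of a 32-bit virtual address: 2 + 9 + 9 + 12 bits. */
static uint32_t pdpt_index(uint32_t virt) { return virt >> 30; }
static uint32_t pd_index(uint32_t virt)   { return (virt >> 21) & 0x1FFu; }
static uint32_t pt_index(uint32_t virt)   { return (virt >> 12) & 0x1FFu; }
```

2048 page tables take 8 MiB and cover the full 4 GiB, so keeping one such set per half accounts for two 8 MiB bss blocks. The address 0xC0000000 lands in PDPT entry 3, directory entry 0, table entry 0.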

I don't understand why would I want to include Kernel_C_Plus_Plus.o when it only contains some initialization code.
I don't know if you know but there are also other objects files:
Objects = Objects/Kernel_Assembly.o Objects/Kernel_C_Plus_Plus.o Objects/Higher_Half.o Objects/Hardware.o Objects/Tools.o Objects/VGA.o \
Objects/GDT_C_Plus_Plus.o Objects/GDT_Assembly.o Objects/TSS_C_Plus_Plus.o Objects/TSS_Assembly.o Objects/IDT_C_Plus_Plus.o \
Objects/IDT_Assembly.o Objects/ISR_C_Plus_Plus.o Objects/ISR_Assembly.o Objects/PIC.o Objects/IRQ_C_Plus_Plus.o \
Objects/IRQ_Assembly.o Objects/PIT.o Objects/Memory_Helpers.o Objects/PMM.o Objects/VMM_C_Plus_Plus.o Objects/VMM_Assembly.o \
Objects/LibAlloc.o
As you can see Kernel_C_Plus_PLus.o doesn't contain any of those, it's not a kernel itself if that is what you meant.

simeonz · Post by **simeonz** » Thu Jul 12, 2018 8:14 am

Octacone wrote:As you can see Kernel_C_Plus_PLus.o doesn't contain any of those, it's not a kernel itself if that is what you meant.

Ah, I see. Tripped over the name again. Anyway, if there are two arrays, the only thing that is out of place is the fact that the higher bss is PROGBITS instead of NOBITS (causing zeroed bloat in the elf). But there is nothing wrong that I can see that can cause a crash. As I said before, it is probably possible to remove the bloat by aligning the location counter at the end (since the orphan sections are "usually" included at the end of the file), or you could align the location counter and create a section block for .shstrtab explicitly. In case you wonder why is that related to the bloat, it is because the linker is merging .shstrtab into the last page of .bss_higher_half, which causes the latter to be zero padded.

I understand why 2x8m. I assumed that you modify the page tables in place when you remove the lower kernel mapping. But looking at your cr3 move, it is clear that you keep two versions of the tables and swap them, which explains the two arrays.

Regarding the crash issue, there is nothing I can say at this point. I will check the above fields some time later today, but assuming they are fine, maybe the mapping is not set up properly (to state the obvious). It would be nice if readelf -s worked. You could try the --discard-none ld option, and see if readelf -s works after that.

Octacone · Post by **Octacone** » Thu Jul 12, 2018 11:17 am

simeonz wrote:
Octacone wrote:As you can see Kernel_C_Plus_PLus.o doesn't contain any of those, it's not a kernel itself if that is what you meant.
Ah, I see. Tripped over the name again. Anyway, if there are two arrays, the only thing that is out of place is the fact that the higher bss is PROGBITS instead of NOBITS (causing zeroed bloat in the elf). But there is nothing wrong that I can see that can cause a crash. As I said before, it is probably possible to remove the bloat by aligning the location counter at the end (since the orphan sections are "usually" included at the end of the file), or you could align the location counter and create a section block for .shstrtab explicitly. In case you wonder why is that related to the bloat, it is because the linker is merging .shstrtab into the last page of .bss_higher_half, which causes the latter to be zero padded.

I understand why 2x8m. I assumed that you modify the page tables in place when you remove the lower kernel mapping. But looking at your cr3 move, it is clear that you keep two versions of the tables and swap them, which explains the two arrays.

Regarding the crash issue, there is nothing I can say at this point. I will check the above fields some time later today, but assuming they are fine, maybe the mapping is not set up properly (to state the obvious). It would be nice if readelf -s worked. You could try the --discard-none ld option, and see if readelf -s works after that.

It didn't fix the problem (aligning). BSS is still getting marked as PROGBITS...
Actually I don't think that a real problem actually exists. My ELF loader is just broken. My original assumption was that there would be only 1 program header, but I was wrong and now I have to rewrite it.
Paging works just fine, I can see everything mapped correctly and it also worked before, I used the same code.
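
For reference, a loader that handles any number of program headers could look roughly like the sketch below. It is not the loader from this thread: the function name and the phys_mem, phys_base and phys_size parameters are hypothetical (they let the sketch run against a buffer instead of real physical memory), and it does not validate the length of the image itself. The structures follow the ELF32 layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PT_LOAD 1u

typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} elf32_ehdr_t;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} elf32_phdr_t;

static_assert(sizeof(elf32_ehdr_t) == 52, "ELF32 header is 52 bytes");
static_assert(sizeof(elf32_phdr_t) == 32, "ELF32 program header is 32 bytes");

/* Copy every PT_LOAD segment to its physical address (p_paddr) and zero
 * the part beyond p_filesz. Returns the entry point, or 0 on error. */
static uint32_t load_elf32(const uint8_t *image, uint8_t *phys_mem,
                           uint32_t phys_base, uint32_t phys_size)
{
    elf32_ehdr_t eh;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0)
        return 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        elf32_phdr_t ph;
        memcpy(&ph, image + eh.e_phoff + (uint32_t)i * eh.e_phentsize,
               sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue; /* only loadable segments occupy memory */
        if (ph.p_filesz > ph.p_memsz || ph.p_paddr < phys_base ||
            ph.p_memsz > phys_size ||
            ph.p_paddr - phys_base > phys_size - ph.p_memsz)
            return 0; /* segment does not fit the target memory */
        uint8_t *dst = phys_mem + (ph.p_paddr - phys_base);
        memcpy(dst, image + ph.p_offset, ph.p_filesz);
        memset(dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }
    return eh.e_entry;
}
```

The key point is the loop over e_phnum: with the split linker script there are two LOAD segments, and the second one has a physical address that differs from its virtual address.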

simeonz · Post by **simeonz** » Thu Jul 12, 2018 12:07 pm

Octacone wrote:It didn't fix the problem (aligning). BSS is still getting marked as PROGBITS...

The thing is, I cannot reproduce this on my end, so it is difficult to come up with a solution. But if you have found the crash issue and are willing to live with some extra zeroes, fine.

OSDev.org
