I'm currently working on getting paging to work in my kernel. I've managed to identity map the kernel. Now I need to map the kernel into upper memory.
For this to work, I need to set the base address of the kernel sections that will execute at high virtual addresses to a high address in my linker script, so that all the labels and data addresses match where the code is going to be put in memory. Unfortunately, the kernel initialisation code that's linked at the beginning needs a low base address, because it runs while it is setting up the MMU. With both the low and high sections in the linker script, the result links to a small 20 KB .elf file, but when I use objcopy to get the binary image that the Raspberry Pi firmware can boot, it ends up being a 2 GB image file.
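For what it's worth, the 2 GB figure is what you would expect if the raw image is a plain memory dump that starts at the lowest load address and runs to the end of the highest one, with the gap filled in. The C sketch below only illustrates that arithmetic; the struct, the function name, and the example addresses mentioned afterwards are assumptions for illustration, not values taken from the actual linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One loadable output section: its load address (LMA) and its size in bytes. */
struct load_section {
    uint64_t lma;
    uint64_t size;
};

/*
 * Size of a raw memory-dump image: it spans from the lowest load address
 * to the end of the highest section, including any gap in between.
 */
uint64_t flat_image_size(const struct load_section *s, size_t n)
{
    assert(s != NULL && n > 0);

    uint64_t lo = s[0].lma;
    uint64_t hi = s[0].lma + s[0].size;

    for (size_t i = 1; i < n; i++) {
        uint64_t end = s[i].lma + s[i].size;
        if (s[i].lma < lo)
            lo = s[i].lma;
        if (end > hi)
            hi = end;
    }
    return hi - lo;
}
```

With one section loaded at 0x8000 and another at 0x80000000 (assumed addresses), this comes out just under 2 GB. If the high-linked section is given a low load address instead, the span, and so the file, stays small.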
Am I missing something obvious here? From what I can tell there are 3 solutions:
1. Use position-independent code, with a potential performance penalty. Is the data also position independent, in that it's accessed relative to where the code that accesses it is being executed?
2. Split the kernel into the initialisation code and the higher-half kernel image, with 2 separate linker scripts that use different base addresses (the initial base address doesn't seem to increase the file size). This means the initialisation code has to actually load the main kernel (should be fairly easy with the ramfs firmware option).
3. Write a small ELF-loader binary image to parse the kernel.elf. There's no relocation, so this should be a rather straightforward, if annoying, job. It's also something I will have to do anyway to load programs later on.
Any other options? If not, what would be the most elegant?
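As a rough idea of how small option 3 can be, here is a minimal C sketch of a loader that copies the PT_LOAD segments of a 32-bit ELF image to their load addresses. It assumes the image and the CPU share the same byte order and does no relocation. The names `elf32_load`, `k_elf32_ehdr`, `k_elf32_phdr` and `K_PT_LOAD` are made up for this example, and `mem` stands in for physical memory starting at `mem_base`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define K_PT_LOAD 1u /* program header type of a loadable segment */

struct k_elf32_ehdr {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

struct k_elf32_phdr {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
};

/* Returns 0 and stores the entry point on success, -1 on a malformed image. */
int elf32_load(const uint8_t *image, size_t image_len,
               uint8_t *mem, uint32_t mem_base, size_t mem_len,
               uint32_t *entry)
{
    struct k_elf32_ehdr eh;

    assert(image != NULL && mem != NULL && entry != NULL);
    if (image_len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);

    /* Magic number, then e_ident[4] == 1 for a 32-bit image. */
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0 || eh.e_ident[4] != 1)
        return -1;
    if (eh.e_phentsize != sizeof(struct k_elf32_phdr))
        return -1;
    if ((uint64_t)eh.e_phoff +
        (uint64_t)eh.e_phnum * sizeof(struct k_elf32_phdr) > image_len)
        return -1;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        struct k_elf32_phdr ph;

        memcpy(&ph, image + eh.e_phoff + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != K_PT_LOAD)
            continue;

        /* Reject segments that fall outside the file or the target memory. */
        if (ph.p_filesz > ph.p_memsz ||
            (uint64_t)ph.p_offset + ph.p_filesz > image_len ||
            ph.p_paddr < mem_base ||
            (uint64_t)(ph.p_paddr - mem_base) + ph.p_memsz > mem_len)
            return -1;

        /* Copy the file-backed part, then zero the rest (this is .bss). */
        uint8_t *dst = mem + (ph.p_paddr - mem_base);
        memcpy(dst, image + ph.p_offset, ph.p_filesz);
        memset(dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }

    *entry = eh.e_entry;
    return 0;
}
```

The sketch copies to `p_paddr` rather than `p_vaddr`, on the assumption that the loader runs before the MMU is enabled and the linker script gives the high-linked sections a low load address.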
Raspberry pi, do i need to write an elf-loader?
-
- Posts: 11
- Joined: Thu Aug 01, 2013 9:47 am
Re: Raspberry pi, do i need to write an elf-loader?
You can let the initialization code be position independent, and then ensure that this code is as small as possible. The code can then be linked at the virtual address. This is how the Linux kernel loads; you can see a good example in the Linux kernel in head.S and head-common.S under arch/arm/kernel.
If only the very early initialization code needs to be position independent, then the performance overhead of this will be minor.
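One way such early code can stay position independent is to work out, at run time, the difference between where a known label actually sits and where the linker placed it, and then apply that difference to any linked address it needs before the MMU is on. Below is a minimal C sketch of that arithmetic only. The function names and the example addresses in the note are hypothetical, and in real start-up code the run-time address would come from a PC-relative instruction such as adr.

```c
#include <assert.h>
#include <stdint.h>

/* The arithmetic below relies on 32-bit wrap-around, as on the target CPU. */
static_assert(sizeof(uint32_t) == 4, "addresses are 32 bits wide");

/* Offset between where a label really is and where it was linked. */
uint32_t load_delta(uint32_t label_runtime, uint32_t label_linked)
{
    return label_runtime - label_linked;
}

/* Turn any other link-time address into the address it has right now. */
uint32_t linked_to_runtime(uint32_t linked_addr, uint32_t delta)
{
    return linked_addr + delta;
}
```

For example, if a label linked at 0xC0008000 is found running at 0x8000, the delta is 0x40000000 (modulo 2^32), and a symbol linked at 0xC0024000 is currently at 0x24000.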
Re: Raspberry pi, do i need to write an elf-loader?
JacobL wrote: You can let the initialization code be position independent, and then ensure that this code is as small as possible. The code can then be linked at the virtual address. This is how the Linux kernel loads, you can actually see a good example in the Linux kernel in head.S and head-common.S under arch/arm/kernel.
If only the very early initialization code needs to be position independent, then the performance overhead of this will be minor.
I didn't even think of that. I had a feeling that I might be over-engineering the problem. Cheers.
Re: Raspberry pi, do i need to write an elf-loader?
I can tell you what I do; maybe it can inspire you a little bit:
My boot sector first loads a loader (BIN) into memory.
My loader then loads the kernel (BIN) into the high memory position (0xC0000000), enters protected mode, and jumps to the kernel.
My kernel is a BIN file. I rewrote the linker script to remove unnecessary space wasted on alignment. And when writing the kernel, I tried to avoid code that generates big BSS data. I found that the BIN file after this kind of processing is much smaller than the ELF file.
Re: Raspberry pi, do i need to write an elf-loader?
I am writing an RPi OS too.
I've managed to get a higher-half kernel working by creating a simple bootloader. The bootloader code is linked at the start address (0x8000 on the RPi; for faster development I am using QEMU, which uses 0x10000 as the start address), and all other kernel code is linked at 0xC0020000 but physically located at 0x20000.
My bootloader maps physical memory from 0 to HIGH_MEM (the amount of memory the board has) at 0xC0000000 (PAGE_OFFSET) to 0xC0000000 + HIGH_MEM, using 1 MB sections. It also identity-maps the first 1 MB. The bootloader can also copy the kernel code to 0x20000 (TEXT_OFFSET) before mapping the pages.
Here is my code for the linker script and the bootloader.
PS: Note that the bootloader code is alone in the init.S file.
Code:
OUTPUT_ARCH(arm)
ENTRY(pre_init)
SECTIONS {
/*
* The code that loads our kernel at the correct address;
* it needs to be linked at the physical address.
*/
.stext 0x10000 : AT(0x10000) {
obj/init.o
}
/*
* Our kernel page directory; it has to
* be aligned to 16 KB.
*/
. = ALIGN(4096 * 4);
PROVIDE(k_pgdir = .);
/* Reserve the 16 KB (4096 entries x 4 bytes) occupied by the table */
. = . + (4096 * 4);
PROVIDE(k_pgdir_end = .);
/*
* Our kernel is mapped at 0xC0000000; since that address is
* the linear mapping of physical memory, the kernel's physical
* address has to sit at the same offset (0x20000).
*/
. = 0xC0020000;
.text : AT(0x20000) {
PROVIDE(k_stack_svc = .);
PROVIDE(k_reloc_start = (. - 0xC0000000));
obj/start.o
*(.text)
*(.text.*)
}
. = ALIGN(4096);
.data : {
*(.data)
*(.data.*)
}
. = ALIGN(4096);
.rodata : {
*(.rodata)
*(.rodata.*)
}
. = ALIGN(4096);
.bss : {
*(.bss)
}
. = ALIGN(4096);
/* Create the stacks for the other modes, 4 KB each */
. = . + 4096;
PROVIDE(k_stack_irq = .);
. = . + 4096;
PROVIDE(k_stack_abt = .);
. = . + 4096;
PROVIDE(k_stack_und = .);
PROVIDE(k_reloc_end = . - 0xC0000000);
/DISCARD/ : {
*(.comment*)
}
}
Code:
/*
* FOFOLITO - Operating System for the Raspberry Pi
*
* This module is in charge of getting the kernel running at 0xC0000000.
* To map the kernel directly it is convenient to use the page type
* called a section, which maps a full 1 MB without the need for
* a second-level descriptor.
*
* Marcos Medeiros
*/
#include <asm/asm.h>
.section .loader
.global pre_init
pre_init:
/* First, disable all interrupts */
mrs r0, cpsr
orr r0, #(CPSR_IRQ_DISABLE | CPSR_FIQ_DISABLE)
msr cpsr, r0
/* Set up a temporary stack */
ldr sp, =k_tmp_stack
/*
* First clear the page directory: it has 4096 entries
* of 4 bytes each.
*/
/* Zero the registers */
mov r2, #0
mov r3, #0
mov r4, #0
mov r5, #0
mov r6, #0
mov r7, #0
mov r8, #0
mov r9, #0
ldr r0, =k_pgdir
mov r10, r0
/* Clear 64 entries per loop iteration */
ldr r1, =(4096 / (8 * 8))
clear_pgdir$:
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
stmia r10!, {r2-r9}
subs r1, r1, #1
bne clear_pgdir$
/* Copy the kernel to TEXT_OFFSET (0x20000) */
ldr r0, =TEXT_OFFSET
ldr r1, =k_reloc_start
/* Only copy if the kernel is not already in the right place */
cmp r0, r1
ldrne r2, =k_reloc_end
subne r2, r2, r1
blne early_memcpy
/* Map the kernel pages
* r0 = pgdir
* r1 = index of the pgdir entry for PAGE_OFFSET
* r2 = number of entries needed to map HIGH_MEM;
* each entry maps 1 MB of memory.
* r3 = physical address of the section to map (1 MB each)
* r4 = attributes of the kernel pgdir entries
* r7 = size of one section (1 MB)
*/
ldr r0, =k_pgdir
ldr r1, =(PAGE_OFFSET >> PGT_SHIFT)
ldr r2, =(HIGH_MEM >> PGT_SHIFT)
mov r3, #0
ldr r4, =K_PGT_SECTION
ldr r7, =(1024 * 1024)
/* The first MB is identity mapped */
str r4, [r0]
/* Now map PAGE_OFFSET :: (PAGE_OFFSET + HIGH_MEM) */
mov r6, #0
/* Point r0 at the first pgdir entry for PAGE_OFFSET */
add r0, r0, r1, lsl #2
map_next$:
/* Attributes + physical address */
add r5, r4, r3
str r5, [r0], #4
/* Advance the physical address */
add r3, r3, r7
/* Increment the number of sections mapped */
add r6, r6, #1
cmp r6, r2
blo map_next$
/* Before enabling the MMU, configure the domains */
ldr r0, =0x55555555
mcr p15, 0, r0, c3, c0, 0
/* Set the translation table base address (TTBR0) */
ldr r0, =k_pgdir
mcr p15, 0, r0, c2, c0, 0
/* Tell the MMU to use only TTBR0 */
mov r0, #0
mcr p15, 0, r0, c2, c0, 2
/* Enable the MMU */
ldr r0, =(MMU_ENABLE | MMU_XP | MMU_WRBUF | MMU_ALIGN)
mcr p15, 0, r0, c1, c0, 0
/* Invalidate the TLBs (unified, instruction, data); these c8 operations do not touch the caches */
mov r0, #0
mcr p15, 0, r0, c8, c7, 0
mcr p15, 0, r0, c8, c5, 0
mcr p15, 0, r0, c8, c6, 0
/* Now we are ready to jump to the higher half \o\ */
ldr pc, =boot_start
/*
* Function copied and trimmed down from lib/memory_s.S
* Only copies 4-byte-aligned blocks
*/
early_memcpy:
stmfd sp!, {r4 - r12}
cmp r2, #64
x64_copy$:
ldmcsia r1!, {r3-r10}
stmcsia r0!, {r3-r10}
ldmcsia r1!, {r3-r10}
stmcsia r0!, {r3-r10}
subcs r2, r2, #64
cmp r2, #64
bcs x64_copy$
cmp r2, #32
x32_copy$:
ldmcsia r1!, {r3-r10}
stmcsia r0!, {r3-r10}
subcs r2, r2, #32
cmp r2, #32
bcs x32_copy$
cmp r2, #16
x16_copy$:
ldmcsia r1!, {r3-r6}
stmcsia r0!, {r3-r6}
subcs r2, r2, #16
cmp r2, #16
bcs x16_copy$
cmp r2, #8
x8_copy$:
ldmcsia r1!, {r3-r4}
stmcsia r0!, {r3-r4}
subcs r2, r2, #8
cmp r2, #8
bcs x8_copy$
cmp r2, #4
x4_copy$:
ldrcs r3, [r1], #4
strcs r3, [r0], #4
subcs r2, r2, #4
cmp r2, #4
bcs x4_copy$
ldmfd sp!, {r4 - r12}
mov pc, lr
/* Create a temporary stack, at most 64 entries */
.align 4; .rept 64; .word 0; .endr; k_tmp_stack:
.ascii "BootLoader HigherHalf"
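The mapping loop in the bootloader above can be expressed in C roughly as follows. This is only a sketch: `map_linear` and the macro names are hypothetical, and the attribute value used in the note afterwards is an illustrative section descriptor, not the thread's K_PGT_SECTION, whose definition is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20u                   /* each section maps 1 MB */
#define SECTION_SIZE  (1u << SECTION_SHIFT)
#define PGDIR_ENTRIES 4096u                 /* 4096 x 4 bytes = 16 KB */

/*
 * Fill a first-level table so that physical memory 0..high_mem appears at
 * page_offset..page_offset + high_mem, and the first MB is also identity
 * mapped. `attrs` holds the low descriptor bits (type, permissions, domain).
 */
void map_linear(uint32_t *pgdir, uint32_t page_offset,
                uint32_t high_mem, uint32_t attrs)
{
    uint32_t first = page_offset >> SECTION_SHIFT;
    uint32_t count = high_mem >> SECTION_SHIFT;

    assert(pgdir != 0);
    assert((page_offset & (SECTION_SIZE - 1)) == 0);
    assert((high_mem & (SECTION_SIZE - 1)) == 0);
    assert((attrs & ~(SECTION_SIZE - 1)) == 0);
    assert(first + count <= PGDIR_ENTRIES);

    /* Start from a table in which every address faults. */
    for (uint32_t i = 0; i < PGDIR_ENTRIES; i++)
        pgdir[i] = 0;

    /* Identity map the first MB (physical base 0). */
    pgdir[0] = attrs;

    /* Linear mapping: entry (first + i) points at physical MB number i. */
    for (uint32_t i = 0; i < count; i++)
        pgdir[first + i] = attrs | (i << SECTION_SHIFT);
}
```

For example, `map_linear(pgdir, 0xC0000000, 0x10000000, 0xC02)` would map 256 MB; 0xC02 here stands for a section descriptor with full access permissions in domain 0.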