Raspberry Pi, do I need to write an ELF loader?

chickendinner · Post by **chickendinner** » Sat Aug 10, 2013 10:24 am

I'm currently working on getting paging to work in my kernel. So far I've managed to identity-map the kernel. Now I need to map the kernel into upper (high) virtual memory.

For this to work, I need to set the base address of the kernel sections that will execute at high virtual addresses to a high address in my linker script, so that all the labels and data addresses match where the code is going to be mapped in memory. Unfortunately, the kernel initialisation code that is linked at the beginning needs a low base address, because it runs while it is setting up the MMU. With both the low and the high sections in the linker script, the link produces a small 20 KB .elf file, but when I use objcopy to get the binary image that the Raspberry Pi firmware can boot, it ends up being a 2 GB image file.
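To make the size blow-up concrete: `objcopy -O binary` writes a flat memory dump that starts at the lowest load address and pads every gap up to the end of the highest section. Here is a rough C sketch of that arithmetic. The struct, the function name, and the example addresses in the usage note are made up for illustration and are not taken from the actual kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One loadable section: where it is loaded (LMA) and how big it is. */
struct load_section {
    uint32_t lma;   /* load address, as seen by objcopy -O binary */
    uint32_t size;  /* bytes of content */
};

/*
 * Size of the flat image: it spans from the lowest load address to the
 * end of the highest section, with the gaps in between padded out.
 */
uint64_t flat_image_size(const struct load_section *s, size_t n)
{
    assert(s != NULL && n > 0);
    uint64_t lo = s[0].lma;
    uint64_t hi = (uint64_t)s[0].lma + s[0].size;
    for (size_t i = 1; i < n; i++) {
        uint64_t end = (uint64_t)s[i].lma + s[i].size;
        if (s[i].lma < lo)
            lo = s[i].lma;
        if (end > hi)
            hi = end;
    }
    return hi - lo;
}
```

With a hypothetical 4 KB init section loaded at 0x8000 and a 16 KB kernel section loaded at 0x80010000, this gives roughly 2 GB. If the kernel section keeps its high link address but is given a low load address (for example with `AT()` in the linker script), the result drops to a few tens of KB.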

Am I missing something obvious here? From what I can tell, there are three solutions:

1. Use position-independent code, with a potential performance penalty. Is the data also position independent, in the sense that it is accessed relative to the address of the code that accesses it?

2. Split the kernel into the initialisation code and the high-mapped kernel image, using two separate linker scripts with different base addresses (the starting base address on its own doesn't seem to increase the file size). This means the initialisation code has to actually load the main kernel (which should be fairly easy with the firmware's ramfs option).

3. Write a small ELF loader as a binary image that parses kernel.elf. There's no relocation, so this should be a rather straightforward, if annoying, job. It's also something I will have to do anyway to load programs later on.
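For a sense of scale, option 3 without relocation comes down to walking the ELF32 program headers and copying each PT_LOAD segment to its physical address. The sketch below is only a rough illustration under several assumptions: the image is little-endian like the host, segments are placed by `p_paddr`, and `ram`/`ram_base` stand in for physical memory so the code can also be tried on a PC. The function and struct names are made up; the field layout follows the ELF32 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ELF_PT_LOAD = 1 };

struct elf32_ehdr {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
};

struct elf32_phdr {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
};

/* Copy every PT_LOAD segment into ram; returns 0 on success, -1 on error. */
int elf32_load(const uint8_t *img, size_t img_len,
               uint8_t *ram, uint32_t ram_base, size_t ram_len,
               uint32_t *entry)
{
    struct elf32_ehdr eh;

    assert(img != NULL && ram != NULL && entry != NULL);
    if (img_len < sizeof eh)
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0)
        return -1;
    if (eh.e_phentsize != sizeof(struct elf32_phdr))
        return -1;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        struct elf32_phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof ph;

        if (off > img_len || img_len - off < sizeof ph)
            return -1;
        memcpy(&ph, img + off, sizeof ph);
        if (ph.p_type != ELF_PT_LOAD)
            continue;

        /* Reject segments that fall outside the file or outside ram. */
        if (ph.p_filesz > ph.p_memsz)
            return -1;
        if (ph.p_offset > img_len || img_len - ph.p_offset < ph.p_filesz)
            return -1;
        if (ph.p_paddr < ram_base)
            return -1;
        size_t dst = ph.p_paddr - ram_base;
        if (dst > ram_len || ram_len - dst < ph.p_memsz)
            return -1;

        /* File contents first, then zero the rest (the .bss part). */
        memcpy(ram + dst, img + ph.p_offset, ph.p_filesz);
        memset(ram + dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }
    *entry = eh.e_entry;
    return 0;
}
```

On real hardware the destination would simply be the physical address itself. The entry point that comes back is a virtual address, so it is only usable once the MMU mapping is in place.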

Any other options? If not, what would be the most elegant?

JacobL · Post by **JacobL** » Sat Aug 10, 2013 11:34 am

You can make the initialization code position independent and keep it as small as possible. The code can then be linked at the virtual address. This is how the Linux kernel loads; you can see a good example in head.S and head-common.S under arch/arm/kernel in the Linux kernel source.

If only the very early initialization code needs to be position independent, then the performance overhead of this will be minor.
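To illustrate what the early code has to cope with: while the MMU is still off, every symbol linked at a virtual address actually sits at a lower physical address, so any absolute address has to be converted before it is used. A minimal C sketch of that arithmetic follows. The two constants are assumptions for illustration (kernel linked at 0xC0000000, RAM starting at physical 0), and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u /* assumed virtual base of the kernel mapping */
#define PHYS_OFFSET 0x00000000u /* assumed physical base of RAM */

/* Physical address at which a linked (virtual) address really sits. */
uint32_t virt_to_phys(uint32_t va)
{
    assert(va >= PAGE_OFFSET);
    return va - PAGE_OFFSET + PHYS_OFFSET;
}

/* Virtual address under which a physical address is visible once mapped. */
uint32_t phys_to_virt(uint32_t pa)
{
    assert(pa >= PHYS_OFFSET);
    return pa - PHYS_OFFSET + PAGE_OFFSET;
}
```

Code that only uses PC-relative branches and loads never needs this conversion, which is why keeping the pre-MMU part tiny and position independent works well.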

chickendinner · Post by **chickendinner** » Sat Aug 10, 2013 11:45 am

JacobL wrote: You can make the initialization code position independent and keep it as small as possible. The code can then be linked at the virtual address. This is how the Linux kernel loads; you can see a good example in head.S and head-common.S under arch/arm/kernel in the Linux kernel source.

If only the very early initialization code needs to be position independent, then the performance overhead of this will be minor.

I didn't even think of that. I had a feeling that I might be over-engineering the problem. Cheers.

nbdd0121 · Post by **nbdd0121** » Sun Aug 11, 2013 7:32 am

I can tell you what I do; maybe it will inspire you a little:
My boot sector first loads a loader (BIN) into memory.
My loader then loads the kernel (BIN) at the high memory position (0xC0000000), enters protected mode, and jumps to the kernel.
My kernel is a BIN file. I rewrote the linker script to remove the space wasted on alignment, and while writing the kernel I tried to avoid code that generates large BSS data. I found that the BIN file produced this way is much smaller than the ELF file.

zxm · Post by **zxm** » Sun Aug 11, 2013 8:25 pm

I am writing an RPi OS too.
I've managed to get a higher-half kernel working by creating a simple bootloader. The bootloader code is linked at the start address (0x8000 on the RPi; for faster development I am using QEMU, which uses 0x10000 as the start address), and all other kernel code is linked at 0xC0020000 but physically loaded at 0x20000. Here is my code for the linker script and the bootloader.
PS: Note that the bootloader code lives alone in the init.S file.

Code: Select all

OUTPUT_ARCH(arm)
ENTRY(pre_init)
SECTIONS {

	/*
	 * The code that loads our kernel at the correct address;
	 * it needs to be linked at the physical address.
	 */
	.stext 0x10000 : AT(0x10000) {
		obj/init.o
	}
	/*
	 * Our kernel page directory; it has to be
	 * aligned to 16KB
	 */
	. = ALIGN(4096 * 4);
	PROVIDE(k_pgdir = .);
	. = . + (4096 * 4);	/* reserve 16KB: 4096 entries of 4 bytes */
	PROVIDE(k_pgdir_end = .);
	/*
	 * Our kernel is mapped at 0xC0000000; however, since that
	 * address is the linear mapping of physical memory, the
	 * kernel's physical address has to sit at the same offset.
	 */
	. = 0xC0020000;
	.text : AT(0x20000) {
		PROVIDE(k_stack_svc = .);
		PROVIDE(k_reloc_start = (. - 0xC0000000));
		obj/start.o
		*(.text)
		*(.text.*)
	}
	. = ALIGN(4096);

	.data : {
		*(.data)
		*(.data.*)
	}
	. = ALIGN(4096);
	.rodata : {
		*(.rodata)
		*(.rodata.*)
	}
	. = ALIGN(4096);
	.bss : {
		*(.bss)
	}
	. = ALIGN(4096);
	/* Create the stacks for the other modes, 4KB each */
	. = . + 4096;
	PROVIDE(k_stack_irq = .);
	. = . + 4096;
	PROVIDE(k_stack_abt = .);
	. = . + 4096;
	PROVIDE(k_stack_und = .);
	PROVIDE(k_reloc_end = . - 0xC0000000);
	/DISCARD/ : {
		*(.comment*)
	}
}

My bootloader maps physical memory from 0 to HIGH_MEM (the amount of memory the board has) at 0xC0000000 (PAGE_OFFSET) to 0xC0000000 + HIGH_MEM, using 1 MB sections. I also identity-map the first 1 MB. The bootloader can also copy the kernel code to 0x20000 (TEXT_OFFSET) before setting up the page mappings.
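The mapping described above can be modelled in a few lines of C, which may be easier to follow than the assembly. This is only a sketch: `SECTION_ATTRS` is an assumed attribute value (section descriptor type, privileged read/write, domain 0) and not necessarily what `K_PGT_SECTION` expands to in the real code, and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20u                   /* one entry maps 1 MB */
#define SECTION_SIZE  (1u << SECTION_SHIFT)
#define PGDIR_ENTRIES 4096u                 /* 4096 x 1 MB = 4 GB */
#define PAGE_OFFSET   0xC0000000u

/* Assumed attributes: bits[1:0] = 0b10 (section), AP = 0b01, domain 0. */
#define SECTION_ATTRS ((1u << 10) | 0x2u)

/* Fill pgdir so that physical 0..high_mem appears at PAGE_OFFSET. */
void map_kernel_sections(uint32_t *pgdir, uint32_t high_mem)
{
    assert(pgdir != 0);
    assert(high_mem % SECTION_SIZE == 0);
    assert(high_mem <= 0x40000000u); /* must fit above PAGE_OFFSET */

    for (uint32_t i = 0; i < PGDIR_ENTRIES; i++)
        pgdir[i] = 0;                /* fault entry: nothing mapped */

    /* Identity-map the first 1 MB so the boot code keeps running. */
    pgdir[0] = SECTION_ATTRS;

    uint32_t first = PAGE_OFFSET >> SECTION_SHIFT;
    uint32_t count = high_mem >> SECTION_SHIFT;
    for (uint32_t i = 0; i < count; i++)
        pgdir[first + i] = (i << SECTION_SHIFT) | SECTION_ATTRS;
}
```

The index into the page directory is just the top 12 bits of the virtual address, so PAGE_OFFSET lands at entry 0xC00 and each following entry advances the physical address by 1 MB.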

Code: Select all

/*
 * FOFOLITO - Operating system for the Raspberry Pi
 *
 * This module is responsible for getting the kernel running at 0xC0000000.
 * To map the kernel directly it is convenient to use the page type called
 * a section, which maps a full 1MB without the need for a second-level
 * descriptor.
 *
 * Marcos Medeiros
 */
#include <asm/asm.h>

.section .loader

.global pre_init
pre_init:
	/* First, disable all interrupts */
	mrs		r0, cpsr
	orr		r0, #(CPSR_IRQ_DISABLE | CPSR_FIQ_DISABLE)
	msr		cpsr, r0

	/* Set up a temporary stack */
	ldr		sp, =k_tmp_stack	

	/*
	 * First we clear the page directory; it has 4096 entries
	 * of 4 bytes each.
	 */
	/* Zero the registers */
	mov		r2, #0	
	mov		r3, #0	
	mov		r4, #0	
	mov		r5, #0	
	mov		r6, #0	
	mov		r7, #0	
	mov		r8, #0	
	mov		r9, #0

	ldr		r0, =k_pgdir
	mov		r10, r0
	/* Clear 64 entries per loop iteration */
	ldr		r1, =(4096 / (8 * 8))
clear_pgdir$:
	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}

	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}
	stmia	r10!, {r2-r9}

	subs	r1, r1, #1
	bne		clear_pgdir$

	/* Copy the kernel to TEXT_OFFSET (0x20000) */
	ldr		r0, =TEXT_OFFSET
	ldr		r1, =k_reloc_start
	/* Only copy if the kernel is not already in the right place */
	cmp		r0, r1
	ldrne	r2, =k_reloc_end
	subne	r2, r2, r1
	blne	early_memcpy

	/* Map the kernel pages
	 * r0 = pgdir
	 * r1 = index in the pgdir of the entry for PAGE_OFFSET
	 * r2 = number of entries needed to map HIGH_MEM,
	 *      each entry maps 1MB of memory.
	 * r3 = physical address of the section to map (1MB each)
	 * r4 = attributes of the kernel pgdir entry
	 * r7 = size of one section (1MB)
	 */
	ldr		r0, =k_pgdir
	ldr		r1, =(PAGE_OFFSET >> PGT_SHIFT)
	ldr		r2, =(HIGH_MEM >> PGT_SHIFT)
	mov		r3, #0
	ldr		r4, =K_PGT_SECTION
	ldr		r7, =(1024 * 1024)

	/* The first MB is identity-mapped */
	str		r4, [r0]

	/* Now map PAGE_OFFSET :: (PAGE_OFFSET + HIGH_MEM) */
	mov		r6, #0
	/* Point r0 at the first pgdir entry for PAGE_OFFSET */
	add		r0, r0, r1, lsl #2
map_next$:
	/* Attributes + physical address */
	add		r5, r4, r3
	str		r5, [r0], #4
	/* Advance the physical address */
	add		r3, r3, r7
	/* Increment the count of mapped sections */
	add		r6, r6, #1
	cmp		r6, r2
	blo		map_next$

	/* Before enabling the MMU, set up the domain access control */
	ldr		r0, =0x55555555
	mcr		p15, 0, r0, c3, c0, 0
	/* Set the translation table base address (TTBR0) */
	ldr		r0, =k_pgdir
	mcr		p15, 0, r0, c2, c0, 0
	/* Tell the MMU to use only TTBR0 */
	mov		r0, #0
	mcr		p15, 0, r0, c2, c0, 2

	/* Enable the MMU */
	ldr		r0, =(MMU_ENABLE | MMU_XP | MMU_WRBUF | MMU_ALIGN)
	mcr		p15, 0, r0, c1, c0, 0

	/* Invalidate all TLBs (unified, instruction and data) */
	mov		r0, #0
	mcr		p15, 0, r0, c8, c7, 0
	mcr		p15, 0, r0, c8, c5, 0
	mcr		p15, 0, r0, c8, c6, 0

	/* Now we are ready to jump to the higher half \o\ */
	ldr		pc, =boot_start
/*
 * Function copied and trimmed down from lib/memory_s.S
 * Only copies 4-byte-aligned blocks
 */
early_memcpy:
	stmfd	sp!, {r4 - r12}	
	cmp		r2, #64
x64_copy$:
	ldmcsia	r1!, {r3-r10}
	stmcsia	r0!, {r3-r10}
	ldmcsia	r1!, {r3-r10}
	stmcsia	r0!, {r3-r10}
	subcs	r2, r2, #64
	cmp		r2, #64
	bcs		x64_copy$
	cmp		r2, #32
x32_copy$:
	ldmcsia	r1!, {r3-r10}
	stmcsia	r0!, {r3-r10}
	subcs	r2, r2, #32
	cmp		r2, #32
	bcs		x32_copy$
	cmp		r2, #16
x16_copy$:
	ldmcsia	r1!, {r3-r6}
	stmcsia	r0!, {r3-r6}
	subcs	r2, r2, #16
	cmp		r2, #16
	bcs		x16_copy$
	cmp		r2, #8
x8_copy$:
	ldmcsia	r1!, {r3-r4}
	stmcsia	r0!, {r3-r4}
	subcs	r2, r2, #8
	cmp		r2, #8
	bcs		x8_copy$
	cmp		r2, #4
x4_copy$:
	ldrcs	r3, [r1], #4
	strcs	r3, [r0], #4
	subcs	r2, r2, #4
	cmp		r2, #4
	bcs		x4_copy$
	ldmfd	sp!, {r4 - r12}
	mov		pc, lr


/* Create a temporary stack, at most 64 entries */
.align 4; .rept 64; .word 0; .endr; k_tmp_stack:
.ascii "BootLoader HigherHalf"
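One note on the magic number 0x55555555 written to the domain access control register in the code above: the register holds a 2-bit field for each of the 16 domains, and 0b01 means "client", i.e. accesses are checked against the permission bits in the page table entry. A small C sketch that builds the value (the enum and function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* 2-bit access modes of the ARM domain access control register. */
enum domain_access {
    DOMAIN_NO_ACCESS = 0x0, /* every access faults */
    DOMAIN_CLIENT    = 0x1, /* permission bits are checked */
    DOMAIN_MANAGER   = 0x3  /* permission bits are ignored */
};

/* Build a register value that puts all 16 domains in the same mode. */
uint32_t dacr_all_domains(enum domain_access mode)
{
    assert(mode == DOMAIN_NO_ACCESS || mode == DOMAIN_CLIENT ||
           mode == DOMAIN_MANAGER);
    uint32_t dacr = 0;
    for (unsigned d = 0; d < 16; d++)
        dacr |= (uint32_t)mode << (2 * d);
    return dacr;
}
```

Calling `dacr_all_domains(DOMAIN_CLIENT)` yields 0x55555555, the constant loaded into r0 before the `mcr p15, 0, r0, c3, c0, 0`.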

That's it, sorry for my English.
