QEMU 0.12.3 suddenly executes random code
Posted: Sun Nov 11, 2012 5:59 pm
Hi,
First off, I'm not new to OS development, though I guess it's never too late to get stuck on a project.
I'm trying to get a basic file system working, and for this I would like to use the BIOS INT 0x13 functionality, which is only available in real mode. So I wrote a little routine that drops to real mode, so that I can write the file system code in C and then call a write_sector_with_bios() function ... you get the idea.
The code works fine on hardware (Asus Eee PC 1000H netbook) but not in QEMU 0.12.3. This makes me sad.
So my question is: would you be so kind as to look over my code and share your wisdom on whether this is a potential QEMU 0.12.3 issue or a bug in my code?
Technical details are contained in the files below:
Program versions: see Makefile
Detailed problem statement: see start.S
What I want to do is write an 'A' on screen via BIOS INT 0x10.
What actually happens is that QEMU 0.12.3 gets out of control and starts executing random code at a random address. In version 1.2.0, though, it prints the 'A' on screen just fine, as on hardware.
Anyway, I'd rather know whether it's QEMU or me before building upon broken code.
start.S:
#define MULTIBOOT_HEADER_MAGIC 0x1BADB002
#define MULTIBOOT_PAGE_ALIGN 0x00000001
#define MULTIBOOT_HEADER_FLAGS MULTIBOOT_PAGE_ALIGN
#define STK_SZ 0x800000 /* 8MiB */
#define REAL_ADDR 0x0500
#define REAL_STK 0x1000
.section .bss
.comm stk, STK_SZ
.section .text
.global start, _start
.extern cpy_real_code
start:
_start:
cli
jmp boot
.align 4
.long MULTIBOOT_HEADER_MAGIC
.long MULTIBOOT_HEADER_FLAGS
.long -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS)
boot:
movl $(stk + STK_SZ), %esp
pushl $0
popf
cld
call cpy_real_code
jmp drop_to_real_mode
halt:
hlt
jmp halt
drop_to_real_mode:
cli
/* load a GDT with 16-bit code and data segments */
lgdtl gdtd
/* first init 32bit data segment and load 32bit code segment */
mov $0x20, %ax
mov %eax, %ss
mov %eax, %ds
mov %eax, %es
mov %eax, %fs
mov %eax, %gs
ljmpl $0x18, $reload_cs
reload_cs:
/* jmp to code < 1 MiB */
/* FIXME: make this a /direct/ absolute far jmp ... but AT&T syntax is weird :/ */
mov $REAL_ADDR, %eax
jmp *%eax
.section .real
real_start:
/* 9.9.2 Switching Back to Real-Address Mode */
/* 9.9.2:1. Disable interrupts. */
cli
/* 9.9.2:2. If paging is enabled, perform the following operations:
* - Transfer program control to linear addresses that are identity
* mapped to physical addresses (that is, linear addresses equal
* physical addresses).
* - Insure that the GDT and IDT are in identity mapped pages.
* - Clear the PG bit in the CR0 register.
* - Move 0H into the CR3 register to flush the TLB.
*/
// NOTE: nop because paging not enabled
/* 9.9.2:3. Transfer program control to a readable segment that has a
* limit of 64 KBytes (FFFFH). This operation loads the
* CS register with the segment limit required in real-address mode.
*/
ljmp $0x8, $(REAL_ADDR + L3 - real_start)
L3:
.code16
/*
* 9.9.2:4. Load segment registers SS, DS, ES, FS, and GS with a
* selector for a descriptor containing the following values,
* which are appropriate for real-address mode:
* - Limit = 64 KBytes (0FFFFH)
* - Byte granular (G = 0)
* - Expand up (E = 0)
* - Writable (W = 1)
* - Present (P = 1)
* - Base = any value
* The segment registers must be loaded with non-null segment
* selectors or the segment registers will be unusable in
* real-address mode. Note that if the segment registers are not
* reloaded, execution continues using the descriptor attributes
* loaded during protected mode.
*/
mov $0x10, %ax
mov %eax, %ss
mov %eax, %ds
mov %eax, %es
mov %eax, %fs
mov %eax, %gs
/* 9.9.2:5. Execute an LIDT instruction to point to a real-address
* mode interrupt table that is within the 1-MByte real-address
* mode address range.
*/
// NOTE: not necessary because we never set an IDT
// NOTE: IVT already is base=0 limit=0x3ff
// NOTE: setting IVT/IDT again to base=0 limit=0x3ff here doesn't make any difference
/* 9.9.2:6. Clear the PE flag in the CR0 register to switch to
* real-address mode.
*/
mov %cr0, %eax
and $~0x1, %eax
mov %eax, %cr0
/* 9.9.2:7. Execute a far JMP instruction to jump to a real-address
* mode program. This operation flushes the instruction queue
* and loads the appropriate base-address value in the CS register.
*/
ljmpw $0, $(REAL_ADDR + L7 - real_start)
L7:
/* 9.9.2:8. Load the SS, DS, ES, FS, and GS registers as needed by the
* real-address mode code. If any of the registers are not going
* to be used in real-address mode, write 0s to them.
*/
mov $0, %ax
mov %eax, %ds
mov %eax, %ss
mov %eax, %fs
mov %eax, %gs
mov %eax, %es
mov $(REAL_STK), %sp
mov $0x0, %bh
mov $0x07, %bl
mov $'A', %al
mov $0x0e, %ah
/**************************
* PROBLEM MANIFESTS HERE *
**************************/
/* FIXME: if I hlt here everything is OK. But as soon as I allow
* interrupts execution jumps to a random address and QEMU eventually
* crashes with "execution outside RAM at 0x0000a0000" and/or
* runs until %ip hits 4GiB
*/
//hlt
sti
int $0x10
/* this hlt never gets reached :( */
stop:
cli
hlt
jmp stop
.align 4
idtd:
.word 0x3ff
.long 0
.align 4
gdt:
// null
.byte 0, 0, 0, 0, 0, 0, 0, 0
// real mode
// code 0x08
.byte 0xff, 0xff, 0, 0, 0, 0x9a, 0, 0x00
// data 0x10
.byte 0xff, 0xff, 0, 0, 0, 0x92, 0, 0x00
// protected mode
// code 0x18
.byte 0xff, 0xff, 0, 0, 0, 0x9a, 0xcf, 0x00
// data 0x20
.byte 0xff, 0xff, 0, 0, 0, 0x92, 0xcf, 0x00
gdt_end:
.align 4
gdtd:
.word gdt_end - gdt - 1
.long REAL_ADDR + gdt - real_start
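For anyone checking the raw `.byte` rows of the `gdt` table in start.S, here is a minimal C sketch of how an 8-byte segment descriptor is packed. The function name `gdt_encode` and its parameter layout are hypothetical and not part of the posted code; the sketch only assumes the standard x86 descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte GDT descriptor (hypothetical helper, not in start.S).
 * base:   32-bit segment base
 * limit:  20-bit segment limit
 * access: access byte (0x9a = present, ring 0, readable code;
 *                      0x92 = present, ring 0, writable data)
 * flags:  upper nibble of byte 6 (0xc = 4 KiB granularity + 32-bit,
 *                                 0x0 = byte granular + 16-bit)
 */
void gdt_encode(uint8_t out[8], uint32_t base, uint32_t limit,
                uint8_t access, uint8_t flags)
{
    assert(limit <= 0xfffffu);          /* only 20 limit bits exist */

    out[0] = limit & 0xff;              /* limit 7:0   */
    out[1] = (limit >> 8) & 0xff;       /* limit 15:8  */
    out[2] = base & 0xff;               /* base 7:0    */
    out[3] = (base >> 8) & 0xff;        /* base 15:8   */
    out[4] = (base >> 16) & 0xff;       /* base 23:16  */
    out[5] = access;
    out[6] = (uint8_t)(((flags & 0x0f) << 4) | ((limit >> 16) & 0x0f));
    out[7] = (base >> 24) & 0xff;       /* base 31:24  */
}
```

Called with base 0, limit 0xffff, access 0x9a and flags 0x0, this should give the bytes of the 0x08 entry; with limit 0xfffff and flags 0xc it should give the `0xcf` flag byte seen in the 0x18 and 0x20 entries.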
Other files needed for a full working example of the problem are as follows.
cpy_real_code.c:
extern char REAL_START, REAL_END;
void cpy_real_code(void)
{
    /* destination must match REAL_ADDR in start.S */
    volatile char *d = (volatile char *)0x500;
    const char *s = &REAL_START;
    unsigned n = &REAL_END - &REAL_START;
    while (n--)
        d[n] = s[n];
}
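Because the `.real` section is linked at one address and copied to 0x500 before it runs, every absolute reference inside it has to be rebased, which is what the `REAL_ADDR + label - real_start` expressions in start.S do. A small C sketch of that arithmetic follows. The helper names `real_reloc` and `real_linear` are hypothetical, and the link address 0x101000 is taken from the objdump listing of this particular build, so it is an assumption for any other build.

```c
#include <assert.h>
#include <stdint.h>

#define REAL_ADDR   0x0500u     /* run-time address, from start.S */
#define REAL_START  0x101000u   /* link-time address in this build (assumption) */

/* Rebase a link-time address inside .real to where the copy lives. */
uint32_t real_reloc(uint32_t link_addr)
{
    assert(link_addr >= REAL_START);    /* must lie inside .real */
    return REAL_ADDR + (link_addr - REAL_START);
}

/* Linear address that a real-mode segment:offset pair refers to. */
uint32_t real_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

With these values, the label at link address 0x101008 maps to 0x508 and the one at 0x101024 maps to 0x524, which are the targets of the two far jumps in the disassembly.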
linker.ld:
ENTRY(start)
SECTIONS
{
. = 0x100000; /* 1MiB */
.text ALIGN(0x1000) :
{
*(.text)
*(.rodata)
. = ALIGN(4096);
}
.rodata ALIGN(0x1000) :
{
*(.rodata*)
. = ALIGN(4096);
}
.data ALIGN(0x1000) :
{
*(.data)
. = ALIGN(4096);
}
/* FIXME: without the ALIGN directives .bss overlaps .real */
.real ALIGN(0x1000) :
{
/* real mode code, which must be relocated when loaded with grub */
REAL_START = .;
*(".real")
*(".real$")
REAL_END = .;
. = ALIGN(4096);
}
.bss ALIGN(0x1000) :
{
*(".bss")
. = ALIGN(4096);
}
}
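The `ALIGN(0x1000)` directives in the linker script round each section start up to the next page boundary. As a rough illustration, assuming plain power-of-two rounding (the helper name `align_up` is hypothetical), the arithmetic looks like this:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a power of two, as with the 0x1000 used in linker.ld. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

In this build, the code in `.text` ends at 0x10007c and `.real` starts at 0x101000; `REAL_END` is 0x10107a and `.bss` starts at 0x102000, which matches this rounding.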
Makefile:
# GNU Make 3.81
#
# Setting the tools and their flags.
#
# gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
CC = gcc
# gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
AS = gcc
# GNU ld (GNU Binutils for Ubuntu) 2.22
LD = ld
# QEMU PC emulator version 0.12.3 (qemu-kvm-0.12.3), Copyright (c) 2003-2008 Fabrice Bellard
# QEMU emulator version 1.2.0, Copyright (c) 2003-2008 Fabrice Bellard
QEMU = qemu
VERBOSE = @
WFLAGS = -W -Wpadded -Winline -Wstrict-overflow -Wundef -Wwrite-strings \
# -Wall -Wextra -Wdisabled-optimization -Wstrict-aliasing -Wconversion -Wcast-qual \
# -Werror
DFLAGS = -g3
OFLAGS = -O3 -s
CFLAGS = -std=gnu99 -pedantic -nostdlib -fno-builtin -nostartfiles \
-nodefaultlibs $(OFLAGS) $(WFLAGS) $(DFLAGS)
LDFLAGS = -s
ASFLAGS = -nostdlib -fno-builtin -nostartfiles -nodefaultlibs
SRC = $(shell find . -name "*.c" -or -name "*.S" -and -not -name "start.S")
DEP = $(SRC) $(shell find . -name "*.h")
FIRST_OBJ = start.o
OBJ = $(addsuffix .o, $(notdir $(basename $(SRC))))
VPATH = $(dir $(SRC))
BIN = system.elf
all: $(BIN)
system.elf: $(FIRST_OBJ) $(OBJ)
	@echo "Linking" $@ "from" $^
	$(VERBOSE) $(LD) $(LDFLAGS) -T linker.ld $^ -o system.elf
	@echo ""
.PHONY: qemu
qemu: system.elf
	$(QEMU) -kernel system.elf
%.o: %.c
	@echo "Compiling" $<
	$(VERBOSE) $(CC) $(CFLAGS) -c $<
	@echo ""
%.o: %.S
	@echo "Compiling" $<
	$(VERBOSE) $(AS) $(ASFLAGS) -c $< -o $@
	@echo ""
.PHONY: clean
clean:
	@echo "Deleting files"
	$(VERBOSE) $(RM) $(FIRST_OBJ) $(OBJ) $(BIN)
	@echo ""
#
# generate the make dependencies
#
Makefile: $(DEP)
	@echo "Generating dependencies and updating Makefile"
	@sed '/[#] 9445baa814592c63c617be9eb40a39cee949719a/q' < Makefile > depend
	@echo "# Make may overwrite this line and everything below." >> depend
	$(VERBOSE) $(CC) $(CFLAGS) -MM $(SRC) >> depend
	@mv depend Makefile
	@echo ""
# Do not delete the next line, stupid! Our depend gen hack depends on it.
# 9445baa814592c63c617be9eb40a39cee949719a
# Make may overwrite this line and everything below.
cpy_real_code.o: cpy_real_code.c
I also noticed that the objdump output looks weird, especially for the far jmps ... though I've seen this before and it seems to be a bug in objdump. In case it is not, here is the objdump output.
$ objdump -D system.elf:
system.elf: file format elf32-i386
Disassembly of section .text:
00100000 <_start>:
100000: fa cli
100001: eb 0d jmp 100010 <_start+0x10>
100003: 90 nop
100004: 02 b0 ad 1b 01 00 add 0x11bad(%eax),%dh
10000a: 00 00 add %al,(%eax)
10000c: fd std
10000d: 4f dec %edi
10000e: 52 push %edx
10000f: e4 bc in $0xbc,%al
100011: 00 20 add %ah,(%eax)
100013: 90 nop
100014: 00 6a 00 add %ch,0x0(%edx)
100017: 9d popf
100018: fc cld
100019: e8 32 00 00 00 call 100050 <cpy_real_code>
10001e: eb 03 jmp 100023 <_start+0x23>
100020: f4 hlt
100021: eb fd jmp 100020 <_start+0x20>
100023: fa cli
100024: 0f 01 15 74 10 10 00 lgdtl 0x101074
10002b: 66 b8 20 00 mov $0x20,%ax
10002f: 8e d0 mov %eax,%ss
100031: 8e d8 mov %eax,%ds
100033: 8e c0 mov %eax,%es
100035: 8e e0 mov %eax,%fs
100037: 8e e8 mov %eax,%gs
100039: ea 40 00 10 00 18 00 ljmp $0x18,$0x100040
100040: b8 00 05 00 00 mov $0x500,%eax
100045: ff e0 jmp *%eax
...
00100050 <cpy_real_code>:
100050: b8 7a 10 10 00 mov $0x10107a,%eax
100055: 55 push %ebp
100056: 2d 00 10 10 00 sub $0x101000,%eax
10005b: 89 e5 mov %esp,%ebp
10005d: 74 1b je 10007a <cpy_real_code+0x2a>
10005f: 8d 90 00 10 10 00 lea 0x101000(%eax),%edx
100065: 8d 76 00 lea 0x0(%esi),%esi
100068: 0f b6 4a ff movzbl -0x1(%edx),%ecx
10006c: 83 ea 01 sub $0x1,%edx
10006f: 88 88 ff 04 00 00 mov %cl,0x4ff(%eax)
100075: 83 e8 01 sub $0x1,%eax
100078: 75 ee jne 100068 <cpy_real_code+0x18>
10007a: 5d pop %ebp
10007b: c3 ret
...
Disassembly of section .real:
00101000 <REAL_START>:
101000: fa cli
101001: ea 08 05 00 00 08 00 ljmp $0x8,$0x508
101008: b8 10 00 8e d0 mov $0xd08e0010,%eax
10100d: 8e d8 mov %eax,%ds
10100f: 8e c0 mov %eax,%es
101011: 8e e0 mov %eax,%fs
101013: 8e e8 mov %eax,%gs
101015: 0f 20 c0 mov %cr0,%eax
101018: 66 83 e0 fe and $0xfffe,%ax
10101c: 0f 22 c0 mov %eax,%cr0
10101f: ea 24 05 00 00 b8 00 ljmp $0xb8,$0x524
101026: 00 8e d8 8e d0 8e add %cl,-0x712f7128(%esi)
10102c: e0 8e loopne 100fbc <cpy_real_code+0xf6c>
10102e: e8 8e c0 bc 00 call ccd0c1 <stk+0xbcb0c1>
101033: 10 b7 00 b3 07 b0 adc %dh,-0x4ff84d00(%edi)
101039: 41 inc %ecx
10103a: b4 0e mov $0xe,%ah
10103c: fb sti
10103d: f4 hlt
10103e: cd 10 int $0x10
101040: f4 hlt
101041: 00 00 add %al,(%eax)
101043: 00 ff add %bh,%bh
101045: 03 00 add (%eax),%eax
...
101053: 00 ff add %bh,%bh
101055: ff 00 incl (%eax)
101057: 00 00 add %al,(%eax)
101059: 9a 00 00 ff ff 00 00 lcall $0x0,$0xffff0000
101060: 00 92 00 00 ff ff add %dl,-0x10000(%edx)
101066: 00 00 add %al,(%eax)
101068: 00 9a cf 00 ff ff add %bl,-0xff31(%edx)
10106e: 00 00 add %al,(%eax)
101070: 00 92 cf 00 27 00 add %dl,0x2700cf(%edx)
101076: 4c dec %esp
101077: 05 00 00 00 00 add $0x0,%eax
0010107a <REAL_END>:
...
Disassembly of section .bss:
00102000 <stk>:
...
Disassembly of section .comment:
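One possible reason the far jmps look odd in the listing: `objdump -D` on an elf32-i386 file decodes everything as 32-bit code, while the part of `.real` after `.code16` was assembled as 16-bit code. The same `ea` opcode then takes a 16-bit or a 32-bit offset depending on the operand size. The sketch below decodes the bytes both ways; `decode_far_jmp` and `struct far_jmp` are hypothetical names, and the decoder only handles the direct far jump without prefixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct far_jmp {
    uint16_t seg;   /* target segment / selector */
    uint32_t off;   /* target offset */
    size_t   len;   /* bytes consumed, including the 0xEA opcode */
};

/* Decode a direct far jump (opcode 0xEA, no prefixes).
 * operand_bits is 16 or 32. Returns 0 on success, -1 otherwise. */
int decode_far_jmp(const uint8_t *p, int operand_bits, struct far_jmp *out)
{
    assert(p != NULL && out != NULL);

    if (p[0] != 0xEA || (operand_bits != 16 && operand_bits != 32))
        return -1;

    if (operand_bits == 16) {
        /* ea <off16> <seg16> */
        out->off = (uint32_t)p[1] | ((uint32_t)p[2] << 8);
        out->seg = (uint16_t)(p[3] | (p[4] << 8));
        out->len = 5;
    } else {
        /* ea <off32> <seg16> */
        out->off = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
                   ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
        out->seg = (uint16_t)(p[5] | (p[6] << 8));
        out->len = 7;
    }
    return 0;
}
```

Fed the bytes `ea 24 05 00 00 b8 00` from address 10101f, the 32-bit decoding gives segment 0xb8 and offset 0x524 (what the listing shows), while the 16-bit decoding gives segment 0 and offset 0x524 and leaves `b8 00` as the start of the next instruction. If that is what is going on, objdump's `-M i8086` disassembler option should give a more readable listing for the 16-bit part.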
P.S. I very recently converted from NASM (Intel-Syntax) to GAS (AT&T-Syntax) ... in case anyone is wondering about the weird/bad code constructs (esp. jmps and data declarations). Any hints and tips are appreciated.
EDIT:
I hereby release everything above under the following license:
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.