
Re: Partially written string literals

Posted: Thu Nov 12, 2015 4:05 am
by Brendan
Hi,
intx13 wrote:Here's the simplest I could reduce the problem to: http://pastebin.com/eS5bFdiv

Adjust the number of nops to effect the problem. No nops and it works fine. 4 nops and it prints out two exclamation points and hangs.
15 nops and it doesn't print anything. There's another number that will make it print repeatedly for maybe 10 seconds, then start printing a different character and then hang, but I forget how many.

This code doesn't use your print function, doesn't do anything but call int 0x13 repeatedly, so it would seem that BIOS is touching memory it shouldn't.
How do you install this, and where? How is it started, and by what? How is it assembled and linked?

Can you read it direct from disk (after trying to boot it) and disassemble it to verify that it wasn't corrupted on disk before it was booted (and that the tools used to compile it didn't do something "unexpected")?

Also; would you mind disassembling it with something like NDISASM so that I don't have to wonder if "jmp $ORG_SEGMENT, $(asmain)" is "jmp [0x0000:0x7C0A]" and not "jmp 0x0000:0x7C0A"? Note: As far as I can tell this is supposed to be "jmp $ORG_SEGMENT, $asmain" in AT&T syntax.


Cheers,

Brendan

Re: Partially written string literals

Posted: Thu Nov 12, 2015 4:48 am
by Stamerlan
Hi, Brendan,
How do you install this, and where? How is it started, and by what? How is it assembled and linked?

Can you read it direct from disk (after trying to boot it) and disassemble it to verify that it wasn't corrupted on disk before it was booted (and that the tools used to compile it didn't do something "unexpected")?

Also; would you mind disassembling it with something like NDISASM so that I don't have to wonder if "jmp $ORG_SEGMENT, $(asmain)" is "jmp [0x0000:0x7C0A]" and not "jmp 0x0000:0x7C0A"? Note: As far as I can tell this is supposed to be "jmp $ORG_SEGMENT, $asmain" in AT&T syntax.
eisdt already answered the questions about compiling and linking, and I have already disassembled it. eisdt also tried booting from that flash drive in QEMU and everything was fine. Here's what I got:

Makefile:

Code:

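# Assemble and link a flat binary whose code is linked to run at 0x7C00.
.PHONY:	all
all:
	as test.S -o test.o
	ld test.o --oformat binary -Ttext 0x7C00 -o main.img

.PHONY: clean
clean:
	rm -f test.o main.img

# Writes the raw image to /dev/sdb, overwriting whatever is at the start of that device.
.PHONY:	flash
flash:
	sudo dd if=main.img of=/dev/sdb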
Disassembly:

Code:

00007C00  FA                cli
00007C01  EA067C0000        jmp word 0x0:0x7c06
00007C06  31C0              xor ax,ax
00007C08  8ED8              mov ds,ax
00007C0A  8EC0              mov es,ax
00007C0C  8ED0              mov ss,ax
00007C0E  BC008E            mov sp,0x8e00
00007C11  B8ADDE            mov ax,0xdead
00007C14  E82100            call word 0x7c38
00007C17  E81800            call word 0x7c32
00007C1A  B8687C            mov ax,0x7c68
00007C1D  E81800            call word 0x7c38
00007C20  B02E              mov al,0x2e
00007C22  E80400            call word 0x7c29
00007C25  FA                cli
00007C26  F4                hlt
00007C27  EBFC              jmp short 0x7c25
00007C29  B40E              mov ah,0xe
00007C2B  B703              mov bh,0x3
00007C2D  60                pushaw
00007C2E  CD10              int 0x10
00007C30  61                popaw
00007C31  C3                ret
00007C32  B05F              mov al,0x5f
00007C34  E8F2FF            call word 0x7c29
00007C37  C3                ret
00007C38  85C0              test ax,ax
00007C3A  7506              jnz 0x7c42
00007C3C  B030              mov al,0x30
00007C3E  E8E8FF            call word 0x7c29
00007C41  C3                ret
00007C42  BB0A00            mov bx,0xa
00007C45  31C9              xor cx,cx
00007C47  31D2              xor dx,dx
00007C49  F7F3              div bx
00007C4B  83C101            add cx,byte +0x1
00007C4E  83C230            add dx,byte +0x30
00007C51  52                push dx
00007C52  85C0              test ax,ax
00007C54  7404              jz 0x7c5a
00007C56  31D2              xor dx,dx
00007C58  EBEF              jmp short 0x7c49
00007C5A  5A                pop dx
00007C5B  88D0              mov al,dl
00007C5D  E8C9FF            call word 0x7c29
00007C60  83E901            sub cx,byte +0x1
00007C63  85C9              test cx,cx
00007C65  75F3              jnz 0x7c5a
00007C67  C3                ret
00007C68  59                pop cx
00007C69  6F                outsw
00007C6A  7520              jnz 0x7c8c
00007C6C  7368              jnc 0x7cd6
00007C6E  61                popaw
00007C6F  6C                insb
00007C70  6C                insb
00007C71  206E6F            and [bp+0x6f],ch
00007C74  7420              jz 0x7c96
00007C76  7365              jnc 0x7cdd
00007C78  65207468          and [gs:si+0x68],dh
00007C7C  6973206D65        imul si,[bp+di+0x20],word 0x656d
00007C81  7373              jnc 0x7cf6
00007C83  61                popaw
00007C84  6765206F6E        and [gs:edi+0x6e],ch
00007C89  207265            and [bp+si+0x65],dh
00007C8C  61                popaw
00007C8D  6C                insb
00007C8E  206861            and [bx+si+0x61],ch
00007C91  7264              jc 0x7cf7
00007C93  7761              ja 0x7cf6
00007C95  7265              jc 0x7cfc
00007C97  2E0000            add [cs:bx+si],al
00007C9A  0000              add [bx+si],al
00007C9C  0000              add [bx+si],al
00007C9E  0000              add [bx+si],al
00007CA0  0000              add [bx+si],al
00007CA2  0000              add [bx+si],al
00007CA4  0000              add [bx+si],al
00007CA6  0000              add [bx+si],al
00007CA8  0000              add [bx+si],al
...
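
Note that everything from 0x7C68 onward in the listing above is data that the disassembler decoded as if it were instructions. A minimal Python sketch (not part of the original build; `DATA_HEX` and `decode_c_string` are illustrative names) that turns those bytes back into text:

```python
# Bytes copied from the listing above, 0x7C68 up to the first 0x00 at 0x7C98.
DATA_HEX = (
    "59 6F 75 20 73 68 61 6C 6C 20 6E 6F 74 20 73 65 65 20 "
    "74 68 69 73 20 6D 65 73 73 61 67 65 20 6F 6E 20 "
    "72 65 61 6C 20 68 61 72 64 77 61 72 65 2E 00"
)


def decode_c_string(hex_bytes: str) -> str:
    """Decode space-separated hex bytes as a NUL-terminated ASCII string."""
    raw = bytes.fromhex(hex_bytes)
    end = raw.find(b"\x00")  # stop at the terminator, if there is one
    if end != -1:
        raw = raw[:end]
    return raw.decode("ascii")


message = decode_c_string(DATA_HEX)
print(message)
```

So the "pop cx / outsw / jnz ..." lines are just the string literal, followed by zero padding.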
Have a nice day!

Re: Partially written string literals

Posted: Thu Nov 12, 2015 4:57 am
by Brendan
Hi,
Stamerlan wrote:eisdt already answered the questions about compiling and linking, and I have already disassembled it. eisdt also tried booting from that flash drive in QEMU and everything was fine. Here's what I got:
You're testing on USB flash (on real hardware)? Have you verified that the OS and/or firmware doesn't modify/"correct" the BIOS Parameter Block that you don't have?


Cheers,

Brendan

Re: Partially written string literals

Posted: Thu Nov 12, 2015 6:56 am
by eisdt
Hello guys,
intx13 wrote:Here's the simplest I could reduce the problem to: http://pastebin.com/eS5bFdiv

Adjust the number of nops to effect the problem. No nops and it works fine. 4 nops and it prints out two exclamation points and hangs.
15 nops and it doesn't print anything. There's another number that will make it print repeatedly for maybe 10 seconds, then start printing a different character and then hang, but I forget how many.

This code doesn't use your print function, doesn't do anything but call int 0x13 repeatedly, so it would seem that BIOS is touching memory it shouldn't.
How do nops affect the code, I wonder? Moreover, how is it possible that such behavior exists at all and, on top of that, that it's completely undocumented? It's insane!

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:13 am
by iansjack
What happens if you initialize %ds?

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:24 am
by eisdt
iansjack wrote:What happens if you initialize %ds?
Can you expand on that? %DS is initialized to 0.

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:31 am
by intx13
eisdt wrote:
iansjack wrote:What happens if you initialize %ds?
Can you expand on that? %DS is initialized to 0.
Not in my code. But it doesn't make a difference, except to adjust where the rest of the code is located and thus exacerbate the problem.

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:33 am
by intx13
Brendan wrote:Hi,
Stamerlan wrote:eisdt already answered the questions about compiling and linking, and I have already disassembled it. eisdt also tried booting from that flash drive in QEMU and everything was fine. Here's what I got:
You're testing on USB flash (on real hardware)? Have you verified that the OS and/or firmware doesn't modify/"correct" the BIOS Parameter Block that you don't have?


Cheers,

Brendan
It's definitely not the OS; I read the image back off the drive. It could be the BIOS, but that seems like a bug IMO, since there are no partitions or file systems. How would the BIOS know which BPB format to "correct"?

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:35 am
by intx13
eisdt wrote: How do nops affect the code, I wonder? Moreover, how is it possible that such behavior exists at all and, on top of that, that it's completely undocumented? It's insane!
The nops just shift the rest of the code forward in memory, one byte per nop. With no nops, the rest of the code is right below the label. With 4 nops, it's 4 bytes further along in memory.

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:49 am
by Brendan
Hi,
intx13 wrote:It's definitely not the OS; I read the image back off the drive. It could be the BIOS, but that seems like a bug IMO, since there are no partitions or file systems. How would the BIOS know which BPB format to "correct"?
There are fields that are common to all BPB formats.

I just don't think it's very likely that multiple different computers have the same easily reproduced bug that would affect virtually every OS ever made; it's more likely that there's some strange difference that causes problems for your code only (e.g. not starting with a JMP, not having a BPB or not having a partition table).
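
For anyone who wants to check an image against those three points, here is a rough Python sketch. It is only an illustration: `describe_boot_sector` is a made-up helper, the "plausible BPB" test looks at nothing but the bytes-per-sector field at offset 0x0B, and the example sector assumes the image is padded to 512 bytes and ends in 0x55 0xAA, which the thread does not show.

```python
SECTOR_SIZE = 512


def describe_boot_sector(sector: bytes) -> dict:
    """Report a few layout properties of a 512-byte boot sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly 512 bytes")
    # FAT-style boot sectors start with EB xx 90 (jmp short; nop) or E9 xx xx.
    starts_with_jmp = (sector[0] == 0xEB and sector[2] == 0x90) or sector[0] == 0xE9
    # Bytes per sector is the first BPB field, a little-endian word at 0x0B.
    bytes_per_sector = int.from_bytes(sector[0x0B:0x0D], "little")
    # Four 16-byte partition entries start at 0x1BE; type byte 0 means unused.
    used = sum(1 for i in range(4) if sector[0x1BE + 16 * i + 4] != 0)
    return {
        "starts_with_jmp": starts_with_jmp,
        "plausible_bpb": bytes_per_sector in (512, 1024, 2048, 4096),
        "used_partitions": used,
        "has_signature": sector[510:512] == b"\x55\xAA",
    }


# First 23 bytes of the listing posted earlier in the thread (0x7C00..0x7C16).
head = bytes.fromhex("FA EA067C0000 31C0 8ED8 8EC0 8ED0 BC008E B8ADDE E82100")
sector = head + bytes(SECTOR_SIZE - len(head) - 2) + b"\x55\xAA"
report = describe_boot_sector(sector)
print(report)
```

With that example the sector starts with CLI rather than a JMP, the word at 0x0B is code (0x8EC0) rather than a sector size, and the partition table is empty.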

Of course I also wonder if these computers have trouble booting other OSs (e.g. FreeDOS, Ubuntu, etc). If all OSs have similar issues it makes it much easier to claim it's firmware bugs. ;)


Cheers,

Brendan

Re: Partially written string literals

Posted: Thu Nov 12, 2015 7:49 am
by eisdt
intx13 wrote: The nops just shift the rest of the code forward in memory, one byte per nop. With no nops, the rest of the code is right below the label. With 4 nops, it's 4 bytes further along in memory.
Yep, I get that. What I wonder is how this shift affects execution, if the BIOS really is stomping on that region.

You said
Adjust the number of nops to effect the problem. No nops and it works fine. 4 nops and it prints out two exclamation points and hangs.
15 nops and it doesn't print anything. There's another number that will make it print repeatedly for maybe 10 seconds, then start printing a different character and then hang, but I forget how many.
I'd rather expect the opposite: if the BIOS is using memory around 0x7C00, the farther you move away from it (with nops), the "safer" you should be, since the instructions don't get altered. Or did I misinterpret it? Could the nops be making the code overflow the 512-byte region (I don't know whether that's fine or not)?

Re: Partially written string literals

Posted: Thu Nov 12, 2015 8:27 am
by intx13
eisdt wrote: I'd rather expect the opposite: if the BIOS is using memory around 0x7C00, the farther you move away from it (with nops), the "safer" you should be, since the instructions don't get altered. Or did I misinterpret it? Could the nops be making the code overflow the 512-byte region (I don't know whether that's fine or not)?
Well, we don't know exactly which addresses are being overwritten. If Brendan is right, it is 0x7C0B through 0x7C16. It's nowhere close to the 512-byte limit, though (actually 446 bytes; the rest is for the partition table). Even your version with the string and print function was only ~150 bytes, IIRC.
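
To make that range concrete, here is a small Python sketch that lists which instructions would be hit if 0x7C0B through 0x7C16 were overwritten. It uses the start of the listing Stamerlan posted (not my pastebin code), and `instructions_hit` is just an illustrative helper.

```python
# (address, length in bytes, mnemonic) for the start of the posted listing.
LISTING = [
    (0x7C00, 1, "cli"),
    (0x7C01, 5, "jmp word 0x0:0x7c06"),
    (0x7C06, 2, "xor ax,ax"),
    (0x7C08, 2, "mov ds,ax"),
    (0x7C0A, 2, "mov es,ax"),
    (0x7C0C, 2, "mov ss,ax"),
    (0x7C0E, 3, "mov sp,0x8e00"),
    (0x7C11, 3, "mov ax,0xdead"),
    (0x7C14, 3, "call word 0x7c38"),
    (0x7C17, 3, "call word 0x7c32"),
]


def instructions_hit(listing, first, last):
    """Return (address, mnemonic) for instructions with a byte in [first, last]."""
    return [
        (addr, mnemonic)
        for addr, length, mnemonic in listing
        if addr <= last and addr + length - 1 >= first
    ]


hit = instructions_hit(LISTING, 0x7C0B, 0x7C16)
for addr, mnemonic in hit:
    print(f"{addr:#06x}  {mnemonic}")
```

In that listing the range covers the tail of the segment setup, the stack pointer load and the first call, so inserting nops changes which instructions land inside it.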
Brendan wrote:Hi,
There are fields that are common to all BPB formats.

I just don't think it's very likely that multiple different computers have the same easily reproduced bug that would affect virtually every OS ever made; it's more likely that there's some strange difference that causes problems for your code only (e.g. not starting with a JMP, not having a BPB or not having a partition table).

Of course I also wonder if these computers have trouble booting other OSs (e.g. FreeDOS, Ubuntu, etc). If all OSs have similar issues it makes it much easier to claim it's firmware bugs. ;)
Sure, it just seems weird that BIOS would fix the BPB in memory that one time, rather than fix it on the disk permanently, if it was going to fix it at all. Plus, correct me if I'm wrong, but I thought on hard drives (as opposed to floppies) the BPB is in the first sector of the partition, not the MBR of the drive?

Eh, ok, well just before posting this reply I looked at the stage1 code for GRUB and sure enough the first thing it does is jump over the BPB space. Then it reuses that space for variables later. If in doubt, trust Brendan :)
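
A sketch of that layout idea in Python, building a 512-byte sector that starts with a short jump over a zeroed BPB-sized gap. The gap size of 0x3E is an assumption (roughly a FAT12/16-style BPB; GRUB's own offsets differ), `build_boot_sector` is a hypothetical helper, and whether zeros in that gap are enough to keep a given BIOS happy is untested.

```python
SECTOR_SIZE = 512
PARTITION_TABLE = 0x1BE  # code must end before the partition table
CODE_START = 0x3E        # assumption: first byte after a FAT12/16-style BPB


def build_boot_sector(code: bytes, code_start: int = CODE_START) -> bytes:
    """Place code after a reserved BPB area, reached by a short jmp at offset 0."""
    rel = code_start - 2  # rel8 is counted from the end of the 2-byte jmp
    if not 1 <= rel <= 127:
        raise ValueError("code_start is out of range for a short jmp")
    if code_start + len(code) > PARTITION_TABLE:
        raise ValueError("code does not fit before the partition table")
    sector = bytearray(SECTOR_SIZE)         # zero-filled, so the gap is all zeros
    sector[0:3] = bytes([0xEB, rel, 0x90])  # jmp short code_start ; nop
    sector[code_start:code_start + len(code)] = code
    sector[510:512] = b"\x55\xAA"           # boot signature
    return bytes(sector)


# cli ; hlt ; jmp short back to the cli (same bytes as the halt loop in the listing)
HALT_LOOP = bytes.fromhex("FA F4 EB FC")
image = build_boot_sector(HALT_LOOP)
print(image[:3].hex(), len(image))
```

Anything the firmware writes into offsets 0x03 through 0x3D then lands in the gap instead of in instructions.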

Re: Partially written string literals

Posted: Thu Nov 12, 2015 8:30 am
by intx13
Whatever decision BIOS makes to "fix" the BPB, it doesn't make it all the time. I have a lot of code that I know works on that particular machine/BIOS that makes no attempt to avoid that space. But that code always has a populated partition table, so maybe BIOS says "oh, no FAT partitions? You must have a BPB in the MBR instead of a partition, let me fix it for you."

Re: Partially written string literals

Posted: Thu Nov 12, 2015 8:35 am
by Stamerlan
Hi,

2eisdt: Maybe the BIOS allocates its heap or stack in that region, which overwrites your code. I don't know, but it's definitely buggy behavior. Try moving your code to 64k before invoking any BIOS calls.

2Brendan
You're testing on USB flash (on real hardware)? Have you verified that the OS and/or firmware doesn't modify/"correct" the BIOS Parameter Block that you don't have?
Yep, after writing the image I read it back with dd and checked the md5 sum: all fine.
I tested it with a USB flash drive, but on my hardware I couldn't reproduce this bug.
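
The same read-back check can be sketched in Python, as an alternative to dd plus md5sum. `readback_matches` is an illustrative name, and the paths in the usage comment are the ones from the Makefile above; reading a raw device usually needs root.

```python
import hashlib


def readback_matches(image_path: str, device_path: str) -> bool:
    """Compare the image with the same number of bytes read back from the device."""
    with open(image_path, "rb") as f:
        image = f.read()
    with open(device_path, "rb") as f:
        readback = f.read(len(image))  # only the part the image covers
    return hashlib.md5(image).digest() == hashlib.md5(readback).digest()


# Usage (assumed paths): readback_matches("main.img", "/dev/sdb")
```

This only shows that the bytes on the drive match the image; it says nothing about what ends up in memory at 0x7C00 after the BIOS has loaded the sector.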

2intx13: Did you find any docs where this BIOS behavior is described? On my machine the BIOS doesn't touch anything in memory and the code executed fine.

Have a nice day!

Re: Partially written string literals

Posted: Thu Nov 12, 2015 8:46 am
by intx13
Stamerlan wrote: 2intx13: Did you find any docs where this BIOS behavior is described? On my machine the BIOS doesn't touch anything in memory and the code executed fine.
If I get a chance later I'll write something that will read the MBR out of memory and write it back to the drive and we can see exactly what's happening.
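
Once there is a dump of the in-memory MBR, comparing it with the original image could look roughly like this Python sketch. `changed_ranges` is a made-up helper, and the example data below is fabricated purely to show the output format; it is not a real dump.

```python
def changed_ranges(original: bytes, dumped: bytes):
    """Return inclusive (first, last) offset pairs where the two buffers differ."""
    if len(original) != len(dumped):
        raise ValueError("buffers must be the same length")
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(original, dumped)):
        if a != b:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i - 1))
            start = None
    if start is not None:  # the last run reaches the end of the buffer
        ranges.append((start, len(original) - 1))
    return ranges


# Made-up example: pretend offsets 0x0B..0x16 were modified after loading.
original = bytes(512)
dumped = bytearray(original)
dumped[0x0B:0x17] = b"\xFF" * 12
result = changed_ranges(original, bytes(dumped))
for first, last in result:
    print(f"{0x7C00 + first:#06x}-{0x7C00 + last:#06x}")
```

Adding 0x7C00 to each offset gives the memory addresses to compare against the disassembly.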