C-code: weird compilation

williderwurm
Posts: 13
Joined: Wed Mar 18, 2015 5:28 am
Location: Hyperspace

Post by williderwurm »

Good Day,

I am trying to compile C code to run in my x86 protected-mode environment.
Because I am on 64-bit Ubuntu 14, I pass -m32 to compile for a 32-bit target.

The C code:

int my_function()
{
        return 0xbaba;
}

Commands I use:

gcc -ffreestanding -c kernel.c -o kernel.o -m32

ld kernel.o --oformat binary -Ttext 0x0 -o kernel.bin -melf_i386

The output of 'objdump -d kernel.o' is:

kernel.o:     file format elf32-i386


Disassembly of section .text:

00000000 <my_function>:
   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	b8 ba ba 00 00       	mov    $0xbaba,%eax
   8:	5d                   	pop    %ebp
   9:	c3                   	ret

But the output of 'ndisasm kernel.bin' after having used ld is:

00000000  55                push bp
00000001  89E5              mov bp,sp
00000003  B8BABA            mov ax,0xbaba
00000006  0000              add [bx+si],al
00000008  5D                pop bp
00000009  C3                ret
0000000A  0000              add [bx+si],al
0000000C  1400              adc al,0x0
0000000E  0000              add [bx+si],al
00000010  0000              add [bx+si],al
00000012  0000              add [bx+si],al
00000014  017A52            add [bp+si+0x52],di
00000017  0001              add [bx+di],al
00000019  7C08              jl 0x23
0000001B  011B              add [bp+di],bx
0000001D  0C04              or al,0x4
0000001F  0488              add al,0x88
00000021  0100              add [bx+si],ax
00000023  001C              add [si],bl
00000025  0000              add [bx+si],al
00000027  001C              add [si],bl
00000029  0000              add [bx+si],al
0000002B  00D4              add ah,dl
0000002D  FF                db 0xff
0000002E  FF                db 0xff
0000002F  FF0A              dec word [bp+si]
00000031  0000              add [bx+si],al
00000033  0000              add [bx+si],al
00000035  41                inc cx
00000036  0E                push cs
00000037  08850242          or [di+0x4202],al
0000003B  0D0546            or ax,0x4605
0000003E  C50C              lds cx,[si]
00000040  0404              add al,0x4
00000042  0000              add [bx+si],al

As you can see, the binary file contains a lot of instructions that should not be there. If it had been built the way I wanted, it would consist of only the first 6 lines.
So I guess the problem is that I am using ld wrong. What do I have to do to get the correct output?
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
williderwurm wrote: But the output of 'ndisasm kernel.bin' after having used ld is:
You have to tell NDISASM whether the code is 16-bit, 32-bit or 64-bit. If you don't, it assumes the code is 16-bit (even if it's not) and you get what you got.

Note 1: The same applies to the CPU. If your code is 32-bit and nothing tells the CPU to switch to 32-bit, it will execute your 32-bit code as 16-bit and things will get all funky and broken.

Note 2: Once you convert object files to the "flat binary" format, it's impossible to tell the difference between code and data. The "lots of instructions that shouldn't be there" are data, not code.
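
As a rough illustration of the bitness point, the sketch below writes the ten bytes of my_function (copied from the objdump listing) to a scratch file and disassembles them both ways with NDISASM's -b option. The file name kernel-text.bin is made up for this example so that the real kernel.bin is not overwritten.

```shell
# Write the ten bytes of my_function, taken from the objdump listing, to a
# scratch file. Octal escapes are used because every POSIX printf accepts them.
printf '\125\211\345\270\272\272\000\000\135\303' > kernel-text.bin

# 16-bit decoding, which is what ndisasm does when no -b option is given:
# opcode B8 takes a 2-byte immediate, so the trailing 00 00 is decoded as a
# separate instruction.
ndisasm -b 16 kernel-text.bin

# 32-bit decoding: opcode B8 takes a 4-byte immediate, as in the objdump output.
ndisasm -b 32 kernel-text.bin
```

With -b 32 the listing should show the same five instructions that objdump printed for kernel.o.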


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by williderwurm »

How do I remove this data, then? If you look at my C code, there is no data, just code.

Post by Brendan »

Hi,
williderwurm wrote: How do I remove this data, then? If you look at my C code, there is no data, just code.
I have no idea what the data is or where it came from (maybe just random trash that happened to be in memory when the linker was adding padding?).

You'd have to do some investigation - tweak your linker script, try putting some data in the source code and seeing where it ends up (before or after the garbage), etc.


Cheers,

Brendan

Post by williderwurm »

OK, I think I have found the cause.
Running objdump with the -D argument, which disassembles all sections and not only .text, displays:

kernel.o:     file format elf32-i386


Disassembly of section .text:

00000000 <my_function>:
   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	b8 ba ba 00 00       	mov    $0xbaba,%eax
   8:	5d                   	pop    %ebp
   9:	c3                   	ret    

Disassembly of section .comment:

00000000 <.comment>:
   0:	00 47 43             	add    %al,0x43(%edi)
   3:	43                   	inc    %ebx
   4:	3a 20                	cmp    (%eax),%ah
   6:	28 55 62             	sub    %dl,0x62(%ebp)
   9:	75 6e                	jne    79 <my_function+0x79>
   b:	74 75                	je     82 <my_function+0x82>
   d:	20 34 2e             	and    %dh,(%esi,%ebp,1)
  10:	38 2e                	cmp    %ch,(%esi)
  12:	32 2d 31 39 75 62    	xor    0x62753931,%ch
  18:	75 6e                	jne    88 <my_function+0x88>
  1a:	74 75                	je     91 <my_function+0x91>
  1c:	31 29                	xor    %ebp,(%ecx)
  1e:	20 34 2e             	and    %dh,(%esi,%ebp,1)
  21:	38 2e                	cmp    %ch,(%esi)
  23:	32 00                	xor    (%eax),%al

Disassembly of section .eh_frame:

00000000 <.eh_frame>:
   0:	14 00                	adc    $0x0,%al
   2:	00 00                	add    %al,(%eax)
   4:	00 00                	add    %al,(%eax)
   6:	00 00                	add    %al,(%eax)
   8:	01 7a 52             	add    %edi,0x52(%edx)
   b:	00 01                	add    %al,(%ecx)
   d:	7c 08                	jl     17 <.eh_frame+0x17>
   f:	01 1b                	add    %ebx,(%ebx)
  11:	0c 04                	or     $0x4,%al
  13:	04 88                	add    $0x88,%al
  15:	01 00                	add    %eax,(%eax)
  17:	00 1c 00             	add    %bl,(%eax,%eax,1)
  1a:	00 00                	add    %al,(%eax)
  1c:	1c 00                	sbb    $0x0,%al
  1e:	00 00                	add    %al,(%eax)
  20:	00 00                	add    %al,(%eax)
  22:	00 00                	add    %al,(%eax)
  24:	0a 00                	or     (%eax),%al
  26:	00 00                	add    %al,(%eax)
  28:	00 41 0e             	add    %al,0xe(%ecx)
  2b:	08 85 02 42 0d 05    	or     %al,0x50d4202(%ebp)
  31:	46                   	inc    %esi
  32:	c5 0c 04             	lds    (%esp,%eax,1),%ecx
  35:	04 00                	add    $0x0,%al
	...

Does anybody know how to keep these extra sections out of the output?
I only want the .text section to end up in the binary.

Post by williderwurm »

I solved it:

GCC adds some sections to the ELF .o file that are useless junk, at least for bare metal. To remove this "junk" I use "objcopy --remove-section=.<the_section> kernel.o", replacing <the_section> with the name of the section to remove.
You can find these sections by running "objdump -D kernel.o", which shows every section GCC has put into the file.
Thanks to Brendan for trying to help me out. :)

[SOLVED]
Techel
Member
Posts: 215
Joined: Fri Jan 30, 2015 4:57 pm
Location: Germany

Post by Techel »

Or you could just discard them in your linker script.