
strange problem

Posted: Fri Dec 19, 2003 5:17 am
by shahzad
I'm facing two problems.

1)
The size of my kernel was 21.8k, but when I added just one function it increased from 21.8k to 26.1k.
The command I'm using to link the asm and C files is

Code: Select all

ld -Ttext=0x1000 --oformat binary -o kernel.bin kernel_asm.o kernel_c.o
Please tell me why the size increased so much by adding just one function.
(I think it has something to do with the linker script)

2)

After this I added an array of 360 bytes, then read something from the floppy, and everything went fine. But when I call the function that reads from the floppy from some other function, the floppy drive motor turns on and then everything hangs.
No interrupt is received after that.
When I removed that array of 360 bytes, everything worked OK again.
Please tell me what might be causing the problem.

Is my code landing in the wrong memory location? I'm loading my kernel at address 0000:1000 and the size of the kernel is between 26k and 27k.

Is it something to do with the compiler, gcc? I'm including all header and implementation files in one main file and then compiling that main file so that everything gets compiled.


Is it something to do with the linker? The command I'm using to link is given in problem 1).

You may say that there might be a bug in the code, but then why does the code work fine when I remove the array or reduce its size? Why does the code work OK in one function even when I keep the array of 360 bytes, and not work when called from some other function?

Re:strange problem

Posted: Fri Dec 19, 2003 5:51 am
by Tim
shahzad wrote: Please tell me why the size increased so much by adding just one function.
(I think it has something to do with the linker script)
Alignment. This is normal and you shouldn't worry about it. When you added your new function, your code binary spilled over into another page, consisting of a bit of code followed by lots of zeroes. You should find that if you add another function, the file will stay the same size, and the new code will fill up these zeroes.

"But why do I need this?!", you ask. "Isn't it a big waste of space?". Well, yes, it wastes space slightly. But alignment is good for two reasons:
  • It lets you apply separate protection to each code or data page. If the whole image was lumped together, then you wouldn't be able to apply page-based protection where it mattered. You'd get data going into code pages, or vice versa. So aligning whole sections to PAGE_SIZE (i.e. 4096 bytes) is good.
  • It allows for more efficient use of the processor's cache. The cache stores code or data in chunks of a certain size. If you can get one function, or block of data, into one chunk it's going to be more efficient. If two functions share a chunk, as they would if they weren't aligned, performance would go down. So alignment by the CPU's cache line size is also good.
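
The rounding behind this kind of padding can be sketched in a few lines of C. This is only an illustration of the arithmetic, not what ld does internally; the helper name `align_up` is made up for this example.

```c
#include <stddef.h>

/* Round `size` up to the next multiple of `align`.
   Assumption: `align` is a power of two (e.g. 4096 for an x86 page,
   or the cache line size of the CPU). */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

With a 4096-byte alignment, a section of 4097 bytes takes up 8192 bytes in the image, and it stays at 8192 until the section grows past that boundary.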
...
You may say that there might be a bug in the code, but then why does the code work fine when I remove the array or reduce its size? Why does the code work OK in one function even when I keep the array of 360 bytes, and not work when called from some other function?
Wellllll... it sounds like either a coding error or a loading error. If this was regular programming I'd say, "you've overrun an array somewhere and you're writing over memory you shouldn't".

So things to check are:
  • Your loader is loading enough data
  • Your loader is loading to the right place
  • All the hard-coded addresses are correct
  • Your floppy driver is reading data to the right place
  • Your floppy driver isn't reading too much data
  • Your DMA physical address is right
Things you could try are:
  • Moving this 360-byte array somewhere else in memory (put it in a different source file or something)
  • Using a different physical address for your DMA buffer
  • Using a different virtual address for the memory the floppy driver writes to
  • Applying protection to the code and data in the kernel. I caught quite a few bugs by making the code pages read-only, and by putting no-access pages at the beginning and end of vital structures (e.g. kernel stacks).
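
A couple of the checks above (not reading too much data, and a sane DMA physical address) can be written down as a small sanity function. This is a rough sketch assuming the classic ISA DMA controller used by the PC floppy drive, which can only reach the first 16 MB and cannot cross a 64 KB boundary within one transfer. The name `dma_buffer_ok` and its parameters are invented for the example.

```c
#include <stdint.h>
#include <stddef.h>

#define SECTOR_SIZE   512u        /* standard floppy sector size */
#define ISA_DMA_LIMIT 0x1000000u  /* ISA DMA only addresses the first 16 MB */

/* Returns 1 if reading `sectors` sectors into a buffer of `buf_len` bytes
   at physical address `phys` looks safe, 0 otherwise.
   Assumption: `sectors` is small enough that sectors * 512 fits in 32 bits. */
static int dma_buffer_ok(uint32_t phys, size_t buf_len, unsigned sectors)
{
    uint32_t bytes = sectors * SECTOR_SIZE;

    if (bytes == 0 || bytes > buf_len)
        return 0;   /* the driver would write past the end of the buffer */
    if (phys >= ISA_DMA_LIMIT || bytes > ISA_DMA_LIMIT - phys)
        return 0;   /* buffer is not entirely below 16 MB */
    if ((phys >> 16) != ((phys + bytes - 1) >> 16))
        return 0;   /* first and last byte are in different 64 KB pages */
    return 1;
}
```

For instance, `dma_buffer_ok(0x8000, 360, 1)` fails because a single 512-byte sector does not fit into 360 bytes.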

Re:strange problem

Posted: Fri Dec 19, 2003 7:50 am
by Pype.Clicker
When a strange problem involving an array occurs, the following question pops up in my mind:
"where is the array, after all?"

Knowing where it is helps you understand what could happen if you're overflowing its bounds... Especially for local variable arrays (which are kept on the stack), very weird things may occur -- including jump-to-wastelands.

Just out of curiosity, what are those 360 bytes?? You aren't trying to fit a 512-byte sector through DMA into a 360-byte array, are you? That would be bound to fail.

Re:strange problem

Posted: Fri Dec 19, 2003 9:37 am
by richie
I also had a strange array problem, and perhaps something similar is happening to you. I defined an array in C but never used it in the C code (only in asm). So gcc (with default optimization) removed it because gcc thought it wasn't needed. You can disable this optimization to overcome the problem. Or easier: initialize the array with zeros in a loop.
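
Another way to keep such an array, assuming GCC or a GCC-compatible compiler, is the `used` variable attribute, which asks the compiler to emit the variable even if nothing in the C code refers to it. A minimal sketch (the array name is just a placeholder):

```c
/* Assumption: GCC or a GCC-compatible compiler.
   `used` tells the compiler to emit this static array even though
   no C code references it (for example when only asm touches it). */
static char asm_only_buffer[360] __attribute__((used));
```

This only applies to arrays with static storage; a local array on the stack is a different matter.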

Re:strange problem

Posted: Fri Dec 19, 2003 10:39 am
by Candy
Just to rule out the very obvious stuff: you might be overwriting some part of your boot sector code. 27K + 4k alignment comes awfully close to 7C00 (= 31K), where your boot sector resides.

Re:strange problem

Posted: Sat Dec 20, 2003 6:18 am
by neuRopac
i have a problem with loading kernel.........

ld -Ttext=0x100000 -o kernel.elf kernel_asm.o kernel.c

when i used this command in my prompt

i got a warning

cannot find entry _start:defaulting to 00100000

i was advised to ignore this warning.....

and i tried to load my kernel using GRUB ......

i think the kernel was loaded..(because earlier i got two errors repeatedly like

* loading kernel below 1mb not supported

*selected item cannot fit into memory)

this time i got no error....

but when i type 'boot' command in GRUB prompt system gets rebooted.....

what can i do further please help me out

Re:strange problem

Posted: Sat Dec 20, 2003 7:06 am
by Whatever5k
You can get rid of the "cannot find entry ..." warning by using a linker script which tells ld where the entry is positioned. And yes, GRUB seems to load and run your kernel, but your kernel seems to mess up some things which ends in a GPF (general protection fault), causing the system to reboot.

Re:strange problem

Posted: Sat Dec 20, 2003 3:57 pm
by shahzad
After studying your responses, I disabled all functions that read or write to the floppy to avoid any chance of code being overwritten. I initialized all arrays to ensure that gcc doesn't omit any uninitialized arrays through its default optimization.
Then I ran the code, and everything went OK. So I thought that the floppy driver might have written something over my code.
But when I increased the size of the array from 360 bytes to 882 bytes, the system started to restart again.
When the array size was 360 bytes, the size of the kernel was 27336 bytes, and it remained the same even after increasing the array size to 882 bytes.

Should the kernel size remain the same even after an increase in array size?

Now I want to compile the code with DJGPP.
Can you please give me the commands to compile and link code with DJGPP on Windows, because up till now I've been using the gcc that ships with Red Hat Linux 8.0.

Do you think that this bug might have something to do with the compiler?

Re:strange problem

Posted: Sat Dec 20, 2003 7:30 pm
by Tim
I think Candy has the most likely solution: your loader is loading your kernel to 0000_1000 linear (= 0000:1000). Once you get to 27KB, the kernel reaches the boot sector and the loader will start overwriting itself (0000_1000 + 27KB = 0000_7C00).
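
The arithmetic is easy to check in C. The sketch below assumes the usual PC layout where the BIOS puts the 512-byte boot sector at linear 0x7C00; the function names are made up for the example.

```c
#include <stdint.h>

#define BOOT_SECTOR_START 0x7C00u  /* where the BIOS loads the boot sector */
#define BOOT_SECTOR_END   0x7E00u  /* 512 bytes later (exclusive) */

/* Linear address one past the last byte of a loaded image. */
static uint32_t image_end(uint32_t load_addr, uint32_t size)
{
    return load_addr + size;
}

/* 1 if the range [load_addr, load_addr + size) overlaps the boot sector. */
static int overlaps_boot_sector(uint32_t load_addr, uint32_t size)
{
    return load_addr < BOOT_SECTOR_END &&
           image_end(load_addr, size) > BOOT_SECTOR_START;
}
```

With the 27336-byte file mentioned earlier in the thread, `image_end(0x1000, 27336)` is 0x7AC8, only 312 bytes short of 0x7C00. So anything that lives beyond the end of the file (BSS, a stack placed just under the boot sector, or a loader that reads a few sectors too many) could plausibly reach it.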

If you're on Windows I'd recommend using Cygwin instead of DJGPP. It works a lot better than DJGPP does under Windows. Of course, if you're using real DOS, you can't use Cygwin, so this is when you should use DJGPP. Anyway, Cygwin, DJGPP and Redhat all use gcc, so the build process is mostly the same.

Re:strange problem

Posted: Sun Dec 21, 2003 9:59 am
by shahzad
Today I noticed that before the computer stops responding, the keyboard LEDs flash just for a moment and then the PC freezes.
Then I disabled the keyboard IRQ line.
Now the keyboard LEDs flashed for a moment and after that the computer restarted.

Actually, my OS has a GUI. The buggy code doesn't run immediately; rather, it runs after a mouse event on a dialog box.

So if I had overwritten my bootloader, the PC should have restarted immediately instead of after a mouse event.

Please tell me why the keyboard LEDs are flashing?

Re:strange problem

Posted: Tue Dec 23, 2003 4:05 am
by Pype.Clicker
As soon as you have things like code overwriting, nothing else can be debugged, as you cannot know what is actually being executed...

So the first steps I would take if I were you are:
- change your loader so that it loads the kernel to a 'safe' place (for instance 0x1000:0000 rather than 0000:0x1000)
- write a small "checksum" function that sums up the values of all the dwords of the loaded kernel and compares the result with the "expected" value before jumping to the kernel (trick: if you leave the last dword of your kernel blank and write (0 - sum) there at build time, the "expected" checksum will always be zero ;)
- check how things are assembled with a tool like objdump.
- make sure your loader or your kstart.asm stub wipes out the BSS section as it should (if you have a BSS section ;)
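
The checksum trick from the list above could look roughly like this in C. It is a sketch under a few assumptions: the image length is a whole number of dwords, the last dword is reserved for the checksum, and the names `image_checksum` and `image_seal` are invented here.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum all 32-bit words of the image. Unsigned overflow wraps around,
   which is exactly what this kind of checksum relies on. */
static uint32_t image_checksum(const uint32_t *image, size_t dwords)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < dwords; i++)
        sum += image[i];
    return sum;
}

/* Build-time step: blank the reserved last dword, then store (0 - sum)
   there so that the checksum of the whole image becomes zero.
   Assumption: dwords >= 1. */
static void image_seal(uint32_t *image, size_t dwords)
{
    image[dwords - 1] = 0;
    image[dwords - 1] = 0u - image_checksum(image, dwords);
}
```

The loader then only has to call `image_checksum` on what it loaded and refuse to jump if the result is not zero.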

It's very usual for the compiler and linker to align sections on page boundaries, so it's not surprising that changing an array size results in no change in the code size.

moreover, if you have something like

Code: Select all

   void function(void) {
      char array[2000];   /* local array: lives on the stack, not in the kernel image */
   }
array is actually allocated at run-time on the stack and thus changing its size only changes a constant value in

Code: Select all

   sub esp,2000