ph34r 4ss3mbly bUgZ

Pype.Clicker
Member
Posts: 5964
Joined: Wed Oct 18, 2006 2:31 am
Location: In a galaxy, far, far away

ph34r 4ss3mbly bUgZ

Post by Pype.Clicker »

Here's a small story I want to share with you.

I initially started with an assembly version of Clicker... Things were small and simple, so everything seemed fine (only seemed, because about 90% of that code proved buggy when I ported it to C).
Then one day I decided to move things to C, leaving the ASM to perform only very early initialisation and the like...

Last weekend I faced a bunch of transient errors, including page faults in the page fault handler, unplanned restarts of the init() function while a sub-function was running, etc ...

All these bugs had a single cause: I had a structure allocated in the assembly module ... That core/kernel.asm file was setting up a kind of "system info" structure and initialising some of its fields before calling the C code ... Unfortunately, the content and the length of that structure (as seen from the C code) had changed a lot, and those changes weren't reflected in the ASM code ...

Moral of the story: ph34r the assembly ... Make it do as little as you can, and never assume the ASM knows your C structures ... Write accessors and mutators for your C structures if they'll be accessed from assembly ...
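
To make the accessor/mutator advice concrete, here is a minimal sketch. The structure, its field names and the function names are all hypothetical (they are not taken from Clicker); the point is only that the structure lives on the C side and the assembly stub calls functions instead of writing to hard-coded offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "system info" structure; the fields are illustrative only.
 * Its layout is free to change because only C code touches it directly. */
struct sysinfo {
    uint32_t version;
    uint32_t mem_lower_kb;
    uint32_t mem_upper_kb;
    void    *boot_stack_top;
};

/* Allocated in C, so its size always matches the C definition. */
static struct sysinfo kernel_sysinfo;

/* If the assembly really must read one field directly, pin that offset so
 * a layout change breaks the build instead of the running kernel. */
static_assert(offsetof(struct sysinfo, version) == 0,
              "boot stub assumes 'version' is the first field");

/* Mutator: the boot stub would push its arguments and call this. */
void sysinfo_set_memory(uint32_t lower_kb, uint32_t upper_kb)
{
    kernel_sysinfo.mem_lower_kb = lower_kb;
    kernel_sysinfo.mem_upper_kb = upper_kb;
}

/* Accessor: hides the layout from every caller, C or assembly. */
uint32_t sysinfo_total_memory_kb(void)
{
    return kernel_sysinfo.mem_lower_kb + kernel_sysinfo.mem_upper_kb;
}
```

The calls cost a few cycles at boot time, which is negligible next to a weekend spent chasing stale offsets.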
DarylD

Re:ph34r 4ss3mbly bUgZ

Post by DarylD »

I agree...

I am using hardly any assembly now apart from a little Boot.S file that sets up as little as possible.

Although, fear C too!!

After having had a relatively rock solid kernel for a few weeks, it's now decided to start generating strange stack faults and GPFs as my thread count increases to the massive number of 3! But these threads are actually doing something, in my case passing messages back and forth.

Strange thing is, if I set the pre-empt time to 1 second (to slow things down to catch bugs) it works perfectly; if I set it to 1000Hz (which will be the final value) I start getting strange exceptions, for example triple faults, which makes no sense as my IDT etc. is all working perfectly normally. In fact my kernel has been paging in zero-fill pages for a long while now.

All I can think is, there is something seriously flawed in my scheduler, but it's hard to track down when even bochs can't help me debug it.

If I turn on all my debugging messages...it seems to work ok....hmmm....

* scratches head *
Slasher

Re:ph34r 4ss3mbly bUgZ

Post by Slasher »

Yep,
DarylD, the same thing happens to me when I set the timer to 1000Hz: the kernel from time to time just dies or starts misbehaving at random. But at 100Hz, it seems fine.
My scheduler crashes the kernel if I try to record the running time of tasks or try to use quantum counts to decide whether the task should continue to run or be switched out. Have to find out why!!
distantvoices
Member
Posts: 1600
Joined: Wed Oct 18, 2006 11:59 am
Location: Vienna/Austria

Re:ph34r 4ss3mbly bUgZ

Post by distantvoices »

Couldn't it be some overflow that causes the scheduler to crash?

Check the variable types and their widths. I think the ticks consumed, in total or per process, should be counted in a long field or so.
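
A rough sketch of why the counter width matters at a 1000Hz tick rate. The function names are made up for illustration; nothing here comes from anyone's actual scheduler.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_HZ 1000u   /* tick rate discussed in this thread */

/* Seconds of uptime before an unsigned tick counter of the given width
 * wraps back to zero. */
uint64_t seconds_until_wrap(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    uint64_t max = (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
    return max / TIMER_HZ;
}

/* Unsigned subtraction gives the right elapsed count even if the counter
 * wrapped once between 'start' and 'now'. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}
```

A 16-bit counter wraps after about 65 seconds at this rate and a 32-bit one after roughly 49 days, so a per-task counter that is too narrow could plausibly misbehave within minutes.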

stay safe
... the osdever formerly known as beyond infinity ...
BlueillusionOS iso image
DarylD

Re:ph34r 4ss3mbly bUgZ

Post by DarylD »

I think in my case it's more to do with the stack being left in a bad state, so I need to find out why. It's probably a single mistake somewhere, I just need to track it down.

I have quite extensive locking in use, which is currently designed to totally lock up the machine if an attempt is made to take a lock that is already held. A task switch is designed not to occur while locks are held. I think there could be a problem with this too...problems, problems!
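
One way the scheme described above could look, as a single-CPU sketch. All names are hypothetical, assert() stands in for "lock up the machine", and a real kernel would have to update these counters with interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>

struct klock { bool held; };

static int  locks_held;      /* number of locks currently held */
static bool switch_pending;  /* a switch was requested while locked */
static int  switches_done;   /* stands in for the real context switch */

static void do_task_switch(void) { switches_done++; }

void klock_acquire(struct klock *l)
{
    assert(!l->held);        /* taking a held lock is treated as fatal */
    l->held = true;
    locks_held++;
}

void klock_release(struct klock *l)
{
    assert(l->held);
    l->held = false;
    locks_held--;
    /* Run the switch that was postponed while locks were held. */
    if (locks_held == 0 && switch_pending) {
        switch_pending = false;
        do_task_switch();
    }
}

/* Called from the timer tick when the current task should be preempted. */
void request_task_switch(void)
{
    if (locks_held > 0)
        switch_pending = true;   /* defer until the last lock is released */
    else
        do_task_switch();
}
```

If the deferred switch is forgotten in the release path, or the counter goes out of step with the locks, the symptoms would look a lot like random scheduler failures.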

Daryl.
Pype.Clicker

Re:ph34r 4ss3mbly bUgZ

Post by Pype.Clicker »

My clock handler works fine at 1kHz on real hardware, but BOCHS seems unable to keep up with that rate (it raises a new IRQ0 before the previous handler is done, which leaves no time for normal processing ...)
DarylD

Re:ph34r 4ss3mbly bUgZ

Post by DarylD »

Ahh!

I wonder if that is what is causing my problem. I thought it might be to do with my PIC code not masking properly, because I was getting timer interrupts interrupting my timer interrupts once interrupts were re-enabled in the handler (does that make sense?)

Time to investigate more.

Daryl.
Pype.Clicker

Re:ph34r 4ss3mbly bUgZ

Post by Pype.Clicker »

Yep. If you send "out 0x20,0x20" and sti too early in your timer interrupt, then you could have a second timer IRQ re-entering your handler and recursively filling the stack until the stack limit is reached (Stack Fault), a missing page is encountered (Page Fault, very likely to triple fault) or the code is overwritten (likely to triple fault as well ;)
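
The nesting can be shown with a toy user-space model. This is only a simulation under simplifying assumptions (a fixed number of ticks is already queued, and every name is invented); it does not touch real hardware.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUED_TICKS 8   /* ticks that arrive while we are still working */

static int ticks_left;   /* queued ticks not yet delivered */
static int depth;        /* current handler nesting level */
static int max_depth;    /* deepest nesting seen, i.e. stack frames used */

/* One timer interrupt. With early_eoi_sti set, the handler acknowledges the
 * PIC and re-enables interrupts before doing its work, so a tick that is
 * already waiting preempts it and nests on the same stack. */
static void timer_irq(bool early_eoi_sti)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;

    if (early_eoi_sti && ticks_left > 0) {
        ticks_left--;
        timer_irq(early_eoi_sti);   /* nested IRQ0 */
    }
    /* ... the scheduler work would happen here ... */
    depth--;
}

/* Returns the deepest handler nesting reached for the chosen ordering. */
int simulate(bool early_eoi_sti)
{
    ticks_left = QUEUED_TICKS;
    depth = 0;
    max_depth = 0;
    timer_irq(early_eoi_sti);
    /* Ticks still queued are delivered one by one after each return. */
    while (ticks_left > 0) {
        ticks_left--;
        timer_irq(early_eoi_sti);
    }
    return max_depth;
}
```

With the early acknowledge the model nests once per queued tick; with the acknowledge at the end the depth never exceeds one, whatever the tick rate.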
DarylD

Re:ph34r 4ss3mbly bUgZ

Post by DarylD »

So what's the solution, Pype? Is there one? I suppose not, if it's fine on a real PC but being buggered up by bochs.

Daryl.
distantvoices

Re:ph34r 4ss3mbly bUgZ

Post by distantvoices »

I would disable the timer interrupt by setting the corresponding bit in the PIC (you know which one I mean) so the interrupt is masked and no longer recognized. Having done this, you can safely re-enable interrupts. After the timer interrupt is done, just unmask the interrupt line so the next timer interrupt can be handled. This way, you can nest some hardware interrupts.
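
A sketch of the masking step, assuming the standard PC layout where the master 8259A mask register sits at port 0x21 and IRQ0 is bit 0. The port helpers below are stand-ins backed by an array so the example runs in user space; in a kernel they would be real in/out instructions.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_DATA 0x21    /* master 8259A interrupt mask register */
#define IRQ0_BIT  0x01u   /* timer line */

/* Fake port space replacing real port I/O for this illustration. */
static uint8_t fake_ports[0x100];

static void port_out8(uint16_t port, uint8_t value)
{
    fake_ports[port & 0xFF] = value;
}

static uint8_t port_in8(uint16_t port)
{
    return fake_ports[port & 0xFF];
}

/* Mask IRQ0: read-modify-write so the other lines keep their state. */
void irq0_mask(void)
{
    port_out8(PIC1_DATA, (uint8_t)(port_in8(PIC1_DATA) | IRQ0_BIT));
}

/* Unmask IRQ0 once the handler has finished its work. */
void irq0_unmask(void)
{
    port_out8(PIC1_DATA, (uint8_t)(port_in8(PIC1_DATA) & ~IRQ0_BIT));
}
```

The handler would call irq0_mask() before sti and irq0_unmask() just before returning, with interrupts disabled again at that point.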

stay safe.
Pype.Clicker

Re:ph34r 4ss3mbly bUgZ

Post by Pype.Clicker »

I picked the option of keeping all interrupts disabled while servicing IRQ0...

... and of keeping a slow IRQ0 when using BOCHS (a compile-time flag allows you to restore the full speed if needed).
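
A compile-time switch of that kind might look like the sketch below. The flag name RUNNING_ON_BOCHS and the 100Hz fallback are assumptions for illustration, not Clicker's actual settings; the 1193182Hz figure is the standard input clock of the PC's 8253/8254 timer.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_BASE_HZ 1193182u   /* input clock of the 8253/8254 PIT */

#ifdef RUNNING_ON_BOCHS        /* hypothetical flag, e.g. -DRUNNING_ON_BOCHS */
#define TIMER_HZ 100u          /* slow tick so the emulator can keep up */
#else
#define TIMER_HZ 1000u         /* full speed on real hardware */
#endif

/* Reload value to program into PIT channel 0 for the requested rate. */
uint16_t pit_divisor(uint32_t hz)
{
    /* Below 19Hz the divisor no longer fits in 16 bits. */
    assert(hz >= 19 && hz <= PIT_BASE_HZ);
    return (uint16_t)(PIT_BASE_HZ / hz);
}
```

The timer setup code would then program pit_divisor(TIMER_HZ) into channel 0, so switching between the emulator and real hardware is a one-flag rebuild.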
DarylD

Re:ph34r 4ss3mbly bUgZ

Post by DarylD »

beyond infinity:

I am already masking the interrupt, it makes no difference.

Pype:

Yes, I have just decided to slow the timer down while working with bochs. It seems the easiest solution for now.

Daryl.
beyondsociety

Re:ph34r 4ss3mbly bUgZ

Post by beyondsociety »

Yes, it is a wise decision to write as little code as possible in assembly and more in the language you want to use.

I thought I had a problem with my C kernel, and after several days of trying to figure it out, I realized there was actually something wrong with my assembly bootloader. It's fixed now.