No Idea How To Debug This
Hello, first post here!
Full disclosure, this is for a class. I am having a bit of a problem with my OS. Everything works within the kernel, but when I try to run a program outside the kernel (the shell), I see a plethora of undefined behavior.
I defined a custom service routine for interrupt 21. It works fine in the kernel, but it seems to cause a processor panic when called from the shell. Loops seem to cause undefined behavior in the shell as well. I asked another programmer for help, and although he couldn't figure it out either, he got a processor panic when he tried to run a loop in the shell (I did not). Other interrupts seem to work fine for the most part.
I think this has something to do with the IVT, and more specifically interrupt 21, but I am unsure, because similar problems can still arise when the ISR for interrupt 21 is neither initialized nor called. Even if it does have something to do with the IVT, I do not know how I would go about debugging it (I tried looking through the memory view in the emulator, but I am unsure if I was looking in the right place), and if it's not the IVT, I have no idea what it is.
I have been stuck on this issue for quite a while and need to move on in the assignment (the professor is unavailable). If anyone can help me figure out how to debug this, or figure out what the problem is, it would be very helpful.
Here is the OS: https://nofile.io/f/0ewTS042E9Y/OS.zip
This is the emulator it's meant to target: https://github.com/mdblack/simulator
The compiler is bcc (Bruce's C Compiler). All the included build scripts should work on Linux, and there are debug.bat and build.bat for Windows Subsystem for Linux.
Thank you
- cgbsu
P.S. For some reason it needs memcpy, even though I don't call it and no C standard library is linked. This seems to be some sort of optimization, so I implemented it. If anyone knows how to get bcc to stop doing this, please let me know. I've wondered if it has something to do with this.
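On the question of where to look in the memory view: in real mode the interrupt vector table starts at linear address 0, and each entry is four bytes, a 16-bit offset followed by a 16-bit segment, both little-endian. The following is a minimal host-side C sketch (compiled with an ordinary compiler, not bcc) of that arithmetic. It assumes "interrupt 21" means vector 0x21 in hexadecimal; the helper names and the sample bytes are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode far pointer as stored in an IVT entry. */
struct far_pointer {
    uint16_t segment;
    uint16_t offset;
};

/* Linear address of the IVT entry for a vector (4 bytes per entry). */
static uint32_t ivt_entry_address(uint8_t vector)
{
    return (uint32_t)vector * 4u;
}

/* Decode the four bytes of an entry as they appear in a memory view:
   offset first, then segment, each stored low byte first. */
static struct far_pointer decode_ivt_entry(const uint8_t bytes[4])
{
    struct far_pointer p;
    p.offset  = (uint16_t)(bytes[0] | (bytes[1] << 8));
    p.segment = (uint16_t)(bytes[2] | (bytes[3] << 8));
    return p;
}

/* Linear address the handler lives at: segment * 16 + offset. */
static uint32_t handler_address(struct far_pointer p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Worked example with made-up bytes: offset 0x0150, segment 0x1000. */
static void check_ivt_examples(void)
{
    const uint8_t sample[4] = { 0x50, 0x01, 0x00, 0x10 };
    struct far_pointer p = decode_ivt_entry(sample);

    assert(ivt_entry_address(0x21) == 0x84);
    assert(p.offset == 0x0150 && p.segment == 0x1000);
    assert(handler_address(p) == 0x10150UL);
}
```

Under that assumption, the four bytes to inspect are at 0x84 through 0x87. If they do not decode to an address inside the kernel's code, the handler is probably not installed where the kernel expects it.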
Re: No Idea How To Debug This
When the provided disk image loads your "shell" program, it places the stack inside the EBDA. (See here for details.) The BIOS relies on the EBDA having specific contents, and it will misbehave if they're overwritten. Additionally, when the BIOS writes to the EBDA, it may be overwriting your program's stack.
I also saw some self-modifying code that doesn't clear the prefetch queue. It may fail on some CPUs.
I'm not sure if there are any other problems; I can't debug any further with an unreliable stack.
Re: No Idea How To Debug This
Octocontrabass wrote:When the provided disk image loads your "shell" program, it places the stack inside the EBDA. (See here for details.) The BIOS relies on the EBDA having specific contents, and it will misbehave if they're overwritten. Additionally, when the BIOS writes to the EBDA, it may be overwriting your program's stack.
I also saw some self-modifying code that doesn't clear the prefetch queue. It may fail on some CPUs.
I'm not sure if there are any other problems; I can't debug any further with an unreliable stack.
I am unsure if I am overwriting anything. However, we don't enter protected mode, and my professor said the following:
"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000). 0x0000 should not be used because it is reserved for interrupt vectors. 0x1000 also should not be used because your kernel lives there and you do not want to overwrite it. Segments above 0xA000 are unavailable because the original IBM-PC was limited to 640k of memory."
I assume nothing aside from the regions he mentioned contains anything that could easily be corrupted. I have experimented with changing the segment, but not to much avail.
Re: No Idea How To Debug This
cgbsu wrote:I am unsure if I am overwriting anything, however we don't enter protected mode and he said the following:
I am very certain that you're overwriting the EBDA. Since you're not switching to protected mode, the BIOS interrupt handlers can still access the EBDA. (And on real hardware, the BIOS will use SMM to access the EBDA regardless of CPU mode.)
cgbsu wrote:"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000).
That's not a requirement of the hardware, but it makes it easier to keep track of which portions of memory you're using and avoids trouble with ISA DMA.
cgbsu wrote:0x0000 should not be used because it is reserved for interrupt vectors.
It also contains the BDA, another structure that you must not overwrite.
cgbsu wrote:I assume nothing aside from the regions he mentioned has anything that could easily be corrupted.
Your assumption is incorrect. Your simulator is using the Bochs BIOS, which places the EBDA at address 0x9FC00 by default. (The location may change depending on how it's configured.)
cgbsu wrote:I have experimented with changing the segment but not to much avail.
That means there are additional problems, so I've decided to take another look. Your shell program returns from main! How can it return with no return address on the stack?
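To make the overlap concrete, here is a small host-side C sketch of the real-mode address arithmetic (it is not part of the OS and is meant for an ordinary compiler, not bcc). The EBDA start of 0x9FC00 is the Bochs BIOS default mentioned above. The EBDA end of 0x9FFFF, the load segment 0x9000, and the initial stack pointer 0xFFF0 are assumptions for illustration, so substitute whatever the loader actually sets.

```c
#include <assert.h>
#include <stdint.h>

/* Bochs BIOS default EBDA start, as stated in the post above. */
#define EBDA_START 0x9FC00UL
/* Assumption: the EBDA extends to the end of conventional memory. */
#define EBDA_END   0x9FFFFUL

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}

/* Returns 1 if a stack whose top is at ss:sp starts inside the EBDA. */
static int stack_top_in_ebda(uint16_t ss, uint16_t sp)
{
    uint32_t top = linear_address(ss, sp);
    return top >= EBDA_START && top <= EBDA_END;
}

/* Worked examples; 0xFFF0 is an assumed initial stack pointer. */
static void check_stack_examples(void)
{
    /* Segment 0x1000 means base address 0x10000. */
    assert(linear_address(0x1000, 0x0000) == 0x10000UL);
    /* Segment 0x9000 puts the stack top at 0x9FFF0, inside the EBDA. */
    assert(linear_address(0x9000, 0xFFF0) == 0x9FFF0UL);
    assert(stack_top_in_ebda(0x9000, 0xFFF0) == 1);
    /* A lower segment keeps the same stack pointer clear of it. */
    assert(stack_top_in_ebda(0x2000, 0xFFF0) == 0);
}
```

A stack grows downward, so a stack that starts inside the EBDA overwrites BIOS data with its very first pushes.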
Re: No Idea How To Debug This
I don't mean to sound like I'm trying to oppose what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused as to why everyone else in the class used this region of memory (which the professor told us to use, so I'm assuming his version uses this region as well) and had no problems, but I do. I'm thinking the simplest solution may just be to find a way to go to protected mode, though I'm not sure if that is what you're proposing.
If it's not, I just ran a test:
If I'm not misunderstanding, the EBDA is an area of memory that contains data structures in certain parts of it. According to what you said, memory should be free from the end of the kernel to 0x9FC00. The wiki page you linked said that there is a guaranteed space of free memory at 0x7E00; I tried loading the program there and still found issues.
Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no returns in the shell program.
Re: No Idea How To Debug This
cgbsu wrote:I don't mean to sound like I'm trying to oppose what you're saying, I'm just trying to figure out how to solve this problem. I'm somewhat confused as to why everyone else in the class used this region of memory (that he told us to use, so I'm assuming his version uses this region as well) as well but had no problems but I am.
I might not have made myself clear. Placing the stack in the EBDA is just one of the problems I found, but I don't know if it's the reason why your program doesn't work.
cgbsu wrote:I'm thinking the possibly simplest solution may just be to try to find a way to go to protected mode, which I'm not sure if that is what you're proposing.
I'm not. Switching to protected mode is not simple either; I think you can find an easier solution.
cgbsu wrote:Also I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?) I put no returns in the shell program.
In C, the return statement is optional for functions that return void. A return statement is implied at the end of the function.
Re: No Idea How To Debug This
Thank you for the reply.
Octocontrabass wrote:I might not have made myself clear. Placing the stack in the EBDA is just one of the problems I found, but I don't know if it's the reason why your program doesn't work.
Octocontrabass wrote:I'm not. Switching to protected mode is not simple either; I think you can find an easier solution.
I understand now. I think it most likely is not the reason, at least not entirely, given that it is as difficult to enter protected mode as you said, and the EBDA is 0x0 to 0x000FFFFF according to the wiki page and other sources, and 0xFFFF is the max sector addressable by a 16-bit value passed to int 13. If you don't go into protected mode, there isn't a way to avoid writing within the EBDA (if you're going to write something).
Octocontrabass wrote:In C, the return statement is optional for functions that return void. A return statement is implied at the end of the function.
As for the return, bcc is being used, so main has no return type:
Code:
main() {
/*Code goes here.*/
}
I thought this may be a semantic difference put there on purpose to imply that main is not returning, but it may be returning. If so, I guess I would need somewhere to put that data?
Re: No Idea How To Debug This
cgbsu wrote:I understand now, I think it most likely is not the reason, at least not entirely given that it is as difficult to enter protected mode as you said and the EBDA is 0x0 to 0x000FFFFF according to the wiki page and other sources, and 0xFFFF is the max sector addressable by a 16 bit value inputted into int 13. If not going into protected mode, there isn't a way to not write within the EBDA (if you're going to write something).
I'm not sure what you're talking about. The EBDA is 0x9FC00 to 0x9FFFF in your simulator, with similar addresses on other computers. Most of the rest of memory, from 0x600 to 0x9FBFF, is free for your OS and programs. Sector addresses are irrelevant here since these are memory addresses.
cgbsu wrote:As for the return, bcc is being used so main has no return type.
Code:
main() { /*Code goes here.*/ }
I thought this may be a semantic difference put there on purpose to imply that main is not returning, but it may be, if so I guess I would need somewhere to put that data?
In a more complete OS, the C library would provide a wrapper function that calls main, so main has something to return to. The wrapper function doesn't return. Instead, it uses a system call to tell the kernel to end the program after main returns.
Since you don't have a wrapper function like that, you can't let main return.
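For illustration, the wrapper idea can be sketched on a host machine in standard C. This is only a simulation under stated assumptions: sys_terminate, program_entry, run_program, and demo_main are hypothetical names, and longjmp stands in for the kernel regaining control after a "terminate program" system call. A real wrapper for this OS would issue whatever interrupt the kernel defines for ending a program.

```c
#include <assert.h>
#include <setjmp.h>

/* Host-side stand-ins for kernel state (hypothetical names). */
static jmp_buf kernel_context;
static int last_exit_status;

/* Stand-in for a "terminate program" system call. It never returns
   to its caller; here that is simulated with longjmp. */
static void sys_terminate(int status)
{
    last_exit_status = status;
    longjmp(kernel_context, 1);
}

/* The wrapper the program is entered through. When main returns, it
   returns here, and the wrapper hands control back to the kernel. */
static void program_entry(int (*program_main)(void))
{
    sys_terminate(program_main());
    for (;;) { }  /* never reached; the wrapper itself must not return */
}

/* Kernel side: start a program and regain control when it terminates. */
static int run_program(int (*program_main)(void))
{
    if (setjmp(kernel_context) == 0) {
        program_entry(program_main);
    }
    return last_exit_status;
}

/* A stand-in program whose main simply returns a status. */
static int demo_main(void)
{
    return 7;
}

static void check_wrapper_example(void)
{
    assert(run_program(demo_main) == 7);
}
```

Without such a wrapper, the simplest safe ending for main is an infinite loop, so that execution never reaches a return with no valid return address on the stack.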
Re: No Idea How To Debug This
Octocontrabass wrote:I'm not sure what you're talking about. The EBDA is 0x9FC00 to 0x9FFFF in your simulator, with similar addresses on other computers. Most of the rest of memory, from 0x600 to 0x9FBFF, is free for your OS and programs. Sector addresses are irrelevant here since these are memory addresses.
I was viewing it incorrectly. I thought 0x0 to 0xFFFFF was the EBDA and 0x04 - 0x0497 was the BDA, with 0x0 to 0xA0000 being the part with the most stuff crammed into it (basically, I was thinking of the EBDA as the larger category encompassing these things). My bad.
Octocontrabass wrote:In a more complete OS, the C library would provide a wrapper function that calls main, so main has something to return to. The wrapper function doesn't return. Instead, it uses a system call to tell the kernel to end the program after main returns.
Since you don't have a wrapper function like that, you can't let main return.
There are while( 1 ); loops at the end of both the kernel's and the shell's main, sometimes commented out and sometimes not. The gunk in both main procedures is test code, and I have commented it out and uncommented it a bunch. Both while( 1 ) loops (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented at the same time, usually just giving different undefined behavior.
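As a rough way to sanity-check a load segment against the layout described in this thread, here is a host-side C sketch. The free range 0x600 to 0x9FBFF comes from the post quoted above and is specific to this simulator. Treating the kernel as owning the whole 64 KiB window at segment 0x1000, and treating each program as needing a full 64 KiB window, are assumptions; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FREE_START   0x00600UL  /* first free byte, per the thread */
#define FREE_END     0x9FBFFUL  /* last byte before the EBDA, per the thread */
#define KERNEL_START 0x10000UL  /* kernel lives at segment 0x1000 */
#define KERNEL_END   0x1FFFFUL  /* assumption: kernel owns its whole window */
#define WINDOW_SIZE  0x10000UL  /* assumption: a program may use a full 64 KiB */

/* Returns 1 if the inclusive ranges [a_lo, a_hi] and [b_lo, b_hi] overlap. */
static int ranges_overlap(uint32_t a_lo, uint32_t a_hi,
                          uint32_t b_lo, uint32_t b_hi)
{
    return a_lo <= b_hi && b_lo <= a_hi;
}

/* Returns 1 if the 64 KiB window starting at segment:0 lies entirely in
   free memory and does not touch the kernel's window. */
static int segment_window_is_free(uint16_t segment)
{
    uint32_t lo = (uint32_t)segment << 4;
    uint32_t hi = lo + WINDOW_SIZE - 1;

    if (lo < FREE_START || hi > FREE_END) {
        return 0;
    }
    if (ranges_overlap(lo, hi, KERNEL_START, KERNEL_END)) {
        return 0;
    }
    return 1;
}

static void check_segment_examples(void)
{
    assert(segment_window_is_free(0x0000) == 0); /* IVT and BDA */
    assert(segment_window_is_free(0x1000) == 0); /* kernel */
    assert(segment_window_is_free(0x2000) == 1);
    assert(segment_window_is_free(0x9000) == 0); /* window ends in the EBDA */
}
```

Segment 0x9000 fails this check only because the last 1 KiB of its window (0x9FC00 to 0x9FFFF) is the EBDA, which is also where a stack placed near the top of the segment would land.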
Re: No Idea How To Debug This
cgbsu wrote:There are commented out/commented while( 1 ); 's at the end of both the kernel and shell's mains, the gunk that's in both main procedures is test code and I have commented it out and uncommented it a bunch. Both while( 1 )'s (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented simultaneously usually just giving different undefined behavior.
Those infinite loops prevent the main function from returning with nothing to return to. You should leave them uncommented.
Aside from that and the EBDA thing, I didn't see any other problems. I'd like to see a disk image rebuilt to fix those two issues, but I'm away from my development system for the rest of the week so I wouldn't be able to debug it until then.
Re: No Idea How To Debug This
Octocontrabass wrote:Aside from that and the EBDA thing, I didn't see any other problems. I'd like to see a disk image rebuilt to fix those two issues, but I'm away from my development system for the rest of the week so I wouldn't be able to debug it until then.
Did you get a chance to revisit it yet?
Re: No Idea How To Debug This
I did, but I don't see any other issues. I'd have to see your current code to be able to figure out why it still doesn't work.
Re: No Idea How To Debug This
Octocontrabass wrote:I did, but I don't see any other issues. I'd have to see your current code to be able to figure out why it still doesn't work.
It hasn't changed. Sometimes I comment out makeInterrupt21 in KernelInitizlize(), and I mess around with the main() functions, commenting stuff in and out.
Re: No Idea How To Debug This
If you'd like me to debug further, please provide a build that incorporates fixes to the two issues I mentioned earlier:
- The stack overlapping the EBDA due to loading the shell at 0x90000
- The shell program returning from main() instead of halting with an infinite loop
Re: No Idea How To Debug This
Octocontrabass wrote:If you'd like me to debug further, please provide a build that incorporates fixes to the two issues I mentioned earlier:
- The stack overlapping the EBDA due to loading the shell at 0x90000
- The shell program returning from main() instead of halting with an infinite loop
Done:
https://nofile.io/f/50HTF9goGwV/OS1.zip
Also, I tried to redo the project according to the professor's simpler guidelines, and I am getting similar problems:
https://nofile.io/f/sDawmZQ0QXC/OS2.zip