[SOLVEDish] Behaviour changes based on presence of prints

Posted: Mon Sep 11, 2017 11:03 am
by isaacwoods
I'm building an OS in Rust, which has been a bit of an ordeal so far. I have managed to load a GRUB module and am now trying to identity-map it using information from the Multiboot structure. Note that I am already identity-mapping other things, such as the multiboot structure itself and the VGA buffer, so I'm fairly confident my mapping code does what it's meant to. However, when I try to map the range used by the GRUB modules, it fails and says that the P1 entries have already been mapped (which I don't think should be possible). Thinking this was a legitimate error on my part, I started adding `println`s to map out the memory layout used by the kernel.

This is where it started to get weird: two of the `println`s (which should just be printing data held in a structure, not doing anything fancy) stop the error! With them in place, the pages are mapped successfully and the code works (in this case, executing the code loaded in the module). I know this is very little to go on, so sorry about that; I guess I'm more interested in how you'd go about debugging this. It screams at me that something more serious is wrong with the memory layout, but I'm a bit out of my depth here (and would love to learn how to do this sort of debugging).

https://github.com/IsaacWoods/RustOS/bl ... #L265-L293 is the relevant piece of code, just to give a feel for it, although it uses pieces from all over the paging system. I realise this isn't a lot to go on, and I'm very anxious not to come across as a noob who hasn't done his own research or isn't experienced enough for OsDev. I have run this through a debugger (a custom-patched gdb), which didn't really help, and I have literally no idea why printing stuff out would change anything at all. Any help at all would be appreciated!

Re: Behaviour changes based on presence of prints

Posted: Mon Sep 11, 2017 11:18 am
by Korona
Since you're not working with IRQs or multiple CPUs (correct me if I'm wrong, I did not read the whole code) we can rule out race conditions, which are a common cause of such behavior.

The other prevailing reason for different results when making trivial changes to the code is undefined behavior. Make sure that your code does not exploit undefined behavior. For example, ensure that it does not violate aliasing rules. Also make sure that stores to memory are annotated with appropriate barriers to ensure that the compiler does not optimize away stores and to prevent the compiler from reordering memory accesses.
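To make the point about stores concrete, here is a minimal sketch in Rust of writing to a buffer with volatile stores followed by a compiler fence. It is not taken from the linked kernel: the helper names `fill_volatile` and `read_volatile_at` are made up for illustration, and an ordinary slice stands in for memory-mapped hardware such as the VGA buffer. Only `core::ptr::write_volatile`, `core::ptr::read_volatile` and `core::sync::atomic::compiler_fence` are real library functions.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical helper: stores `value` into every slot of `buf` with
/// volatile writes, so the compiler may not drop or merge the stores
/// even if it cannot see anything reading `buf` afterwards.
pub fn fill_volatile(buf: &mut [u16], value: u16) {
    for slot in buf.iter_mut() {
        // SAFETY: `slot` comes from a live `&mut [u16]`, so the pointer is
        // valid, aligned and exclusively borrowed for the write.
        unsafe { ptr::write_volatile(slot as *mut u16, value) };
    }
    // Stops the compiler from moving later memory accesses above the
    // stores. This is a compiler-only barrier; it emits no CPU fence.
    compiler_fence(Ordering::SeqCst);
}

/// Hypothetical helper: reads one slot back with a volatile load.
pub fn read_volatile_at(buf: &[u16], index: usize) -> u16 {
    // SAFETY: the index expression is bounds-checked and yields a valid,
    // aligned reference, which is then turned into a raw pointer.
    unsafe { ptr::read_volatile(&buf[index] as *const u16) }
}
```

In a kernel, the slice would typically be built from a raw pointer to the device memory rather than from a local array; the volatile accesses are what matter here.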

Re: Behaviour changes based on presence of prints

Posted: Tue Sep 12, 2017 11:21 am
by isaacwoods
I do have working interrupt stuff, including the beginnings of a keyboard driver, but interrupts are disabled at this point. After almost burning out, however, I think I've found the culprit!

I don't think it has anything to do with the prints themselves, but rather with the code they generate. The prints obviously increase the size of the kernel. Without them, GRUB places the multiboot structure after the kernel and before the modules. With the larger kernel, I guess it no longer fits there, so the multiboot structure is placed directly after the modules (in the same page as them, in fact). This leads to me trying to map a certain page twice (it is both the last page the modules occupy and the first page the multiboot structure lies in). That fails because the paging system doesn't differentiate between identity-mapping something and creating an ordinary mapping, so it was trying to protect against two things mapping the same frame.

I am not sure of an elegant way to ensure that I identity-map each frame only once, even when two structures lie at addresses within the same frame, but hopefully doing that should fix the problem. Thank you for your ideas, and sorry for the red herring in the description and for not noticing this sooner; it seems simple in hindsight.
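One possible approach, sketched here under assumptions rather than taken from the actual kernel: collect the byte ranges of everything that needs identity mapping, convert them to page ranges, merge the ranges that overlap or touch, and only then map each page of the merged result once. The names `page_span` and `merge_page_spans` are made up for this sketch, and 4 KiB pages are assumed.

```rust
/// Assumed page size (4 KiB pages).
const PAGE_SIZE: u64 = 4096;

/// Hypothetical helper: converts a byte range `[start, end)` into a range
/// of page numbers `[first_page, end_page)` that covers it.
pub fn page_span(start: u64, end: u64) -> (u64, u64) {
    (start / PAGE_SIZE, (end + PAGE_SIZE - 1) / PAGE_SIZE)
}

/// Hypothetical helper: takes byte ranges `[start, end)` and returns sorted,
/// non-overlapping page ranges, so that a page shared by two structures
/// appears exactly once in the result.
pub fn merge_page_spans(regions: &[(u64, u64)]) -> Vec<(u64, u64)> {
    let mut spans: Vec<(u64, u64)> = regions
        .iter()
        .filter(|&&(start, end)| end > start) // ignore empty regions
        .map(|&(start, end)| page_span(start, end))
        .collect();
    spans.sort_unstable();

    let mut merged: Vec<(u64, u64)> = Vec::new();
    for (first, end) in spans {
        if let Some(last) = merged.last_mut() {
            // Overlapping or directly adjacent: extend the previous span.
            if first <= last.1 {
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((first, end));
    }
    merged
}
```

With made-up addresses, a module at `0x200000..0x201800` and a multiboot structure at `0x201800..0x201C00` share page `0x201`; `merge_page_spans` would return the single span `(0x200, 0x202)`, and the caller could then loop over that span and identity-map each page once. An alternative would be to make the identity-mapping routine tolerate an existing entry that already points at the same frame.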