
Some general unrelated questions

Posted: Fri Dec 26, 2014 7:01 am
by kemosparc
Hi,

I have been working on OS development for the past year and a half, and I have learned a lot from OSDev. I have also been following James Molloy's tutorial very closely, but I tried out the concepts and migrated as much of the tutorial's code as I could to 64-bit.

I was able to write my own hobby OS for learning which included:
  • Booting my own boot loader
  • Switching to 64-bit Long Mode
  • Paging
  • Memory Heaps
  • IDT with PIC, RTC, and Keyboard
  • A driver for the Realtek network card on Qemu
  • A simple scheduler within the PIC
  • IPI and interrupt forwarding from BSP to APs
And more.

I was using this environment as a sandbox for trying things out and playing around, and I never intended to end up with a good-looking product. So what I have right now is not a clean hobby OS; the code is dirty, poorly organized, and far from optimal.

I now feel that I have reached a point where I need to start from scratch and write a cleaner version with the knowledge and experience I have gained so far. More than once, while adding something new, I have caught myself thinking, for example, that it would have been better if the memory mapper had been implemented differently.

Also, I have reached a point where the code is difficult to manage, and this encourages me to start again from a clean slate.

Nevertheless, I would like to make use of what I have right now by asking questions about the problems I am facing, as well as about situations that I expect to run into when I start the relatively clean second round.


I would appreciate it if anyone has answers to all or a subset of my questions below. I would also appreciate it if people who are willing to respond do so even if others have already responded.

My questions are:

Q1: Is it possible to enable the PIC on the BSP and the APIC on the APs? I have read that you need to disable the PIC to be able to enable the APIC, but I don't really see a conflict; the PIC is configured to interrupt the BSP only, so is there a way to enable the APIC on the APs?

Q2: In long mode, I understand that memory is a contiguous space that can be used without any care for segmentation. Is there any restriction on the size of my OS's code? Is there anything that needs to be done if my code exceeds 64 KB? As my kernel got bigger, weird, inconsistent symptoms started to appear. For example, some methods in my classes point to memory locations outside the code segment, so whenever I call such methods I get page faults.

Q3: Sometimes Qemu reboots unexpectedly. When I looked into the problem I discovered that it happens during the initialization of the PIC. It does not happen consistently; if the PIC initialization gets through, everything is stable and the kernel can run for hours. I have found that it happens between initializing the PIC and masking all the interrupts. The code below shows what I mean:

Ports::outportb(PIC1_DATA, 0xff);
Ports::outportb(PIC2_DATA, 0xff);

Ports::outportb(0x20, 0x11);
Ports::outportb(0xA0, 0x11);
Ports::outportb(0x21, 0x20);
Ports::outportb(0xA1, 0x28);
Ports::outportb(0x21, 0x04);
Ports::outportb(0xA1, 0x02);
Ports::outportb(0x21, 0x01);
Ports::outportb(0xA1, 0x01);

// If a relatively long time passes here, Qemu reboots. I exaggerated the situation by inserting an infinite loop here:
// for(;;); -> this causes the problem; without it the problem occurs every now and then.
for (unsigned char c = 0; c < 48; c++)
    irqSetMask(c);
Q4: The important point about Q3 is that I cannot debug this with Qemu, as it does not crash; it just keeps rebooting when the problem occurs. Also, it is very difficult to debug with Qemu+GDB on 64-bit. I have read multiple posts on this forum about Qemu+x86_64+GDB and spent some time trying to make it work, but I could not.

Q5: My kernel runs when I use --enable-kvm, but it crashes when I remove this flag from the Qemu command line.

Q6: How can I use Bochs to run my OS with more than 2048 MB. Is there a way?

Thanks a lot.
Karim

Re: Some important questions

Posted: Fri Dec 26, 2014 2:48 pm
by Combuster
Some important questions
That's not at all a descriptive topic title, and most certainly you're not more important than anybody else here.
Q1: Is it possible to enable the PIC on the BSP and the APIC on the APs?
You can't enable any other CPU's local APIC without enabling the BSP's local APIC. You'll also find that you can't decently use the IOAPIC without getting possible duplicate interrupts to your BSP as a consequence. Besides, the only reason the BSP responds to the PIC can be because the other CPUs have interrupts disabled.
Q2: In long mode
There is no segmentation, and always paging. Physical memory is non-contiguous. 64K limits sound like you have involved 16-bit code somewhere and created bugs as a result.
Q3: Sometimes Qemu reboots unexpectedly
I'm pretty sure the cause is in the FAQ. Use Bochs to get a nicer description.
Q4: The important point about Q3 is that I cannot debug this with Qemu, as it does not crash; it just keeps rebooting when the problem occurs
You can't, we can.
it does not crash; it just keeps rebooting
And a reboot is...?
Q5
Congratulations, you just made your own bugs reproducible.
Q6: How can I use Bochs to run my OS with more than 2048 MB. Is there a way?
RTFM [-X

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 3:44 pm
by kemosparc
Hi,

First of all thanks for your reply.

I would like to clarify that I did not mean to imply that I am more important than anyone else; what I meant is that the questions are important to me.

So, I am sorry if you got this impression.

Consequently, I have changed the title of my post, and you can do the same on your reply if you want.

Anyway, it's the last day of the week and people reach it tired, and might be offended by things that were not meant to offend; that was not my intention.

I will look into your replies in detail over the weekend.

Thanks again.
Karim.

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 4:10 pm
by Combuster
Some general unrelated questions
Still a bad topic title. You have a crashbug and you ask 101 obviously related questions that might help fix something while being very keen on hiding your actual problem.

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 4:12 pm
by kemosparc
:)

I will give you the honor to choose the title you like.

Just send me the title you think appropriate and I will put it as is.

Thanks
Karim.

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 6:35 pm
by KemyLand
kemosparc wrote: :)

I will give you the honor to choose the title you like.

Just send me the title you think appropriate and I will put it as is.

Thanks
Karim.
as I understood them
Relax a little bit! Don't be intimidated by moderators; we're a free forum :D! I'll leave aside Combuster's tendency to give so much importance to unrelated details and answer your original questions. You have every right to get your questions answered, and that wasn't the path this thread was following. It's simply ironic that he's saying "the title is unrelated". Anyway, to the real objective:

Q1: Pointer to Combuster's answer :P

Q2: This smells like segmented 16-bit code (in long mode?). We're probably misunderstanding this question. Please rework it so we can understand what's happening and what that 64 KiB boundary is.

Q3: May we see the source of Ports and irqSetMask? This looks very much like a race condition. I don't think Bochs will help more than QEMU+GDB. For instance, I used the latter combination.

Q4: I do understand what you mean by "doesn't crash but resets". QEMU is not perfect when it comes to debugging. It does what a physical PC does on a triple fault: it reboots. Bochs, on the other hand, dumps debugging information and aborts. For me, this is a really annoying aspect of QEMU. My workaround is to run GDB, start my kernel, stop, and add a breakpoint at GRUB's page fault handler. If that breakpoint is reached, I look through the stack frames to find the guilty function. Obviously, this strategy requires you to exercise the malfunctioning code before your IDT is loaded. In your case this won't work, so I can't recommend anything more on this question until I receive your answer :roll: . Also, you're saying you can't debug with QEMU+x86_64+GDB. Why?

Q5: KVM (Kernel-based Virtual Machine) is a Linux-specific feature. What the hell are you doing that depends on it? Are you telling QEMU to give you the specifications of the hardware it is running on?

Q6: Another useless pointer to Combuster's answer 8)

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 7:00 pm
by kemosparc
Hi,

Thanks a lot for the answer :) And really, I am not intimidated; I just did not want to offend anyone, as I really appreciate the help you guys are providing. Also consider that we are not all native English speakers, and we might say things that come across with a different meaning.

Anyways, back to the real objective.

First, you are right: I had an overlooked "sti" in my Ports::outportb(0x20, 0x11). I added it some time ago while testing something and forgot to remove it. Actually, I was able to find it after following Combuster's answer to Q4 and using "-d cpu_reset", which showed me the address at which the problem occurs; I then looked it up in the objdump output. (So thanks, Combuster, for that.)

Q2: Some of the methods in my code take me to weird addresses when invoked, but I think this is related to the previous point, and I am going to test it again now that I have fixed the faulty code.


Now, I am stuck on the problem I mentioned in Q5, and I really do need help with this. Here is the current status.

It all comes down to a static function that I call to get the CPU ID from the local APIC at 0xFEE00000. Here is the method:

Code: Select all

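// Read a 32-bit local APIC register; reg is the register's full address.
// volatile keeps the compiler from caching or eliding the MMIO access.
uint32_t Utils::readLocalAPIC(uint64_t reg)
{
    return *reinterpret_cast<volatile uint32_t*>(reg);
}
and I call it like this: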

Code: Select all

        uint32_t cpuid = Utils::readLocalAPIC(0xFEE00020);
When I use the --enable-kvm switch this call returns the correct core ID, but if I omit the switch it always returns zero for every core. This has a big impact on the code: I have an array representing the cores, and every time I start up one of the cores and look up its ID to set some core parameter in the correct array entry, it overwrites values in the entry corresponding to the BSP, which creates problems. So the final symptom is to be expected now that I have discovered this behaviour.

I don't know why this works with --enable-kvm and not otherwise.

Could it be that without KVM the APIC is at a different address? Does QEMU put it in another location?

Thanks,
Karim.
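
For readers following along, here is a minimal sketch of reading the local APIC ID register. It assumes xAPIC mode, where the ID sits in bits 31:24 of the register at offset 0x20, and that the APIC page is already mapped at the address passed in. The names readApicRegister, apicIdFromRegister, and exampleApicId are hypothetical and not taken from the poster's kernel, and the sketch does not claim to explain the KVM difference described above.

```cpp
#include <cassert>
#include <cstdint>

// Offset of the Local APIC ID register from the APIC base (xAPIC mode).
constexpr uint32_t kApicIdRegister = 0x20;

// Read a 32-bit local APIC register. `base` is assumed to be a virtual
// address that is already mapped onto the APIC's physical base.
inline uint32_t readApicRegister(const volatile uint32_t* base, uint32_t offset)
{
    return base[offset / sizeof(uint32_t)];
}

// In xAPIC mode the ID occupies bits 31:24 of the ID register.
constexpr uint32_t apicIdFromRegister(uint32_t raw)
{
    return raw >> 24;
}

// Hypothetical usage against a fake register file instead of real MMIO.
inline uint32_t exampleApicId()
{
    static volatile uint32_t fakeApic[0x400 / sizeof(uint32_t)] = {};
    fakeApic[kApicIdRegister / sizeof(uint32_t)] = 3u << 24; // pretend to be CPU 3
    return apicIdFromRegister(readApicRegister(fakeApic, kApicIdRegister));
}
```

In a real kernel the base pointer would be the mapped address of 0xFEE00000 (or whatever the IA32_APIC_BASE MSR reports), and the shifted value is what would index a per-core array.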

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 7:23 pm
by KemyLand
When I use the --enable-kvm switch this call returns the correct core ID, but if I omit the switch it always returns zero for every core. This has a big impact on the code: I have an array representing the cores, and every time I start up one of the cores and look up its ID to set some core parameter in the correct array entry, it overwrites values in the entry corresponding to the BSP, which creates problems. So the final symptom is to be expected now that I have discovered this behaviour.
Mmm... Maybe this is caused by QEMU's default behaviour. As far as I know (someone correct me if I'm wrong), QEMU defaults to a single-CPU, single-core virtual machine. With --enable-kvm, QEMU provides the virtual machine with your native machine's capabilities. This includes multiple CPUs/cores. The APIC is only useful on multi-CPU/multi-core systems, so QEMU may have it disabled by default. Check bit 9 of EDX after calling CPUID with EAX=1 to check for APIC availability. This is probably your problem.
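
The CPUID check suggested above could look roughly like this. It is only a sketch: hasLocalApic and cpuidLeaf1Edx are hypothetical names, and the inline assembly assumes a GCC- or Clang-style compiler targeting x86.

```cpp
#include <cassert>
#include <cstdint>

// CPUID leaf 1 reports the on-chip local APIC in EDX bit 9.
constexpr uint32_t kCpuidEdxApicBit = 1u << 9;

// Decode the APIC feature flag from a leaf-1 EDX value.
constexpr bool hasLocalApic(uint32_t leaf1Edx)
{
    return (leaf1Edx & kCpuidEdxApicBit) != 0;
}

#if defined(__x86_64__) || defined(__i386__)
// Execute CPUID with EAX=1 and return EDX (GCC/Clang inline assembly).
inline uint32_t cpuidLeaf1Edx()
{
    uint32_t eax = 1, ebx = 0, ecx = 0, edx = 0;
    asm volatile("cpuid" : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx));
    return edx;
}
#endif
```

On an x86 target, hasLocalApic(cpuidLeaf1Edx()) would give the answer for the CPU the code is running on; keeping the decoding separate from the instruction makes it easy to check on any host.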

Re: Some general unrelated questions

Posted: Fri Dec 26, 2014 9:32 pm
by sortie
See also http://wiki.osdev.org/James_Molloy%27s_ ... Known_Bugs as you confessed to having used the tutorial.

Re: Some general unrelated questions

Posted: Sat Dec 27, 2014 10:15 pm
by eryjus
For Q3, do you have a proper handler installed for IRQ7? To me, this sounds suspiciously like a spurious interrupt between setting up your timer and enabling IRQ0.
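
As a rough illustration of the Q3 discussion, the sketch below remaps both 8259A PICs and writes the mask again after initialization, since ICW1 clears the interrupt mask register and a mask written beforehand does not survive. It assumes the caller has interrupts disabled for the whole sequence. The out callback stands in for the poster's Ports::outportb, and remapAndMaskPic, isSpuriousIrq7, and recordPicInit are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct PortWrite { uint16_t port; uint8_t value; };

// Remap both 8259A PICs and leave every IRQ line masked.
// `out(port, value)` is a stand-in for a real port-output routine.
// Assumption: the caller runs this with interrupts disabled.
template <typename Out>
void remapAndMaskPic(Out&& out, uint8_t masterBase, uint8_t slaveBase)
{
    out(0x20, 0x11);        // ICW1: start initialization, ICW4 will follow
    out(0xA0, 0x11);
    out(0x21, masterBase);  // ICW2: vector offset of the master
    out(0xA1, slaveBase);   // ICW2: vector offset of the slave
    out(0x21, 0x04);        // ICW3: slave attached to the master's IRQ2
    out(0xA1, 0x02);        // ICW3: slave cascade identity
    out(0x21, 0x01);        // ICW4: 8086 mode
    out(0xA1, 0x01);
    out(0x21, 0xFF);        // OCW1: mask every line (ICW1 cleared the mask)
    out(0xA1, 0xFF);
}

// A spurious IRQ7 leaves bit 7 of the master's in-service register clear.
constexpr bool isSpuriousIrq7(uint8_t masterIsr)
{
    return (masterIsr & 0x80) == 0;
}

// Hypothetical usage: record the writes instead of touching hardware.
inline std::vector<PortWrite> recordPicInit()
{
    std::vector<PortWrite> log;
    remapAndMaskPic(
        [&log](uint16_t port, uint8_t value) { log.push_back({port, value}); },
        0x20, 0x28);
    return log;
}
```

An IRQ7 handler would read the master's in-service register (write 0x0B to port 0x20, then read port 0x20) and, if isSpuriousIrq7 returns true, return without sending an EOI.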