making the transition to real hardware

Posted: Tue Jul 21, 2009 9:04 pm
by Matthew
I've been testing and debugging my SMP OS on bochs, qemu, and vmware workstation with great success. Today I burned it to a CD and tried to boot on a real machine, a Core2 Quad. I had some false starts (like, who knew setting the IMCR could be so harmful?) but I managed to boot and run a test successfully -- once. Since then I've had page faults mixed with deadlocks, and it's hard to get consistent results. I'm going to try CD-RWs tomorrow so that at least I'm not wasting so many CDs. I implemented serial port logging for output from the emulators but I've come to realize that none of my machines have serial ports. At this point I've got printf() and a whole lot of guessing to go on. Does anyone have any tips for dealing with this kind of situation? Is there some way of making any of the emulators "more realistic" so that I can get a feel for the problem in a debuggable environment?

Re: making the transition to real hardware

Posted: Tue Jul 21, 2009 10:09 pm
by Brendan
Hi,
Matthew wrote: I implemented serial port logging for output from the emulators but I've come to realize that none of my machines have serial ports.
Unless you're using nothing but laptops, you might want to check the motherboard - desktop/server motherboards that don't have serial ports are *extremely* rare. It's possible that the motherboard has serial ports, but whoever built the computer didn't install the cable/adapter from the motherboard to serial port socket on the back of the case. Maybe you could find a suitable cable/adapter and simply plug it into the motherboard. For example:

[Image: example cable/adapter from a motherboard header to a serial port socket]

You might need to do a little research for this though, as different motherboards use different connections, and an adapter (like the picture above) for one motherboard might not work for another motherboard.

Another alternative would be to buy an I/O card. You can get them for almost all types of bus (ISA, PCI, PCI express, etc), and something with a parallel port and 2 serial ports usually only costs about $25 (Aust).

Of course another idea would be to simply buy another computer - you can never have too many test computers...
Matthew wrote: At this point I've got printf() and a whole lot of guessing to go on. Does anyone have any tips for dealing with this kind of situation? Is there some way of making any of the emulators "more realistic" so that I can get a feel for the problem in a debuggable environment?
Most emulators do some instructions for one emulated CPU, then switch to another emulated CPU and do some more instructions, then the next CPU, etc. For Bochs you can control how many emulated instructions are done on each CPU before the emulator switches to another CPU. For example (in "bochsrc.txt"):

    cpu: count=2, quantum=1, ips=400000000, reset_on_triple_fault=0

This tells Bochs to do one instruction on each CPU before switching to the next, which is bad for performance but makes it much more likely that race conditions and re-entrancy bugs will be detected. For Qemu I think you're screwed (I think Qemu spends 1 ms emulating one CPU, then another 1 ms emulating the next CPU, etc, and the chance of detecting any race conditions and re-entrancy bugs is a lot less because of this). For real computers CPUs execute instructions at the same time, so these types of bugs are much more likely to be detected; and some bugs (e.g. forgetting to use a "lock" prefix where it's necessary) will only affect a real computer.
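
As an illustration of the "lock" prefix point, here is a minimal sketch using C11 atomics. The counter name and both functions are hypothetical and not taken from anyone's kernel. On x86, compilers typically turn atomic_fetch_add() into a lock-prefixed instruction, whereas a plain "counter++" on an ordinary variable is an unlocked read-modify-write that two CPUs can interleave.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter, e.g. a count of runnable tasks. */
static atomic_uint_fast32_t ready_count;

/* Atomically increment the counter and return the value it had before.
 * With a non-atomic variable and "ready_count++", two CPUs could both
 * read the same old value and one of the increments would be lost. */
uint_fast32_t ready_count_inc(void)
{
    return atomic_fetch_add(&ready_count, 1);
}

/* Read the current value of the counter. */
uint_fast32_t ready_count_get(void)
{
    return atomic_load(&ready_count);
}
```

A bug of this kind can stay hidden under an emulator that runs each CPU for a long time slice, because the window between the read and the write is only a few instructions wide.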

For intermittent page faults, make sure that you're invalidating TLBs correctly. I've seen OSs that never invalidate TLBs at all that run fine on Pentium and older CPUs (and emulators), but fail miserably (in random ways) on any modern real computer.

Finally, sometimes the best debugging tool is a pen and paper (and that squishy thing between your ears ;) ). Some bugs are almost impossible to debug using any other technique, because adding something like a "printf()" changes the timing and makes the bug disappear.


Cheers,

Brendan

Re: making the transition to real hardware

Posted: Tue Jul 21, 2009 10:29 pm
by Matthew
Come to think of it, I noticed a blank metal plate on the back of some of the machines where a serial port would normally be found. So it may be there, but the case is covering it for some reason. I will have to see what I can do about that.

I did have a lot of success with Bochs in uncovering race condition bugs just like you described, and the same experience with QEmu. I will try experimenting with the quantum parameter and see if that helps any further.

I'm glad you mentioned TLBs though, because that was my first thought about the page faults. It just so happens that many of them seem to occur in a specific spot: my ATAPI driver maps a buffer page for the purpose of reading a sector into it -- and after sending the ATAPI command, I put the process to sleep to wait for an IRQ. After the process is woken up, it reads a series of words from the I/O port (PIO mode) but there is a page fault when it tries to access the buffer page.

Now I currently invalidate the page (invlpg) on the local CPU when mapping and unmapping virtual memory. And I don't have "threads", so every process has its own address space (but they all share the kernel page table mapping in high memory). And there is a global kernel lock. So I thought I could get by without implementing TLB-shootdown at this point. Even if the process goes to sleep on one CPU and is taken up by another CPU, it should flush the TLB upon executing the task-switch.

So while the page fault did seem like it could be the symptom of an invalid TLB (especially with the inconsistency) I could not imagine how, with the limitations of the current system, that could be caused.

Re: making the transition to real hardware

Posted: Wed Jul 22, 2009 6:53 am
by bewing
I am currently completely rewriting bochs. One of my intents is to make it much more realistic for a multi-CPU environment. I already know that there are a number of things that are not realistic about the current bochs code. But I am very interested in any situation where bochs/qemu work "right", and real hardware fails to match. Code snippets and/or quick descriptions of such bugs would be vastly appreciated. For example, I would really like to know what effect the IMCR had that you weren't expecting.

Re: making the transition to real hardware

Posted: Wed Jul 22, 2009 3:00 pm
by Matthew
I'm not so sure about that IMCR thing anymore. But I did discover that GRUB was reporting a memory map that included a 200MB region above the 4GB mark -- even though the machine only has 4GB RAM. This was screwing up my physical memory manager, so I included an explicit check against memory above 4GB and that problem went away.
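
For anyone hitting the same thing, here is a rough sketch of that kind of check. The struct is a simplified, hypothetical view of a memory map entry (not the actual Multiboot layout), and the function name is made up. Note that a region just above 4GB may well be real RAM that the chipset has remapped around the PCI hole, so it is probably the 32-bit physical memory manager that needs to clamp it rather than GRUB that is wrong.

```c
#include <stdint.h>

#define FOUR_GIB 0x100000000ULL

/* Simplified, hypothetical view of one memory map entry. */
struct mem_region {
    uint64_t base;    /* physical start address */
    uint64_t length;  /* size in bytes */
};

/* Return how many bytes of the region lie below the 4 GiB mark.
 * Regions entirely above 4 GiB yield 0; regions that straddle the
 * boundary are truncated. The subtraction cannot underflow because
 * base < FOUR_GIB is checked first. */
uint64_t clamp_below_4gib(struct mem_region r)
{
    if (r.base >= FOUR_GIB)
        return 0;
    if (r.length > FOUR_GIB - r.base)
        return FOUR_GIB - r.base;
    return r.length;
}
```

The physical memory manager would then only add the returned number of bytes, starting at the region's base, to its free list.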

I'm currently working on hooking up a serial port connection between some machines.

Re: making the transition to real hardware

Posted: Wed Jul 22, 2009 4:26 pm
by Matthew
I'm getting much further now that my physical memory manager is not so screwed up. I still got a mysterious page fault while I had a lot of logging going through the serial port but I toned it down a bit and have been unable to reproduce the issue. Oh well.

One problem I did have is with the interpretation of the Intel Multiprocessing Specification tables. Even while examining the output of the parser by hand, I am confused about the correct interpretation. I was able to toggle my ACPI driver back on, and that has a much easier view of the configuration:

ACPI Interrupt Override: Bus: 0 SourceIRQ: 0 GlobalIRQ: 2
ACPI Interrupt Override: Bus: 0 SourceIRQ: 9 GlobalIRQ: 9

ACPI says that the timer IRQ0 is remapped to Global System Interrupt 2, which is fairly common, and the ACPI control interrupt is mapped to GSI 9. This is great, just like QEMU. But the Intel MP tables tell a more confusing story:

IO APIC-id: 8 version: 20 address: FEC00000
IO interrupt type: 0x03 flags: 0x05 source: (bus: 6 irq: 0) dest: (APIC: 8 int: 0)
IO interrupt type: 0x00 flags: 0x00 source: (bus: 6 irq: 1) dest: (APIC: 8 int: 1)
IO interrupt type: 0x00 flags: 0x00 source: (bus: 6 irq: 0) dest: (APIC: 8 int: 2)
.
.
.
Local interrupt type: 3 flags: 5 source: (bus: 6 irq: 0) dest: (APIC: FF int: 0)
Local interrupt type: 1 flags: 5 source: (bus: 6 irq: 0) dest: (APIC: FF int: 1)

So the first entry is a vectored interrupt from the PIC (active high, "Reserved" flag set) and it is mapped to INTIN0 on the first and only IOAPIC. But does this entry mean anything if I have the 8259A PIC masked? The third entry looks more typical of what I find from QEMU, where the ISA bus (id: 6) has IRQ0 mapped to INTIN2 also.

I'm really not sure what to make of the local interrupt table. Is it telling me that the 8259A PIC is also wired to the LAPICs and asserting INTIN0 to them directly?

(edit: notably, things seem to go much more smoothly when I just ignore that first IO interrupt entry and proceed on the assumption that it is irrelevant because the PIC is masked)
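
For reference, the ACPI overrides listed earlier in this post can be applied with a lookup along these lines. This is only a sketch: the struct and function names are hypothetical, and it assumes the usual ACPI convention that an ISA IRQ with no override is identity-mapped to the Global System Interrupt of the same number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of one ACPI interrupt source override. */
struct irq_override {
    uint8_t  bus;         /* 0 = ISA */
    uint8_t  source_irq;  /* ISA IRQ number */
    uint32_t gsi;         /* Global System Interrupt it is wired to */
};

/* Translate an ISA IRQ to a GSI. If no override matches, the IRQ is
 * assumed to be identity-mapped (IRQ n -> GSI n). */
uint32_t isa_irq_to_gsi(const struct irq_override *ov, size_t n, uint8_t irq)
{
    for (size_t i = 0; i < n; i++) {
        if (ov[i].bus == 0 && ov[i].source_irq == irq)
            return ov[i].gsi;
    }
    return irq;
}
```

With the two overrides above in the table, IRQ 0 comes out as GSI 2, IRQ 9 as GSI 9, and everything else maps to itself.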

Another curious thing is that ACPI reports 8 processors: 4 enabled, 4 disabled. Processors 1 and 5 shared APIC ID 0, 2 and 6 shared APIC ID 1, etc. Supposed to be a Core2 Quad. Is this evidence of hyperthreading?

Re: making the transition to real hardware

Posted: Thu Jul 23, 2009 12:01 am
by Brendan
Hi,
Matthew wrote: ACPI says that the timer IRQ0 is remapped to Global System Interrupt 2, which is fairly common, and the ACPI control interrupt is mapped to GSI 9. This is great, just like QEMU. But the Intel MP tables tell a more confusing story:

IO APIC-id: 8 version: 20 address: FEC00000
IO interrupt type: 0x03 flags: 0x05 source: (bus: 6 irq: 0) dest: (APIC: 8 int: 0)
IO interrupt type: 0x00 flags: 0x00 source: (bus: 6 irq: 1) dest: (APIC: 8 int: 1)
IO interrupt type: 0x00 flags: 0x00 source: (bus: 6 irq: 0) dest: (APIC: 8 int: 2)
.
.
.
Local interrupt type: 3 flags: 5 source: (bus: 6 irq: 0) dest: (APIC: FF int: 0)
Local interrupt type: 1 flags: 5 source: (bus: 6 irq: 0) dest: (APIC: FF int: 1)

So the first entry is a vectored interrupt from the PIC (active high, "Reserved" flag set) and it is mapped to INTIN0 on the first and only IOAPIC. But does this entry mean anything if I have the 8259A PIC masked? The third entry looks more typical of what I find from QEMU, where the ISA bus (id: 6) has IRQ0 mapped to INTIN2 also.
Originally the PIC had an interrupt request line (#INTR) connected directly to a pin on the CPU, which is used when the PIC receives an IRQ and wants to tell the CPU about it. When APICs were added, this #INTR line was either routed through a gate (the IMCR) to enable/disable it; or connected to one of the APICs (so that when the PIC wants to tell the CPU about an IRQ it actually tells an APIC, and the APIC tells the CPU that the PIC wants to send details for the IRQ). This is partly so that you can use the PICs and the I/O APICs at the same time (e.g. some IRQs enabled in the PIC and other IRQs enabled in the I/O APIC).

AFAIK, for Bochs the PIC's #INTR line is connected to I/O APIC input #0 and local APIC input #0.

Note: If you disable all IRQs in the PIC, then the PIC can still generate a spurious interrupt (a fake IRQ 7 from the master PIC, or a fake IRQ 15 from the slave PIC).
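
To help read the quoted MP table entries, here is a small decoding sketch. It assumes the field layout as I understand it from the Intel MultiProcessor Specification (interrupt type 0 = INT, 1 = NMI, 2 = SMI, 3 = ExtINT; polarity in flag bits 0-1, trigger mode in flag bits 2-3), so check it against the spec before relying on it. The enum and function names are made up for this example.

```c
#include <stdint.h>

/* Interrupt type field of an MP table interrupt entry. */
enum mp_int_type { MP_INT = 0, MP_NMI = 1, MP_SMI = 2, MP_EXTINT = 3 };

/* Polarity, flag bits 0-1. */
enum mp_polarity {
    MP_PO_BUS_DEFAULT = 0,  /* conforms to the bus specification */
    MP_PO_ACTIVE_HIGH = 1,
    MP_PO_RESERVED    = 2,
    MP_PO_ACTIVE_LOW  = 3
};

/* Trigger mode, flag bits 2-3. */
enum mp_trigger {
    MP_EL_BUS_DEFAULT = 0,  /* conforms to the bus specification */
    MP_EL_EDGE        = 1,
    MP_EL_RESERVED    = 2,
    MP_EL_LEVEL       = 3
};

/* Extract the polarity field from an entry's flags. */
enum mp_polarity mp_flags_polarity(uint16_t flags)
{
    return (enum mp_polarity)(flags & 0x3u);
}

/* Extract the trigger mode field from an entry's flags. */
enum mp_trigger mp_flags_trigger(uint16_t flags)
{
    return (enum mp_trigger)((flags >> 2) & 0x3u);
}
```

Under that layout, flags 0x05 decodes as active high and edge triggered, and type 3 is ExtINT, which would fit the PIC's #INTR line being wired to I/O APIC input 0 and to local APIC input 0 on every CPU (APIC: FF). The type 1 local entry would then be NMI on local APIC input 1.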
Matthew wrote: Another curious thing is that ACPI reports 8 processors: 4 enabled, 4 disabled. Processors 1 and 5 shared APIC ID 0, 2 and 6 shared APIC ID 1, etc. Supposed to be a Core2 Quad. Is this evidence of hyperthreading?
It's evidence that you should ignore entries in the MADT that are disabled. Typically they're only there so that the BIOS can use a fixed-size area instead of having to calculate how much space it needs for the ACPI tables - if you had a Core 2 Duo you'd probably have 2 enabled entries and 6 disabled entries, and if you had a CPU with eight logical CPUs they'd all be enabled (and in all cases the size of the "ACPI reclaimable area" and the addresses of everything in it would be exactly the same).
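
A rough sketch of skipping disabled entries while walking the MADT might look like this. It assumes the caller passes a pointer to the variable-length entry area that follows the MADT header, and that Processor Local APIC entries are type 0, 8 bytes long, with a 32-bit flags field at offset 4 whose bit 0 means "enabled". The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_LOCAL_APIC 0u
#define MADT_LAPIC_ENABLED   0x1u

/* Count enabled Processor Local APIC entries in the MADT entry area.
 * Each entry starts with a type byte and a length byte; unknown types
 * are skipped, and a malformed length ends the walk. */
size_t madt_count_enabled_cpus(const uint8_t *entries, size_t len)
{
    size_t count = 0;
    size_t off = 0;

    while (off + 2 <= len) {
        uint8_t type = entries[off];
        uint8_t elen = entries[off + 1];

        if (elen < 2 || elen > len - off)
            break;  /* malformed or truncated entry */

        if (type == MADT_TYPE_LOCAL_APIC && elen >= 8) {
            /* Flags are little-endian at offset 4 of the entry. */
            uint32_t flags = (uint32_t)entries[off + 4]
                           | ((uint32_t)entries[off + 5] << 8)
                           | ((uint32_t)entries[off + 6] << 16)
                           | ((uint32_t)entries[off + 7] << 24);
            if (flags & MADT_LAPIC_ENABLED)
                count++;
        }
        off += elen;
    }
    return count;
}
```

On the machine described above this would be expected to return 4, with the 4 disabled placeholder entries ignored.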

For a quad core CPU with hyperthreading (with hyperthreading "enabled"), ACPI should put the first logical CPU in each core at the beginning of the list, and then list the other logical CPU/s in each core at the end of the list; so you'd get:
  • MADT entry 0: APIC ID 0 (core 0, first logical CPU in core)
  • MADT entry 1: APIC ID 2 (core 1, first logical CPU in core)
  • MADT entry 2: APIC ID 4 (core 2, first logical CPU in core)
  • MADT entry 3: APIC ID 6 (core 3, first logical CPU in core)
  • MADT entry 4: APIC ID 1 (core 0, second logical CPU in core)
  • MADT entry 5: APIC ID 3 (core 1, second logical CPU in core)
  • MADT entry 6: APIC ID 5 (core 2, second logical CPU in core)
  • MADT entry 7: APIC ID 7 (core 3, second logical CPU in core)
For a quad core CPU that supports hyperthreading but with hyperthreading disabled, the first 4 entries would look the same (same APIC IDs as above) and the other 4 entries would be disabled; which isn't what you're seeing (you're getting APIC IDs 0, 1, 2, 3 and not 0, 2, 4, 6).
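
The APIC ID pattern in that hyperthreading example can be expressed as a simple bit split. This is only a sketch that hard-codes one bit for the logical CPU within a core, which matches the example list but is an assumption; real code should get the field widths from CPUID. The names are hypothetical.

```c
#include <stdint.h>

/* Assumption: one low bit of the APIC ID selects the logical CPU within
 * a core (two logical CPUs per core), as in the example list. */
#define SMT_BITS 1u

/* Core number: the APIC ID with the logical CPU bits shifted out. */
static inline unsigned apic_id_core(uint8_t apic_id)
{
    return (unsigned)apic_id >> SMT_BITS;
}

/* Logical CPU number within the core: the low SMT_BITS bits. */
static inline unsigned apic_id_thread(uint8_t apic_id)
{
    return (unsigned)apic_id & ((1u << SMT_BITS) - 1u);
}
```

With this split, APIC IDs 0, 2, 4, 6 are the first logical CPU of cores 0 to 3 and IDs 1, 3, 5, 7 are the second, whereas consecutive IDs 0, 1, 2, 3 with nothing else enabled point to four cores without hyperthreading.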


Cheers,

Brendan