Hi,
In NASM, "%include" works much as if one file were cut and pasted into another. It doesn't do anything else; in particular, it doesn't remember that you were using 16-bit code before the "%include" and automatically switch back to 16-bit afterwards.
In your code, the assembler is told to generate 16-bit code at the start of "bootsect2.asm", and it does so until it reaches the line that includes "boot/32bit_print.asm". Inside that file the assembler is told to switch to 32-bit code, and it then keeps generating 32-bit code for "boot/disk2.asm" and "load_kernel:". The end result is a crash, because the CPU is executing code assembled for 32-bit while it is still in 16-bit real mode.
To solve this, just put "bits 16" at the end of "boot/32bit_print.asm".
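As a rough sketch of what that looks like (the routine name and body below are hypothetical stand-ins, since I'm only guessing at what your "boot/32bit_print.asm" contains; the only part that matters is the last line):

```nasm
; boot/32bit_print.asm (hypothetical contents, for illustration only)
bits 32                     ; everything below is assembled as 32-bit code

print_string_pm:            ; hypothetical routine: EBX = address of a zero-terminated string
    pusha
    mov edx, 0xB8000        ; start of VGA text memory
.next_char:
    mov al, [ebx]           ; fetch the next character
    cmp al, 0
    je .done                ; stop at the terminating zero
    mov ah, 0x0F            ; attribute: white on black
    mov [edx], ax           ; write character + attribute
    inc ebx
    add edx, 2
    jmp .next_char
.done:
    popa
    ret

bits 16                     ; switch back, so code after the %include is 16-bit again
```

With that final line in place, anything that follows the "%include" in the including file is assembled as 16-bit code again.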
I didn't look too hard, but there were a few other (unrelated) things I noticed...
Near the start of "bootsect.asm" you load SS, then execute another instruction ("mov [BOOT_DRIVE], dl"), and only then load SP. This is a little dangerous, because an IRQ can occur after you've set SS but before you've set SP, and the IRQ handler would then use the new SS with the old SP and corrupt random memory. As a worst-case example, if the BIOS left SS:SP set to 0x4000:0x7E00 (in the middle of nowhere), an IRQ arriving immediately after the "mov [BOOT_DRIVE], dl" could end up using "SS:SP = 0x0000:0x7E00" and overwrite your code. To guard against this you shouldn't have anything between the "mov ss,cx" and the "mov sp,bp". That is fine on modern CPUs, because of a special hack built into the CPU that disables IRQs for the one instruction after SS is set. On ancient CPUs (8086 era) that hack can't be relied on, so you would have to do "cli" before changing SS and SP and "sti" after; but on those CPUs you're going to crash anyway (you assume the CPU supports 32-bit without checking, and these CPUs don't support 32-bit).
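A minimal sketch of the safer ordering, assuming a stack top of 0x7C00 and a "BOOT_DRIVE" variable like yours (the actual values in your code may differ, and the "cli"/"sti" pair is only needed if you care about the ancient CPUs):

```nasm
bits 16
org 0x7C00

start:
    xor cx, cx
    mov ds, cx              ; DS = 0 so that [BOOT_DRIVE] resolves correctly
    mov bp, 0x7C00          ; assumed stack top (just below the boot sector)

    cli                     ; optional belt-and-braces for very old CPUs
    mov ss, cx              ; load SS...
    mov sp, bp              ; ...and SP in the very next instruction
    sti

    mov [BOOT_DRIVE], dl    ; safe now: SS:SP is a consistent pair
    jmp $                   ; placeholder for the rest of the boot code

BOOT_DRIVE: db 0

times 510 - ($ - $$) db 0
dw 0xAA55                   ; boot signature
```

The only real change is that saving DL moves to after the stack is fully set up.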
For floppy disks you should have a BPB in the first sector, because if you don't, some poorly designed operating systems (Windows) complain that the disk is faulty and needs to be reformatted. If you do have a BPB, you can use its "sectors per track" and "number of heads" fields instead of your "SEC_COUNT" and "HEAD_COUNT" labels. This makes it easy to have (e.g.) a utility that generates disk images, or a utility that formats floppy disks, set up the BPB to suit the size of the floppy (e.g. 1440 KiB, 1680 KiB, 1200 KiB, ...), so that the same boot code works for all the different floppy disk formats.
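To illustrate, here's a sketch of a boot sector with a FAT12-style BPB filled in for a 1440 KiB floppy, plus an LBA to CHS conversion that reads the geometry from the BPB. The label names, the OEM string and the "lba_to_chs" routine are my own inventions for the example, and it assumes DS = 0:

```nasm
bits 16
org 0x7C00

    jmp short start             ; 3 bytes of jump before the BPB
    nop

bpb_oem:                db "MYOS    "       ; 8 bytes (hypothetical name)
bpb_bytes_per_sector:   dw 512
bpb_sectors_per_cluster: db 1
bpb_reserved_sectors:   dw 1
bpb_fat_count:          db 2
bpb_root_entries:       dw 224
bpb_total_sectors:      dw 2880             ; 1440 KiB
bpb_media:              db 0xF0
bpb_sectors_per_fat:    dw 9
bpb_sectors_per_track:  dw 18
bpb_heads:              dw 2
bpb_hidden_sectors:     dd 0
bpb_total_sectors_32:   dd 0
bpb_drive_number:       db 0
bpb_reserved:           db 0
bpb_ext_signature:      db 0x29
bpb_volume_id:          dd 0x12345678
bpb_volume_label:       db "NO NAME    "    ; 11 bytes
bpb_fs_type:            db "FAT12   "       ; 8 bytes

start:
    xor ax, ax
    mov ds, ax                  ; the code below assumes DS = 0
    mov ax, 1                   ; example: convert LBA 1
    call lba_to_chs
    jmp $

; In:  AX = LBA sector number
; Out: CH = cylinder (low 8 bits), DH = head, CL = sector (1-based)
lba_to_chs:
    xor dx, dx
    div word [bpb_sectors_per_track]    ; AX = LBA / SPT, DX = LBA % SPT
    inc dl                              ; sectors are numbered from 1
    mov cl, dl
    xor dx, dx
    div word [bpb_heads]                ; AX = cylinder, DX = head
    mov dh, dl
    mov ch, al                          ; enough for floppies (cylinder < 256)
    ret

times 510 - ($ - $$) db 0
dw 0xAA55
```

A disk image tool then only has to patch the BPB fields for a different format; the code itself never hard-codes the geometry.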
In NASM, labels without a colon can be dangerous. For example, if you want an instruction like "stosd" but make a typo and write "stosdd" instead, NASM will treat it as a label without a colon and you won't get any error, and then you'll spend hours trying to figure out where the bug is. To avoid this, NASM has a "warn on orphaned labels" option that warns about a label alone on a line without a colon, so you'll get a warning for "stosdd". The catch is that you then have to use colons on all your real labels.
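For example (a sketch only; as far as I know the option is spelled "-w+orphan-labels", with newer NASM versions documenting it as "-w+label-orphan", so check the documentation for the version you have):

```nasm
; Assemble with something like:
;   nasm -w+orphan-labels -f bin example.asm -o example.bin
bits 32

clear_buffer:               ; real labels get a colon
    xor eax, eax
    mov ecx, 16
    rep stosd               ; the intended instruction
    ; If the line above were mistyped so that "stosdd" ended up alone
    ; on a line, NASM would parse it as a label with no colon. With the
    ; warning enabled it gets reported instead of silently accepted.
    ret
```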
In NASM, most directives have a normal version without square brackets (e.g. "org 0x7C00", "bits 16" and "section .text") that should almost always be used, plus a special lower-level internal version with square brackets (e.g. "[org 0x7C00]", "[bits 16]" and "[section .text]") that should only ever be used inside special macros. For some of these, using the wrong version breaks features (e.g. if you use "[section .text]" instead of "section .text" you'll break the "__SECT__" macro); for others (e.g. "[org 0x7C00]" and "[bits 16]") there's currently no difference, but it may break features in future versions of NASM.
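In other words, the top of a boot sector source file would normally look something like this (a minimal sketch, not taken from your code):

```nasm
; Normal (user-level) forms: use these in ordinary source files
org 0x7C00
bits 16
section .text

start:
    jmp $

; Lower-level bracketed forms, meant for use inside macros:
;   [org 0x7C00]
;   [bits 16]
;   [section .text]
```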
Cheers,
Brendan