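You realize, Mike, that you don't need to disable interrupts yourself, right?
Unless the gate in the IDT is a trap gate, the processor itself will automatically inhibit interrupts: entering a handler through an interrupt gate clears the IF flag, and iret restores it along with the rest of EFLAGS. You can learn more about this from the Intel Software Developer's Manuals (Volume 3, the System Programming Guide).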
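To make the gate-type difference concrete, here is a minimal sketch of a 32-bit protected-mode IDT entry. The struct, the enum names and make_gate() are my own illustrative names, not from any particular kernel; only the bit layout and the 0x8E / 0x8F type bytes come from the architecture.

```c
#include <stdint.h>

/* One 32-bit protected-mode IDT gate descriptor (8 bytes). */
struct idt_gate {
    uint16_t offset_low;   /* handler address, bits 0..15  */
    uint16_t selector;     /* code segment selector        */
    uint8_t  zero;         /* reserved, must be 0          */
    uint8_t  type_attr;    /* P | DPL | 0 | gate type      */
    uint16_t offset_high;  /* handler address, bits 16..31 */
};

_Static_assert(sizeof(struct idt_gate) == 8, "IDT gate must be 8 bytes");

enum {
    GATE_INT32  = 0x8E, /* present, DPL 0, interrupt gate: CPU clears IF on entry */
    GATE_TRAP32 = 0x8F  /* present, DPL 0, trap gate: IF is left unchanged        */
};

/* Hypothetical helper: build a gate for a handler at a 32-bit address. */
struct idt_gate make_gate(uint32_t handler, uint16_t selector, uint8_t type_attr)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.zero        = 0;
    g.type_attr   = type_attr;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

With GATE_INT32 there is no need for a cli at the top of the handler; with GATE_TRAP32 the handler runs with whatever IF the interrupted code had.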
If you are clever, you will need only one Task State Segment (TSS). However, it all depends on the grand design of your microkernel.
How micro is your kernel? How ingrained is the notion of everything being a separate process?
Some people seem to forget that processes do not need to be in different address spaces or at different privilege levels to be considered separate processes. In my current project (z3-apoc : a proof of concept), the core kernel contains the scheduler, low-level IPC and memory management. These all run in separate threads, and they are all considered separate processes.
There is no 'Kernel' as such. There is a cooperating group of low-level processes, and together these form the core kernel.
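You only need one TSS, as you can dynamically update its contents at will. If you want to change the stack the CPU switches to the next time it enters ring 0 from DPL 3 code (i.e. on the first interrupt or exception after you return from interrupt ...), just change the TSS's esp0 field (and ss0, if it differs). The DPL 3 stack itself is not taken from the TSS; iret restores it from the interrupt frame. If the next task requires a different address space, load CR3 directly as well: the cr3 field in the TSS is only read on a hardware task switch, which never happens when you have a single TSS.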
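A rough sketch of the single-TSS idea, assuming software task switching on 32-bit x86, where the fields the CPU consults on a ring 3 to ring 0 transition are esp0 and ss0. The struct layout follows the architectural 104-byte TSS; the names tss32 and tss_set_kernel_stack() are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit TSS. Selector fields are widened to 32 bits; the upper
   halves are reserved by the architecture and should stay zero. */
struct tss32 {
    uint32_t prev_task;
    uint32_t esp0, ss0;   /* stack loaded on entry to ring 0 */
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
    uint32_t cr3;         /* only read on a hardware task switch */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;        /* bit 0: debug trap on task switch */
    uint16_t iomap_base;  /* offset of the I/O permission bitmap */
};

_Static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");
_Static_assert(offsetof(struct tss32, esp0) == 4, "esp0 lives at offset 4");

/* Hypothetical helper, called by the scheduler just before it returns
   to the next task: point the one shared TSS at that task's kernel stack. */
void tss_set_kernel_stack(struct tss32 *tss, uint32_t esp0, uint16_t ss0)
{
    tss->esp0 = esp0;
    tss->ss0  = ss0;
    /* An address-space change is not done through the TSS here; the
       scheduler would write CR3 itself (a privileged mov to cr3). */
}
```

The CPU reads esp0/ss0 from the TSS in memory each time it needs them, so updating the fields is enough; there is no need to reload the task register after every change.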
As for speed: save only when you have to, and save only what you must.
Your ISRs should be as small as possible and optimized for utmost speed, especially the one servicing the timer. You don't want to end up with high interrupt latency.
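As one possible shape for a small timer ISR, here is a sketch of a handler body that only counts ticks and raises a flag, leaving the actual rescheduling to code that runs outside the interrupt path. All names and the timeslice length are assumptions for illustration, and the assembly entry stub and the end-of-interrupt signal to the interrupt controller are left out.

```c
#include <stdint.h>

#define TIMESLICE_TICKS 10u  /* hypothetical timeslice length, in ticks */

static volatile uint32_t ticks;         /* timer ticks since boot          */
static volatile uint32_t need_resched;  /* set when a timeslice has ended  */

/* C body of the timer ISR, called from a small assembly stub that
   saves only the registers this function may clobber. */
void timer_isr_body(void)
{
    ticks++;
    if (ticks % TIMESLICE_TICKS == 0)
        need_resched = 1;  /* the scheduler checks this flag later */
    /* The end-of-interrupt write to the interrupt controller goes here. */
}
```

Everything expensive (choosing the next task, touching the TSS, switching address spaces) happens only when need_resched is set, not on every tick.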
Hope this helps,
~Zeii.