Note: I'm implementing hardware task switching because this is my first attempt - ergo, I can give no information on software task switching.
If you are changing privilege levels (e.g. ring 3 to ring 0), then a TSS will be required (even for software task switching), because the CPU reads the kernel stack pointer from it. This is certainly true for 32-bit protected mode, and it still holds in 64-bit long mode (EM64T), although long mode doesn't support hardware task switching. A TSS is a task state segment. It stores the state of the processor for each task. There's a nice detailed description in the Intel manuals of what the various fields are. Basically the base 32-bit TSS itself is 104 bytes long, and contains register states, etc. It gets more complicated when
a) you want to switch tasks that use the FPU: you then need to support the FXSAVE and FXRSTOR instructions to save and restore the FPU registers on task switch. These don't go in the TSS: FXSAVE writes to a 512-byte, 16-byte-aligned memory area that you supply yourself (I'm not quite there yet, though).
b) you want to start including an IO permissions map for each task. Again, I haven't quite got there yet (about all the multitasking I've implemented has been pulled out again, and was simply a round-robin of two threads with very little complexity, just for the fun of seeing it) but this involves setting the 16-bit I/O map base field in the TSS to the offset, measured from the start of the TSS, of a bitmap for the IO permissions.
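For the IO permissions map in (b), here is a rough sketch of how the bitmap itself could be handled. The names (io_bitmap, iopb_allow, etc.) are made up for illustration and I haven't run this inside a kernel. The facts it relies on are from the Intel manuals: one bit per port, a set bit means access is denied, and the map must be followed by a byte of all 1s.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IOPB_PORTS 65536u              /* the x86 I/O space has 64K ports */
#define IOPB_BYTES (IOPB_PORTS / 8u)   /* one bit per port */

/* Hypothetical container for a full I/O permission bitmap. */
struct io_bitmap {
    uint8_t bits[IOPB_BYTES];
    uint8_t terminator;                /* must be 0xFF, directly after the map */
};

/* Start from "everything denied": a set bit means the port is not allowed. */
static void iopb_deny_all(struct io_bitmap *m)
{
    memset(m->bits, 0xFF, sizeof m->bits);
    m->terminator = 0xFF;
}

/* Clear the bit for one port so the task may access it. */
static void iopb_allow(struct io_bitmap *m, uint16_t port)
{
    m->bits[port >> 3] &= (uint8_t)~(1u << (port & 7u));
}

/* Set the bit for one port so access faults again. */
static void iopb_deny(struct io_bitmap *m, uint16_t port)
{
    m->bits[port >> 3] |= (uint8_t)(1u << (port & 7u));
}

/* Returns non-zero if the bitmap lets the task touch this port. */
static int iopb_is_allowed(const struct io_bitmap *m, uint16_t port)
{
    return (m->bits[port >> 3] & (1u << (port & 7u))) == 0;
}
```

In a real kernel the bitmap would sit in the same segment as the TSS (usually right after the 104 bytes), with the TSS limit extended to cover it and the I/O map base field set to its offset.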
A TSS also contains a backlink to the previous task (for hardware task switching of nested tasks) which, IIRC, is the segment selector of the descriptor pointing to the TSS of that previous task.
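To make the layout a bit more concrete, this is roughly what the 104-byte 32-bit TSS looks like as a C struct. The field order follows the Intel manuals, but the field names and the tss32_init helper are my own invention, so treat it as a sketch rather than anything official.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>

/* 32-bit TSS. Every field is naturally aligned, so no packing is needed. */
struct tss32 {
    uint16_t backlink, reserved0;    /* selector of the previous task's TSS */
    uint32_t esp0; uint16_t ss0, reserved1;   /* ring 0 stack */
    uint32_t esp1; uint16_t ss1, reserved2;   /* ring 1 stack */
    uint32_t esp2; uint16_t ss2, reserved3;   /* ring 2 stack */
    uint32_t cr3;                    /* page directory base for this task */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt, reserved10;
    uint16_t trap;                   /* bit 0 = T (debug trap on switch) */
    uint16_t iomap_base;             /* offset of the IO bitmap from the TSS base */
};

static_assert(sizeof(struct tss32) == 104, "base TSS must be 104 bytes");
static_assert(offsetof(struct tss32, iomap_base) == 102, "iomap_base offset");

/* Hypothetical helper: zero the TSS and set only the ring 0 stack. */
static void tss32_init(struct tss32 *t, uint16_t kernel_ss, uint32_t kernel_esp)
{
    memset(t, 0, sizeof *t);
    t->ss0 = kernel_ss;
    t->esp0 = kernel_esp;
    /* An offset at or beyond the TSS limit means "no IO bitmap". */
    t->iomap_base = (uint16_t)sizeof *t;
}
```

The ss0/esp0 pair is the part you need even with software task switching, since that is where the CPU gets the kernel stack from when coming in from ring 3.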
In hardware task switching, most of the registers (not the FPU registers, by default) are saved in the outgoing task's TSS on a task switch. CR3 is loaded from the incoming task's TSS, so page directories are switched (and the non-global TLB entries flushed) when the task switch is performed; note that the CPU does not write the current CR3 back into the old TSS, so you have to fill that field in yourself. I'm not sure how it works with PAE enabled, though (again, because it isn't relevant to me as of yet). A task switch can be performed by a far call or far jump to the selector of the TSS descriptor. It's up to you how you implement this in your scheduler.
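Before you can jump to a TSS, it needs a descriptor in the GDT. Here's a minimal sketch of encoding one, assuming an ordinary 8-byte 32-bit descriptor with byte granularity; make_tss_descriptor and make_selector are names I made up, and the access byte 0x89 means present, DPL 0, type 9 (available 32-bit TSS).

```c
#include <assert.h>
#include <stdint.h>

/* Encode an 8-byte GDT descriptor for an available 32-bit TSS. */
static uint64_t make_tss_descriptor(uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base bits 23:0   */
    d |= (uint64_t)0x89u << 40;                   /* P=1, DPL=0, type=9 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit bits 19:16, G=0 */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 31:24  */
    return d;
}

/* A selector is the GDT index shifted left by 3, plus TI=0 and the RPL. */
static uint16_t make_selector(uint16_t gdt_index, uint16_t rpl)
{
    return (uint16_t)((gdt_index << 3) | (rpl & 3u));
}
```

With the descriptor in, say, GDT slot 5, the selector is 0x28, and the switch itself would be a far jump such as ljmp $0x28, $0 (the offset is ignored for a TSS). The limit for a bare TSS is 103, i.e. its size minus one. Remember that the very first task also needs its selector loaded with LTR, otherwise the CPU has nowhere to save the outgoing state.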
EDIT2: Also, I guess that if you didn't want pre-emptive multitasking, you could just alter the backlinks in the TSSes, and wait for each task to return.
EDIT3: Actually, that wouldn't be multitasking as such... But my point was that there was more than one way of switching the tasks, even in hardware task switching.
Hope this helped a little (and that someone more knowledgeable can both correct what I've said, and add to it).
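EDIT: Also, if you're using paging, the TSS must not straddle a page boundary (at least not within its first 104 bytes); page-aligning it is the simplest way to guarantee that. Otherwise, the task switch can fail (and throw an invalid TSS exception, #TS, which is vector 10).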
EDIT3: The Linux v0.01 kernel task switching is pretty self explanatory, if you want to get a grip on what happens.