I just gave your code a better look. The routine that prints '!' may not always loop in my program.
Ok, to make my code more understandable this is what I want to do (the bigger picture)...
>kernel loads demo into memory (demo is the code that sets the 1C handler and prints '!' or whatever)
>kernel jumps to demo in memory
>Demo sets INT 1C to *its* handler
>Demo starts running *its* code, i.e. '!' or whatever it may be (could be anything, it's an application/task)
>keeps looping the code if needed (it's an application/task, who knows what it may do)
>INT 1C handler called?
>If so it restores the original 1C handler and saves the Demo program's registers (like a context switch)
>jumps back to kernel
>kernel goes to next task.....
Once the kernel loads the Demo program back into memory, the Demo restores its own context and
goes back into the code where it left off.
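
A rough, untested sketch of the switch-out step from the list above (timer tick arrives, registers are saved, the original vector is restored, control goes back to the kernel). It assumes NASM syntax, that the kernel entered the task with a far CALL, and that the task saved the kernel's SS:SP and the old INT 1Ch vector when it was entered. Every label (tick, t_ax, kern_ss, old_off and so on) is made up for illustration.

```nasm
; Untested sketch: NASM syntax, 16-bit real mode. Every label here is made up.
        bits 16

tick:                               ; INT 1Ch handler living inside the task
        ; Entered with IF=0. The caller's IP, CS and FLAGS (normally the
        ; BIOS INT 08h handler's) are on top of the task stack.
        mov  [cs:t_ax], ax          ; park every register in this task's own TCB
        mov  [cs:t_bx], bx
        mov  [cs:t_cx], cx
        mov  [cs:t_dx], dx
        mov  [cs:t_si], si
        mov  [cs:t_di], di
        mov  [cs:t_bp], bp
        mov  [cs:t_ds], ds
        mov  [cs:t_es], es
        mov  [cs:t_ss], ss
        mov  [cs:t_sp], sp          ; the interrupt frame stays on the task stack

        xor  ax, ax
        mov  es, ax                 ; ES = 0000h, segment of the vector table
        mov  ax, [cs:old_off]
        mov  [es:1Ch*4], ax         ; put the original INT 1Ch vector back
        mov  ax, [cs:old_seg]
        mov  [es:1Ch*4+2], ax

        mov  ss, [cs:kern_ss]       ; back onto the kernel stack saved at task entry
        mov  sp, [cs:kern_sp]
        retf                        ; far return to just after the kernel's CALL (IF is still 0)

; TCB fields used above, one word each (assumed to be filled in at task entry)
t_ax:    dw 0
t_bx:    dw 0
t_cx:    dw 0
t_dx:    dw 0
t_si:    dw 0
t_di:    dw 0
t_bp:    dw 0
t_sp:    dw 0
t_ds:    dw 0
t_es:    dw 0
t_ss:    dw 0
kern_sp: dw 0                       ; kernel SP as it was on entry to the task
kern_ss: dw 0                       ; kernel SS as it was on entry to the task
old_off: dw 0                       ; original INT 1Ch vector, offset
old_seg: dw 0                       ; original INT 1Ch vector, segment
```

One thing to check on real hardware: as far as I know, the stock BIOS INT 08h handler sends the EOI to the interrupt controller only after INT 1Ch returns, so leaving through the handler like this may hold off further timer ticks until this task is resumed and the BIOS handler gets to finish.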
Basically I want to put the context switching in the applications so that the kernel does very
little work of its own to do this. All this must be done so that it can run on an 8086 and up.
I know this sounds like a lot of overhead for the code, but I have my reasons.

So instead of having the timer handler and context switch done inside the kernel, it's all done
inside the applications themselves. Then all the kernel must do is loop over the tasks like this...
redo:
    call task1    ; run task1 until the timer routine in its code stops it and saves its TCB
    call task2    ; task2 reloads its TCB and runs until its own timer routine stops it, etc.
    call task3    ; and so on...
    jmp  redo
Each task is kind of like a DOS TSR, except that on each re-entry it starts where it left off, like
a 386 task using TSSs would. Normally a TSR would just run its code and exit,
starting from the beginning of the code each time. That would be fine if I wanted a cooperative multitasking kernel, but I want
preemptive, and I want ALL the context switching done inside the task itself.
The Task Context Block (TCB) would be inside the task's code and belong ONLY to that task.
This way each task only has to worry about its own TCB and not the others'. The kernel worries about
nothing, since it does almost nothing in the first place.
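
A rough, untested sketch of the re-entry side, i.e. how a task could start fresh the first time and pick up where it left off on later calls. It assumes NASM syntax, that the kernel far-calls offset 0 of the task's segment and does not care about its own registers afterwards, and that the saved IP, CS and FLAGS are still sitting on the task's stack from the last tick. All labels are made up, and main and tick are only placeholders here.

```nasm
; Untested sketch: NASM syntax, 16-bit real mode. Every label here is made up.
        bits 16
        org  0                      ; assumes the kernel far-calls offset 0 of this segment

entry:
        cli                         ; no tick until the context is fully set up
        mov  [cs:kern_ss], ss       ; remember where the kernel's return address lives
        mov  [cs:kern_sp], sp

        xor  ax, ax
        mov  es, ax                 ; ES = 0000h, segment of the vector table
        mov  ax, [es:1Ch*4]
        mov  [cs:old_off], ax       ; keep the current INT 1Ch vector for later
        mov  ax, [es:1Ch*4+2]
        mov  [cs:old_seg], ax
        mov  word [es:1Ch*4], tick  ; install this task's own handler
        mov  [es:1Ch*4+2], cs

        cmp  word [cs:started], 0
        jne  resume

        mov  word [cs:started], 1   ; first run: private stack, then start the task code
        mov  ax, cs
        mov  ss, ax
        mov  sp, stack_top
        mov  ds, ax
        mov  es, ax
        sti
        jmp  main

resume:
        mov  ss, [cs:t_ss]          ; back onto the task stack, saved frame still in place
        mov  sp, [cs:t_sp]
        mov  bx, [cs:t_bx]
        mov  cx, [cs:t_cx]
        mov  dx, [cs:t_dx]
        mov  si, [cs:t_si]
        mov  di, [cs:t_di]
        mov  bp, [cs:t_bp]
        mov  ds, [cs:t_ds]
        mov  es, [cs:t_es]
        mov  ax, [cs:t_ax]
        iret                        ; pops the IP, CS and FLAGS saved at the last tick

main:   jmp  main                   ; placeholder for the task's real code ('!' or whatever)
tick:   iret                        ; placeholder: the real handler saves the context
                                    ; into the TCB and far-returns to the kernel

started: dw 0                       ; 0 = never run, 1 = a saved context exists
t_ax:    dw 0
t_bx:    dw 0
t_cx:    dw 0
t_dx:    dw 0
t_si:    dw 0
t_di:    dw 0
t_bp:    dw 0
t_sp:    dw 0
t_ds:    dw 0
t_es:    dw 0
t_ss:    dw 0
kern_sp: dw 0
kern_ss: dw 0
old_off: dw 0
old_seg: dw 0
         times 128 dw 0             ; private task stack (size is a guess)
stack_top:
```

Everything is addressed through CS, so the task does not depend on what DS or ES held when the kernel called it.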

Any ideas?
The header of the task could be as little as 256 bytes (not including its stack). The TCB would
take about 16 words (AX, BX, CX, ES, etc.), and the rest would be the timer/switch code, of course.
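
One possible way to lay out a 16-word TCB, as a sketch in NASM syntax. The field names and their order are made up, and it assumes the interrupted IP, CS and FLAGS stay on the task's own stack rather than in the TCB, which frees room for the kernel's stack pointer and the original INT 1Ch vector.

```nasm
; Sketch only: NASM syntax. Field names and order are made up for illustration.
        bits 16

struc TCB
    .r_ax:    resw 1
    .r_bx:    resw 1
    .r_cx:    resw 1
    .r_dx:    resw 1
    .r_si:    resw 1
    .r_di:    resw 1
    .r_bp:    resw 1
    .r_sp:    resw 1    ; SP at the moment of the tick; IP, CS, FLAGS sit at SS:SP
    .r_ds:    resw 1
    .r_es:    resw 1
    .r_ss:    resw 1
    .kern_sp: resw 1    ; kernel stack pointer saved when the task was entered
    .kern_ss: resw 1
    .old_off: resw 1    ; original INT 1Ch vector, offset
    .old_seg: resw 1    ; original INT 1Ch vector, segment
    .started: resw 1    ; 0 = never run, 1 = a saved context exists
endstruc

tcb:    times TCB_size db 0         ; 16 words = 32 bytes, stored inside the task image

; A field would then be reached relative to the task's own segment, e.g.
;       mov [cs:tcb + TCB.r_ax], ax
```

That is 32 bytes, so it fits easily in a 256-byte header alongside the timer/switch code.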
Has this been done before?
Is it a bad idea?
Thanks.