Oh, you mean 16 colors (4 bits) and not 16 bits (64k colors)? Then alexfru is right.
I'd highly recommend against that mode. But if you insist, the only way to make it fast is to keep a copy of the screen in RAM. Then:
- when you modify a pixel, you add a box to a list of boxes, the areas that have changed. Updating a window can mark the entire window area, no need to check every pixel
- on vertical retrace, you "flush" the screen: you iterate through that list 4 times. First you switch to plane 0, then for every box in the list you copy the 1st bit of each pixel from the RAM screen to the VGA screen. Then you switch to plane 1 and copy the 2nd bits, etc.
- when done, you clear the list
This minimizes the number of plane switches (which are slow), and updates video RAM when it is not being displayed (during a vertical retrace). Also, you can calculate the screen offsets of the boxes once and reuse them for all 4 plane writes.
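To make the steps above concrete, here is a rough sketch in C. It is not tested on real hardware: all names (shadow, fake_vram, DirtyBox, mark_dirty, put_pixel, flush_screen) are made up for illustration, fake_vram stands in for the window at 0xA0000 so the code can run on a normal machine, and I assume the RAM copy is kept planar (one bitmap per plane) so a flush is a plain byte copy. The outb() in the comment is a hypothetical port-write helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCREEN_W   640
#define SCREEN_H   480
#define ROW_BYTES  (SCREEN_W / 8)          /* 1 bit per pixel per plane */
#define PLANE_SIZE (ROW_BYTES * SCREEN_H)
#define MAX_DIRTY  64

/* Shadow copy in RAM, kept planar (assumption) so a flush is a byte copy. */
static uint8_t shadow[4][PLANE_SIZE];

/* Stand-in for video memory. On real hardware all four planes share the
   window at 0xA0000 and the selected plane decides which one is written. */
static uint8_t fake_vram[4][PLANE_SIZE];
static uint8_t *vram = fake_vram[0];
static int plane_switches;                 /* counted only for illustration */

/* A changed area, stored in byte columns so offsets are computed once. */
typedef struct { int row, col_byte, rows, bytes; } DirtyBox;
static DirtyBox dirty[MAX_DIRTY];
static int dirty_count;

static void select_plane(int plane)
{
    /* Real hardware would write the sequencer Map Mask register instead,
       e.g. outb(0x3C4, 0x02); outb(0x3C5, 1 << plane); */
    vram = fake_vram[plane];
    plane_switches++;
}

void mark_dirty(int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= SCREEN_W && y + h <= SCREEN_H);
    if (dirty_count == MAX_DIRTY) {        /* list full: redraw everything */
        dirty[0] = (DirtyBox){0, 0, SCREEN_H, ROW_BYTES};
        dirty_count = 1;
        return;
    }
    int first = x / 8, last = (x + w - 1) / 8;
    dirty[dirty_count++] = (DirtyBox){y, first, h, last - first + 1};
}

void put_pixel(int x, int y, uint8_t color)
{
    mark_dirty(x, y, 1, 1);                /* also bounds-checks x and y */
    size_t off = (size_t)y * ROW_BYTES + (size_t)(x / 8);
    uint8_t mask = (uint8_t)(0x80 >> (x % 8));   /* leftmost pixel is bit 7 */
    for (int p = 0; p < 4; p++) {
        if (color & (1 << p)) shadow[p][off] |= mask;
        else                  shadow[p][off] &= (uint8_t)~mask;
    }
}

/* Call this during vertical retrace: 4 plane switches, however many boxes. */
void flush_screen(void)
{
    if (dirty_count == 0) return;
    for (int p = 0; p < 4; p++) {
        select_plane(p);
        for (int i = 0; i < dirty_count; i++) {
            const DirtyBox *b = &dirty[i];
            for (int r = 0; r < b->rows; r++) {
                size_t off = (size_t)(b->row + r) * ROW_BYTES + (size_t)b->col_byte;
                memcpy(vram + off, shadow[p] + off, (size_t)b->bytes);
            }
        }
    }
    dirty_count = 0;                       /* list is cleared after the flush */
}
```

On a real VGA you would wait for the retrace before calling flush_screen(), for example by polling bit 3 of the Input Status #1 register at port 0x3DA. A window manager would call mark_dirty() once per redrawn window instead of once per pixel.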
If I were you, I'd rather consider using an SVGA mode at least (or even better, VESA). If you don't use the VESA 2.0 extensions, some modes have fixed mode values, and you can set them with INT 10h/AH=0. But you really should use VESA (INT 10h/AH=4Fh) instead; that's more widely supported and also provides linear frame buffer access (no need to switch banks).
Bank switching is necessary because the VGA has only a 64k MMIO window at 0xA0000, but a 640x480x16 (16 bits, not colors) mode requires 600k of memory (640 x 480 x 2 = 614,400 bytes), so you have to map one 64k piece of that 600k at 0xA0000 at a time. Linear frame buffers use an address other than 0xA0000 (typically somewhere high such as 0xE0000000; VESA reports the actual address in the mode info) and they are contiguous. They sit above 1M, meaning you can only access them from protected mode, but in return you can use the entire 600k as-is.
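The bank arithmetic can be sketched like this in C. It is only the address calculation, with assumptions: the names (BankedAddr, banked_addr, banks_needed) are mine, the window size and bank granularity are both taken to be 64k (a real card reports its granularity through VESA and it may differ), and the pitch is passed in rather than assumed to equal width times 2.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_SIZE 65536u   /* assumed window size and granularity (64k) */

typedef struct {
    uint32_t bank;     /* which 64k piece has to be mapped at 0xA0000 */
    uint32_t offset;   /* byte offset of the pixel inside that window */
} BankedAddr;

/* Split the linear offset of pixel (x, y) into bank number and offset. */
BankedAddr banked_addr(uint32_t x, uint32_t y,
                       uint32_t pitch, uint32_t bytes_per_pixel)
{
    assert(x * bytes_per_pixel < pitch);   /* pixel must lie inside the row */
    uint32_t linear = y * pitch + x * bytes_per_pixel;
    BankedAddr a = { linear / BANK_SIZE, linear % BANK_SIZE };
    return a;
}

/* Number of banks needed to cover the whole frame buffer (rounded up). */
uint32_t banks_needed(uint32_t pitch, uint32_t height)
{
    return (pitch * height + BANK_SIZE - 1) / BANK_SIZE;
}
```

For 640x480 at 16 bits with a pitch of 1280 bytes, banks_needed(1280, 480) gives 10, and the last pixel (639, 479) lands in bank 9 at offset 24574. With a linear frame buffer the same pixel is simply at base + 614398, with no bank switch at all.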
Cheers,
bzt