Doublebuffering
Posted: Sun Jul 01, 2007 3:55 am
I'm writing a VESA driver. In your opinion, should I implement double buffering in the driver itself, or is double buffering something only applications do?
Dirty rectangles would help.
pcmattman wrote:
ATM, my VESA driver automatically does all the double buffering it needs to do, for the entire screen. How else can you get the performance you need?
Note: dirty rectangles + a double buffer would probably be better.
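To make the "dirty rectangles + a double buffer" idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the Screen structure, the function names, the fixed 32 bpp pixel format, and the choice to track a single accumulated dirty rectangle. A real driver would take the frame buffer address, pitch and resolution from the VBE mode information and might keep a list of rectangles instead of one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_PIXEL 4u  /* assumption: 32 bpp mode */

/* Hypothetical driver state; field names are illustrative only. */
typedef struct {
    uint8_t *front;          /* linear frame buffer (video memory) */
    uint8_t *back;           /* back buffer in ordinary RAM */
    uint32_t width, height;  /* visible size in pixels */
    uint32_t pitch;          /* bytes per scan line, same for both buffers */
    uint32_t x0, y0, x1, y1; /* accumulated dirty rectangle, x1/y1 exclusive */
    int dirty;               /* non-zero if the rectangle is valid */
} Screen;

/* Grow the dirty rectangle so that it also covers (x, y, w, h). */
static void screen_mark_dirty(Screen *s, uint32_t x, uint32_t y,
                              uint32_t w, uint32_t h)
{
    assert(x + w <= s->width && y + h <= s->height);
    if (w == 0 || h == 0)
        return;
    if (!s->dirty) {
        s->x0 = x; s->y0 = y; s->x1 = x + w; s->y1 = y + h;
        s->dirty = 1;
        return;
    }
    if (x < s->x0) s->x0 = x;
    if (y < s->y0) s->y0 = y;
    if (x + w > s->x1) s->x1 = x + w;
    if (y + h > s->y1) s->y1 = y + h;
}

/* Draw into the back buffer only, and remember what changed. */
static void screen_fill_rect(Screen *s, uint32_t x, uint32_t y,
                             uint32_t w, uint32_t h, uint32_t color)
{
    assert(x + w <= s->width && y + h <= s->height);
    for (uint32_t row = y; row < y + h; row++)
        for (uint32_t col = x; col < x + w; col++)
            memcpy(s->back + (size_t)row * s->pitch
                           + (size_t)col * BYTES_PER_PIXEL,
                   &color, BYTES_PER_PIXEL);
    screen_mark_dirty(s, x, y, w, h);
}

/* Copy only the dirty rows/columns to video memory.
   Returns the number of bytes actually transferred. */
static size_t screen_flush(Screen *s)
{
    size_t copied = 0;
    if (!s->dirty)
        return 0;
    size_t row_bytes = (size_t)(s->x1 - s->x0) * BYTES_PER_PIXEL;
    for (uint32_t row = s->y0; row < s->y1; row++) {
        size_t off = (size_t)row * s->pitch
                   + (size_t)s->x0 * BYTES_PER_PIXEL;
        memcpy(s->front + off, s->back + off, row_bytes);
        copied += row_bytes;
    }
    s->dirty = 0;
    return copied;
}
```

With this layout applications (or the driver's own drawing routines) only ever touch the back buffer, and screen_flush() moves just the changed region across the bus. The single-rectangle approach degrades to a full-screen copy when two small changes are far apart, which is the usual argument for keeping several rectangles.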
Code:
/* Pseudocode: preload(), precache(), prefetch(), sendDataToVideo() and cacheFlush() are placeholders */

/* Prefetch data for first 4 cache lines in first page */
preload(address); // Preload for TLB miss
precache(address + cacheLineSize);
precache(address + cacheLineSize * 2);
precache(address + cacheLineSize * 3);

/* calculate control variables, etc here while the CPU is getting stuff from RAM */

/* Do all pages except the last one */
for(page = 0; page < lastPage - 1; page++) {
    for(cacheLine = 0; cacheLine < lastCacheLine - 4; cacheLine++) {
        /* Do all cache lines except the last 4 */
        sendDataToVideo(address);
        cacheFlush(address);
        prefetch(address + cacheLineSize * 4);
        address += cacheLineSize;
    }

    /* Do the remaining 4 cache lines in this page */
    sendDataToVideo(address);
    cacheFlush(address);
    preload(address + cacheLineSize * 4); // Preload for TLB miss (first line of the next page)
    address += cacheLineSize;
    sendDataToVideo(address);
    cacheFlush(address);
    prefetch(address + cacheLineSize * 4);
    address += cacheLineSize;
    sendDataToVideo(address);
    cacheFlush(address);
    prefetch(address + cacheLineSize * 4);
    address += cacheLineSize;
    sendDataToVideo(address);
    cacheFlush(address);
    prefetch(address + cacheLineSize * 4);
    address += cacheLineSize;
}

/* Do the last page */
for(cacheLine = 0; cacheLine < lastCacheLine - 4; cacheLine++) {
    /* Do all cache lines except the last 4 */
    sendDataToVideo(address);
    cacheFlush(address);
    prefetch(address + cacheLineSize * 4);
    address += cacheLineSize;
}

/* Do the remaining 4 cache lines without prefetching more data */
sendDataToVideo(address);
cacheFlush(address);
address += cacheLineSize;
sendDataToVideo(address);
cacheFlush(address);
address += cacheLineSize;
sendDataToVideo(address);
cacheFlush(address);
address += cacheLineSize;
sendDataToVideo(address);
cacheFlush(address);
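For anyone who wants to experiment with the prefetch-ahead pattern from the pseudocode, here is a rough, compilable C sketch. It is not the code above translated one-to-one. It assumes GCC or Clang (for __builtin_prefetch), assumes a 64-byte cache line, uses a bounds check instead of the unrolled tail, and leaves out both the cache flush and the separate TLB preload. The function name and the constants are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE_SIZE   64u /* assumption: typical x86 cache line size */
#define PREFETCH_DISTANCE 4u  /* prefetch this many lines ahead, as in the pseudocode */

/* Copy 'bytes' (a multiple of the cache line size) from the back buffer to
   the destination, asking the CPU to fetch source data a few lines ahead. */
static void copy_with_prefetch(uint8_t *dst, const uint8_t *src, size_t bytes)
{
    assert(bytes % CACHE_LINE_SIZE == 0);
    size_t lines = bytes / CACHE_LINE_SIZE;

    /* Warm-up: request the first few lines before the copy loop starts. */
    for (size_t i = 0; i < PREFETCH_DISTANCE && i < lines; i++)
        __builtin_prefetch(src + i * CACHE_LINE_SIZE, 0, 0);

    for (size_t i = 0; i < lines; i++) {
        /* Request the line PREFETCH_DISTANCE ahead, if there is one.
           Arguments: address, 0 = read, 0 = no temporal locality. */
        if (i + PREFETCH_DISTANCE < lines)
            __builtin_prefetch(src + (i + PREFETCH_DISTANCE) * CACHE_LINE_SIZE, 0, 0);

        /* Stand-in for sendDataToVideo(): copy one cache line. */
        memcpy(dst + i * CACHE_LINE_SIZE, src + i * CACHE_LINE_SIZE,
               CACHE_LINE_SIZE);
    }
}
```

A prefetch is only a hint, so the result is the same with or without it; whether it is actually faster depends on the CPU and has to be measured.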
Thank you! I have already implemented setting the AGP transfer rate, FastWrite and SideBand Addressing, and the use of MMX/SSE/SSE2 (strangely, in Bochs MMX is faster than SSE). I only need to implement prefetching and write-combining. How can I do this?
Brendan wrote:
For "transfer management" the idea is to minimise the number of transfers across the bus by doing large transfers instead of small ones (e.g. transfer 64 bytes at a time, rather than doing 64 single-byte transfers) and improving the speed of each transfer (e.g. increasing/setting the AGP transfer rate).
For information about setting the AGP transfer rate, see this wiki page.
For increasing the size of the transfer across the bus, you could use "write combining" (if supported), and MMX/SSE (if supported) or 32-bit transfers.
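As a small illustration of "large transfers instead of small ones", the sketch below copies the same data once with byte-sized stores and once with 32-bit stores, and returns how many stores were issued. The function names are hypothetical, and the destination is declared volatile only to model device memory where every store is a separate access; on real hardware the bus behaviour also depends on caching attributes such as write-combining.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One store per byte: n separate writes to the destination. */
static size_t blit_bytes(volatile uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    return n;
}

/* One store per 32-bit word: n / 4 writes for the same data.
   n must be a multiple of 4 and dst must be 4-byte aligned. */
static size_t blit_words32(volatile uint32_t *dst, const uint8_t *src, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n / 4; i++) {
        uint32_t word;
        memcpy(&word, src + i * 4, sizeof word); /* alignment-safe load */
        dst[i] = word;
    }
    return n / 4;
}
```

For a 64-byte span that is 64 stores versus 16. MMX or SSE registers widen each store further (8 or 16 bytes).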
For write-combining, first check if the CPU supports it. Pentium Pro and later do support it (I'm not sure when AMD started supporting it though). To enable it you need to find out where display memory is and then set that area to "write-combining" using the MTRRs (typically for LFB it means setting an unused variable range MTRR to cover the area).
MarkOS wrote:
Thank you! I have already implemented setting the AGP transfer rate, FastWrite and SideBand Addressing, and the use of MMX/SSE/SSE2 (strangely, in Bochs MMX is faster than SSE). I only need to implement prefetching and write-combining. How can I do this?
Do you have some code for setting an area to write-combining using MTRRs?
Brendan wrote:
For write-combining, first check if the CPU supports it. Pentium Pro and later do support it (I'm not sure when AMD started supporting it though). To enable it you need to find out where display memory is and then set that area to "write-combining" using the MTRRs (typically for LFB it means setting an unused variable range MTRR to cover the area).
I normally don't give code, as people who do "cut & paste OS development" end up with a patchwork they don't understand, and I'd rather help people write their own code (that they do understand).
MarkOS wrote:
Do you have some code for setting an area to write-combining using MTRRs?
Thank you, Brendan!
Brendan wrote:
I normally don't give code, as people who do "cut & paste OS development" end up with a patchwork they don't understand, and I'd rather help people write their own code (that they do understand).
BTW there's a little information (and a little code) in this post about enabling MTRRs. I'd still recommend reading and understanding the relevant part of Intel's manual though...
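In that spirit, the following is only a sketch of the arithmetic behind a variable-range MTRR, not a drop-in routine. It computes the values that would go into one IA32_MTRR_PHYSBASEn / IA32_MTRR_PHYSMASKn pair to mark a range as write-combining, and decodes the two IA32_MTRRCAP fields mentioned above (WC support and the number of variable ranges). The MSR numbers and bit positions are the ones given in Intel's manual; the function names, the MtrrPair type and the example LFB address are made up. Reading and writing the MSRs (rdmsr/wrmsr in ring 0), finding a free MTRR, and the cache-disable/flush sequence Intel requires around the update are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_MTRRCAP        0xFEu   /* capabilities: VCNT, WC support */
#define MSR_IA32_MTRR_PHYSBASE0 0x200u  /* pair n is at 0x200 + 2n, 0x201 + 2n */
#define MSR_IA32_MTRR_PHYSMASK0 0x201u

#define MTRRCAP_VCNT_MASK   0xFFull       /* bits 7:0, number of variable ranges */
#define MTRRCAP_WC          (1ull << 10)  /* write-combining type supported */
#define MTRR_TYPE_WC        0x01ull       /* memory type encoding for WC */
#define MTRR_PHYSMASK_VALID (1ull << 11)  /* enables this base/mask pair */

typedef struct {
    uint64_t phys_base; /* value for IA32_MTRR_PHYSBASEn */
    uint64_t phys_mask; /* value for IA32_MTRR_PHYSMASKn */
} MtrrPair;

static int mtrrcap_supports_wc(uint64_t cap)
{
    return (cap & MTRRCAP_WC) != 0;
}

static unsigned mtrrcap_variable_count(uint64_t cap)
{
    return (unsigned)(cap & MTRRCAP_VCNT_MASK);
}

/* A single variable range must be a power of two in size (at least 4 KiB)
   and its base must be aligned to that size. */
static int mtrr_range_ok(uint64_t base, uint64_t size)
{
    return size >= 4096 && (size & (size - 1)) == 0 && (base & (size - 1)) == 0;
}

/* phys_bits is the physical address width, e.g. from CPUID leaf 0x80000008
   (EAX bits 7:0), or 36 on older CPUs without that leaf. */
static MtrrPair mtrr_wc_values(uint64_t base, uint64_t size, unsigned phys_bits)
{
    assert(mtrr_range_ok(base, size));
    assert(phys_bits >= 32 && phys_bits <= 52);

    /* Address bits 12 .. phys_bits-1 are the only ones the MTRR stores. */
    uint64_t addr_mask = ((1ull << phys_bits) - 1) & ~0xFFFull;

    MtrrPair pair;
    pair.phys_base = (base & addr_mask) | MTRR_TYPE_WC;
    pair.phys_mask = (~(size - 1) & addr_mask) | MTRR_PHYSMASK_VALID;
    return pair;
}
```

For example, a 16 MiB LFB at the (made-up) address 0xE0000000 on a CPU with 36 physical address bits gives phys_base = 0xE0000001 and phys_mask = 0xFFF000800. An LFB whose size is not a power of two, or whose base is not aligned to its size, needs more than one range.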