
Re:VESA is too slow !!!

Posted: Sun Nov 07, 2004 6:05 pm
by Xardfir
Although it sounds good at first, using the frame buffer memory for more structures is actually a bad idea. The reason is that updating a window (either its data or other structures) means you have to send all the information across the data bus, putt putt putt, slug slug slug - you get the idea.
Having the windows and their data in main memory allows you to fiddle with everything at max speed, composite an actual screen and then blit that one screen across the bus. This can easily achieve 60 fps or more if you tie updates to the system clock. (Just that one screen - nothing else).
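
To make the idea concrete, here is a minimal sketch in C of compositing in main memory and then sending one finished frame across the bus. Everything in it is an assumption for illustration: the 320x200 32bpp mode, the names back_buffer, fill_rect and present, and the availability of memcpy (a freestanding kernel would need its own copy routine). The lfb pointer and pitch would come from the VESA mode information.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed mode for the sketch: 320x200, 32 bits per pixel. */
#define SCREEN_W 320
#define SCREEN_H 200

/* Back buffer in ordinary (cached) system RAM; all drawing happens here. */
static uint32_t back_buffer[SCREEN_W * SCREEN_H];

/* Fill a rectangle in the back buffer, clipped to the screen edges. */
void fill_rect(int x, int y, int w, int h, uint32_t colour)
{
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = x + w > SCREEN_W ? SCREEN_W : x + w;
    int y1 = y + h > SCREEN_H ? SCREEN_H : y + h;

    for (int row = y0; row < y1; row++)
        for (int col = x0; col < x1; col++)
            back_buffer[row * SCREEN_W + col] = colour;
}

/* Copy the finished frame to video memory, one scanline at a time.
 * pitch is the number of bytes per scanline in video memory, which
 * may be larger than SCREEN_W * 4. */
void present(void *lfb, size_t pitch)
{
    uint8_t *dst = (uint8_t *)lfb;

    for (int row = 0; row < SCREEN_H; row++)
        memcpy(dst + (size_t)row * pitch,
               &back_buffer[row * SCREEN_W],
               SCREEN_W * sizeof(uint32_t));
}
```

All window drawing goes through fill_rect (or similar routines) and touches only system RAM; present is the single place where data crosses the bus, once per frame.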

The 64/128/256 megs of memory you have in your video card are mainly there for texturing or storing triangles produced by the vertex engines of your VPU. It was never designed for general purpose data storage. In games like Quake etc, once textures are loaded, very little info is sent to the graphics card. This is the entire point of an accelerator - get the GPU to do all the dirty work, leave the processor for AI.

My first attempts at any graphics (totally separate from OS development) were with Turbo C and 16 bit assembler. 60fps is easily achievable even without a linear framebuffer. The software graphics pipeline is the killer, not the hardware. This includes Quake et al. Even through Windows DirectX, if the system used to produce your display lists isn't set up properly you'll never see more than 24fps no matter how good your drivers or hardware are. To my knowledge, NVIDIA / ATI cards currently support quad buffering in hardware.

I agree with Brendan's point about caches not being set up properly. When using a large VESA frame, those caches will only spit out their data when forced to. All those stosd's from the memset are probably piling up in the cache and only being sent to the video buffer every 4K or so.

Stay safe.

Re:VESA is too slow !!!

Posted: Wed Nov 10, 2004 4:43 pm
by Dreamsmith
Xardfir wrote:The 64/128/256 megs of memory you have in your video card are mainly there for texturing or storing triangles produced by the vertex engines of your VPU. It was never designed for general purpose data storage. In games like Quake etc, once textures are loaded, very little info is sent to the graphics card. This is the entire point of an accelerator - get the GPU to do all the dirty work, leave the processor for AI.
Indeed, if you're doing something that requires more than 25 FPS, you almost certainly don't want to be blitting buffers to the screen, you want to be sending commands to the GPU.

Re:VESA is too slow !!!

Posted: Wed Nov 10, 2004 6:12 pm
by Slasher
Could you (Brendan and co) explain what you mean by setting up the cache? And where do you set up the cache?
Thanks

Re:VESA is too slow !!!

Posted: Thu Nov 11, 2004 2:29 am
by Brendan
Hi,
Code Slasher wrote: Could you (Brendan and co) explain what you mean by setting up the cache? And where do you set up the cache?
A physical address may use one of these methods of caching:
Strong uncacheable (UC)
Uncacheable (UC-)
Write combining (WC)
Write through (WT)
Write back (WB)
Write protected (WP)

Caching is controlled by several different things. First there's the CD (cache disable) bit in CR0, which is like a master switch for all caching, and the NW bit in CR0 that determines if writes go directly to memory or not. Normally CR0 should be configured to enable caches (CD clear) and enable write-back/write-through caching (NW clear).
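
As a rough illustration of the CR0 part, the sketch below works only on a CR0 value that has already been read; actually reading and writing CR0 needs ring 0 code (mov to/from CR0) and is left out. The bit positions (NW is bit 29, CD is bit 30) are the architectural ones; the function names are made up for this example.

```c
#include <stdint.h>

#define CR0_NW (1u << 29)   /* Not Write-through */
#define CR0_CD (1u << 30)   /* Cache Disable */

/* Return the CR0 value with caching enabled in the normal way:
 * CD clear and NW clear, all other bits untouched. */
uint32_t cr0_enable_caching(uint32_t cr0)
{
    return cr0 & ~(CR0_CD | CR0_NW);
}

/* Non-zero if the given CR0 value has both CD and NW clear. */
int cr0_caching_enabled(uint32_t cr0)
{
    return (cr0 & (CR0_CD | CR0_NW)) == 0;
}
```

A kernel would read CR0, check it with cr0_caching_enabled, and write back the result of cr0_enable_caching if the check fails.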

The next level of caching is the MTRRs (Memory Type Range Registers). These should be configured properly by the BIOS, and are used to control default caching type for physical memory ranges.
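
For reference, this is roughly what describing one range in a variable-range MTRR looks like, for example marking a linear frame buffer as write combining. It is only a sketch of the encoding: the struct and function names are invented here, the physical address width (phys_bits) is assumed to be known (36 on many CPUs of this era, normally found via CPUID), and the MSR writes are omitted. Changing MTRRs for real also requires following the update procedure in the Intel manual.

```c
#include <stdint.h>

/* Memory type encodings used in the MTRRs. */
enum {
    MTRR_TYPE_UC = 0,   /* uncacheable */
    MTRR_TYPE_WC = 1,   /* write combining */
    MTRR_TYPE_WT = 4,   /* write through */
    MTRR_TYPE_WP = 5,   /* write protected */
    MTRR_TYPE_WB = 6    /* write back */
};

#define MTRR_MASK_VALID (1ull << 11)   /* valid bit in the PHYSMASK register */

/* Hypothetical holder for one PHYSBASE/PHYSMASK register pair. */
struct mtrr_pair {
    uint64_t base;
    uint64_t mask;
};

/* Encode a range for a variable-range MTRR pair.
 * The size must be a power of two of at least 4 KiB and the start
 * address must be aligned to the size. Returns 0 on success, -1 if
 * the range cannot be described by a single pair. */
int mtrr_encode(uint64_t phys, uint64_t size, unsigned type,
                unsigned phys_bits, struct mtrr_pair *out)
{
    if (phys_bits < 32 || phys_bits > 52)
        return -1;
    if (size < 4096 || (size & (size - 1)) != 0)
        return -1;
    if ((phys & (size - 1)) != 0)
        return -1;

    /* Address bits 12 .. phys_bits-1 are the ones the MTRR compares. */
    uint64_t addr_mask = ((1ull << phys_bits) - 1) & ~0xFFFull;

    out->base = (phys & addr_mask) | type;
    out->mask = (~(size - 1) & addr_mask) | MTRR_MASK_VALID;
    return 0;
}
```

With a 16 MiB frame buffer at 0xE0000000 and 36 address bits, this gives base 0xE0000001 and mask 0xFFF000800.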

There are also several flags in page directory entries and page table entries, which allow caching to be configured for each individual page. For older CPUs there's PCD and PWT, which control caching for the page (or page table). These are similar to the CD and NW flags in CR0. Newer CPUs have an additional PAT flag in page table entries, which combines with the older caching flags to form a 3 bit index into the "Page Attribute Table". The CPU uses the 3 bit index to find the PAT entry, which determines the type of caching to use. The page attribute table is configured via an MSR and allows more specific types of caching to be used (e.g. write combining). The PAT defaults to values that are compatible with the older use of the first 2 flags (PCD & PWT). Also there are PCD and PWT flags in CR3, which control caching for accesses to the page directory itself.
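
The 3 bit index can be sketched like this for a 4 KiB page table entry, where PWT is bit 3, PCD is bit 4 and PAT is bit 7. The function names are made up for the example, and PAT_DEFAULT is the power-on value of the IA32_PAT MSR as I understand it (WB, WT, UC-, UC, repeated twice), so treat it as an assumption to check against the manual.

```c
#include <stdint.h>

/* Caching flags in a 4 KiB page table entry. */
#define PTE_PWT (1u << 3)
#define PTE_PCD (1u << 4)
#define PTE_PAT (1u << 7)

/* Memory type encodings used in PAT entries. */
enum {
    PAT_UC       = 0,   /* strong uncacheable */
    PAT_WC       = 1,   /* write combining */
    PAT_WT       = 4,   /* write through */
    PAT_WP       = 5,   /* write protected */
    PAT_WB       = 6,   /* write back */
    PAT_UC_MINUS = 7    /* uncacheable (UC-) */
};

/* Assumed power-on value of the IA32_PAT MSR, one byte per entry. */
#define PAT_DEFAULT 0x0007040600070406ull

/* Build the index into the page attribute table: PAT:PCD:PWT. */
unsigned pat_index(uint32_t pte)
{
    return ((pte & PTE_PAT) ? 4u : 0u) |
           ((pte & PTE_PCD) ? 2u : 0u) |
           ((pte & PTE_PWT) ? 1u : 0u);
}

/* Look up the memory type stored in entry 'index' (0..7) of a PAT value. */
unsigned pat_type(uint64_t pat_msr, unsigned index)
{
    return (unsigned)((pat_msr >> (index * 8)) & 0x7);
}
```

With the default table, a page with only PCD and PWT set selects entry 3 (strong uncacheable), which matches the pre-PAT behaviour. To get write combining, an OS would typically reprogram one of the entries to PAT_WC and set the flags that select it.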

Table 10-7 in the Intel System Programmer's Guide shows how different MTRR values and PAT values combine to determine the effective type of caching actually used by the CPU.

On top of all of this there are instructions for explicit cache control: INVD, WBINVD, PREFETCHh and CLFLUSH.

None of the above applies to TLB caches though (see INVLPG, PGE flag in CR4, G flag in page directory & page table entries).

Once all of the above is sorted out there's CPU specific cache configuration (if any). Cyrix CPUs are especially difficult to get right (see e.g. http://www.computersmadeasy.com/Install/6X86OPT.TXT). For AMD's K6 CPUs it's good to enable write-back caching.


Cheers,

Brendan