VESA is too slow !!!
Posted: Thu Nov 04, 2004 8:15 am
by Cemre
VESA is too slow... >:(
I can only push 66 MB/s to the framebuffer
(1024*768 at 32 bpp), which is about 25 fps,
and the AGP thread here,
"A small tutorial in AGP speed setup",
doesn't speed things up for me!
Can someone help me?
PS: Is there any documentation about nvidia cards on the internet?
PPS: The fps test is done with a classic memset:
mov ecx , 1024*768
mov edi , GUI_FRAMEBUFFER
xor eax , eax
cld
rep stosd
I tried a 64-bit memset version using the FPU too... it is not faster
(not much slower either).
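As a sanity check on those numbers, here is a small C sketch of the arithmetic. The function names are made up for illustration, and treating 66 MB/s as 66,000,000 bytes per second is an assumption.

```c
#include <stdio.h>

/* Bytes needed for one full frame at the given resolution and colour depth. */
static unsigned long frame_bytes(unsigned width, unsigned height, unsigned bpp)
{
    return (unsigned long)width * height * (bpp / 8);
}

/* Full-screen fills per second that a given write bandwidth (bytes/s) allows. */
static double fills_per_second(double bytes_per_sec, unsigned long bytes_per_frame)
{
    return bytes_per_sec / (double)bytes_per_frame;
}

/* Prints the frame size and the resulting fill rate for one mode. */
static void report(unsigned width, unsigned height, unsigned bpp, double bytes_per_sec)
{
    unsigned long fb = frame_bytes(width, height, bpp);
    printf("%ux%ux%u: %lu bytes/frame, %.1f fills/s\n",
           width, height, bpp, fb, fills_per_second(bytes_per_sec, fb));
}
```

Calling `report(1024, 768, 32, 66e6)` gives 3,145,728 bytes per frame and roughly 21 full-screen fills per second, which is the same ballpark as the figure above.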
Re:VESA is god d*mn slow !!!
Posted: Thu Nov 04, 2004 8:43 am
by distantvoices
First, slow down a bit.
Take into consideration that your eyes perceive everything from 25 fps upwards as continuous movement. Ask any film maker, they'll tell you. It is because of afterimages lingering in the cells of the retina (for god's sake, there are dictionaries): they interleave with incoming data and produce the impression of movement. The cells in your retina take their time to produce pictures. Sometimes, if a picture remains still and has a flat color scheme, they simply go into power-save mode too.
In short: everything faster than 24-26 fps can be considered luxury. The human brain wouldn't perceive the difference. Do you need more for a simple GUI, or even for rendering movies?
Then, think about this: speak slow and think quick.
stay safe
Re:VESA is god d*mn slow !!!
Posted: Thu Nov 04, 2004 8:54 am
by Solar
BI, while what you say is true, that's for video playback. It's also sufficient for hobbyist OS 2D desktops.
In games, however, higher framerates do make sense, as e.g. the movement of your Doom3 opponent isn't hacked into 25 distinct positions per second. That does make a difference to the human eye!
That being said: VESA mode hasn't been designed to allow high framerates. 66 MB/s is about 50% of the maximum PCI bandwidth (133 MB/s for 32-bit, 33 MHz PCI); I'd consider that quite good already.
If you want to get much faster, you'd have to do "real" 3D, which would require you to write drivers for every 3D chipset you want to support. (That's why VESA is so handy: it's a standard, GPU ABIs aren't.)
As for "is there info on...", that's what Google excels at.
Re:VESA is god d*mn slow !!!
Posted: Thu Nov 04, 2004 9:25 am
by distantvoices
Oh, the games... As I don't play 3D ego-shooters, having no taste for that kind of game, I lack the necessary Verständnis (time for the dictionary: sympathy) for the need for higher frame rates. Mark: not comprehension, sympathy. *gg* Yes, I *am* biased.
For my own average 2D desktop hobbyist OS, 24 fps will wonderfully suffice.
Remember ELITE on good old c64?
Re:VESA is god d*mn slow !!!
Posted: Thu Nov 04, 2004 11:29 am
by ASHLEY4
People want the highest screen size and the highest fps;
it just does not go.
For a good test of the fps of all VESA modes and screen sizes, try this program. It is only tested on 98 and may need DPMI for DOS.
http://board.flatassembler.net/viewtopi ... 89&start=0
Go to the post by "ASHLEY4"; it's called "veastest.zip".
\\\\||////
(@@)
ASHLEY4.
Batteries not included,Some assembly required.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 1:34 am
by Solar
beyond infinity wrote:
Oh, the games... As I don't play 3D ego-shooters, having no taste for that kind of game...
I don't either. But I like 3D flight simulators, 3D realtime and turn-based strategy games and the like.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 5:25 am
by Xardfir
For starters, here is the bandwidth you can get:
PCI / AGP 1x: 266 MB/s (4 bytes * 66 MHz)
AGP 2x: 528 MB/s (8 bytes * 66 MHz)
AGP 4x: 1 GB/s (16 bytes * 66 MHz)
AGP 8x: 2 GB/s (32 bytes * 66 MHz)
Are you calling memset from a loop in C, or straight from assembler? If it's from C, every time you call memset, speed will be shot to hell by function call overhead. I always do low-level graphics in asm. A good book is '3D game programming' by André LaMothe (Waite Group Press).
It's 16-bit DOS, but it provides good insight into the things that clog up a graphics pipeline.
Your code has mov ecx, 1024*768.
Should this be mov ecx, 1024*768*4?
That should make it 32 bit.
Another problem could be that some Video cards automatically default to 8-bit vga compatible transfers. Trying to write more than 8 bits at a time to video memory can result in fuzzy things happening. I'm not sure if VESA sets things up correctly.
Which version of VESA BIOS are you using?
I assume 2+ because of the linear frame buffer.
VESA support varies widely.
The info in the OS-FAQ is more up to date than the info in the thread, particularly the bit where it's recommended that you set all devices with an AGP capability to the same speed. This is because AGP 3.0 uses a point-to-point protocol and 'sideband addressing', which is external to the normal addressing scheme produced by the AGP card / busmaster itself.
Different processors utilise the bus differently. 64-bit floating point is useless in this regard, as the AGP slot is only a 32-bit slot. Trying to force 64 bits down a 32-bit pipe gives 1 transfer every 3 clock cycles between your bus master controller and the AGP card.
In this regard AMD should be faster due to the faster external bus (internals make no difference except for 3D rendering, which can't be done on the video card because there's no programming information for GPUs).
That said, the only routines I use copy information from memory to the screen and from the screen, using simple LODSD and STOSD. These were inspired by Minix, of all things.
Blitting information from one area of the screen to another is performed by the VGA hardware rather than tying up the bus with a VGA to CPU to VGA roundabout. Video memory will never be faster than main memory when addressed over any expansion bus.
Hope this helps. Keep safe.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 7:56 am
by Cemre
For starters, here is the bandwidth you can get:
PCI / AGP 1x: 266 MB/s (4 bytes * 66 MHz)
AGP 2x: 528 MB/s (8 bytes * 66 MHz)
AGP 4x: 1 GB/s (16 bytes * 66 MHz)
AGP 8x: 2 GB/s (32 bytes * 66 MHz)
This is exactly what I'm talking about... I care about the bus speed, not what the human eye can see. I can only do 66 MB/s on the bus, and that is JUST the memory write speed. If I were to write a game or something, the fps would drop even lower because of the rendering and game graphics handling code. Not everything in a game is video memory writes.
PS:
Recently I was planning to use the video framebuffer for keeping my "GUI process" data structures. Think about it... you have at least 64 megs of video memory, and 1024*768 at 32 bpp only takes 3 megs. I thought that was a waste of space (of course you can use the rest in 3D mode). I had planned to use the remaining area for ordinary data, but because it turned out to be too slow, I changed my plans. That was another reason why I was so angry before (sorry for that "god d*mn" thing).
PPS: I am using it from "direct" assembly; no C function exists anywhere (for this test program).
mov ecx , 1024*768 ...
rep stosd, NOT rep stosb...
rep stosd writes eax to memory, 4 bytes at a time, so 1024*768 dwords already cover the whole 32 bpp screen.
PPPS: Could the CPU cache be the cause of this in some way? I tried it with caching both enabled and disabled (via the page directory flags), and it doesn't seem to make any difference. Could the MTRRs (memory type range registers) cause this problem? I don't know how to use them.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 9:54 am
by KieranFoot
I'm also looking for nvidia docs and haven't found any yet. When I do, I'll be sure to let you know...
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 10:08 am
by mystran
beyond infinity wrote:
For my own average 2D desktop hobbyist OS, 24 fps will wonderfully suffice.
Yes. It also makes sure that your OS will feel like hobbyist OS.
Honestly, the frame rate you need to perceive something as continuous depends wholly on the speed of the movement you are going to have. If you want something moving relatively fast to seem continuous, you need a high frame rate. Movies and television get away with fewer frames by using motion blur.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 10:28 am
by Curufir
How many times do you actually need to write the whole screen anyhow? For a desktop you're only ever writing a few small bitmaps per frame, so you can get excellent fps using the framebuffer (even at 66 MB/s).
If you're trying to make a game that needs to update the whole screen every frame (e.g. something 3D) then you really need to start using the hardware accelerator functions.
Hell, have you tried running Quake at 1024*768 using VESA and software rendering? The FPS is horrible, and those guys are damn good at what they do.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 11:06 am
by Pype.Clicker
KieranFoot wrote:
I'm also looking for nvidia docs and haven't found any yet. When I do, I'll be sure to let you know...
Nvidia specs are not available. However, there are open-source BeOS and Linux drivers that you can easily reverse-engineer to get info about basic 2D hardware acceleration (bitblt, filling, etc).
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 1:54 pm
by distantvoices
@mystran: and I'm happy with it, you won't believe it.
@cemre: just edit the title of your first post and get rid of the ... offending ... words. And sorry, I've misunderstood your intentions.
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 8:08 pm
by mystran
Curufir wrote:
How many times do you actually need to write the whole screen anyhow? For a desktop you're only ever writing a few small bitmaps per frame, so you can get excellent fps using the framebuffer (even at 66 MB/s).
Actually, pretty much the only optimizations that I would wish from a 2d driver would be screen-to-screen bitblt and offscreen bitmaps. If I can have both with alpha-support, then I'm extra happy.
Offscreen bitmaps are nice because when you do a blit from the offscreen bitmap to the screen, you can make a whole window appear "at once". If you have lots of memory on the card, and alpha support in blit, then you can also do nice, fast composition similar to Carbon, and things like moving a window are rather trivial: just compose again.
Of course, screen-to-screen blit alone is nice too, because moving a window with its content can be done faster if you can have the card first move the window, then just paint the newly exposed region.
In theory having fills is also nice, but with blit you can always fill a small area and copy that around. And if you do that in one buffer, you can then use the original area and the newly filled area together for the next blit. Sure, it's a bit more work to code.
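To make the fill-by-copying idea concrete, here is a rough C sketch. It assumes a plain array of 32-bit pixels and uses `memcpy` as a stand-in for whatever blit the card offers; `fill_by_doubling` is a made-up name, not part of any driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill `count` pixels with `colour` by seeding one pixel and then
 * repeatedly copying the already-filled prefix onto the area right
 * after it, so the filled region roughly doubles on every pass.
 * memcpy stands in for a hardware blit here (assumption).
 */
static void fill_by_doubling(uint32_t *dst, size_t count, uint32_t colour)
{
    if (count == 0)
        return;

    dst[0] = colour;                 /* the small "seed" fill */
    size_t filled = 1;

    while (filled < count) {
        size_t n = filled;           /* copy everything filled so far... */
        if (n > count - filled)
            n = count - filled;      /* ...but never past the end */
        /* Source [0, n) and destination [filled, filled + n) never overlap. */
        memcpy(dst + filled, dst, n * sizeof *dst);
        filled += n;
    }
}
```

The number of copies grows with the logarithm of the area, which is why it can still be reasonably fast without a dedicated fill operation.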
Without at least blit, it's hard to get windows to move around nice and fast, and without offscreen bitmaps you have to draw into your primary buffer, which leads to partially drawn content becoming visible.
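For what it's worth, here is a software sketch in C of what a screen-to-screen blit has to get right when source and destination overlap, as they do when a window moves a few pixels. The layout (32 bpp, pitch given in pixels) and the name `blit_rect` are assumptions for illustration; a real card would do this in hardware and there is no clipping here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a w*h rectangle from (sx, sy) to (dx, dy) inside one buffer.
 * When moving down, rows are copied bottom-up so that source rows are
 * read before they get overwritten; memmove handles overlap inside a
 * single row. The caller must keep both rectangles inside the buffer.
 */
static void blit_rect(uint32_t *fb, size_t pitch_px,
                      unsigned sx, unsigned sy,
                      unsigned dx, unsigned dy,
                      unsigned w, unsigned h)
{
    for (unsigned i = 0; i < h; i++) {
        /* Pick the row order based on the vertical direction of the move. */
        unsigned row = (dy > sy) ? (h - 1 - i) : i;
        uint32_t *src = fb + (size_t)(sy + row) * pitch_px + sx;
        uint32_t *dst = fb + (size_t)(dy + row) * pitch_px + dx;
        memmove(dst, src, (size_t)w * sizeof *fb);
    }
}
```

After the move, only the region the window used to cover and no longer does needs repainting.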
Re:VESA is god d*mn slow !!!
Posted: Fri Nov 05, 2004 9:52 pm
by Brendan
Hi,
Make sure the CPU's caches are configured correctly. This implies setting video display memory as write-combining (or un-cached if the CPU doesn't support write combining), and possibly doing the same for video buffers in main memory (depending on how they are used).
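As a rough illustration of the write-combining setup, the C sketch below only computes the values for one variable-range MTRR pair (the IA32_MTRR_PHYSBASEn / IA32_MTRR_PHYSMASKn MSRs at 0x200 + 2n and 0x201 + 2n). The struct and function names are made up, and the number of physical address bits is passed in as an assumption (it normally comes from CPUID).

```c
#include <stdint.h>

#define MTRR_TYPE_WC     0x01u          /* memory type: write-combining */
#define MTRR_MASK_VALID  (1ull << 11)   /* "valid" bit in PHYSMASKn */

struct mtrr_pair {
    uint32_t base_msr, mask_msr;        /* MSR numbers to write */
    uint64_t base_val, mask_val;        /* values to write into them */
};

/*
 * Compute the MSR values that map [base, base + size) as write-combining
 * using variable-range MTRR number n. Returns 0 on success, -1 if the
 * range is not a power-of-two size of at least 4 KiB aligned to its size.
 */
static int mtrr_wc_values(unsigned n, uint64_t base, uint64_t size,
                          unsigned phys_bits, struct mtrr_pair *out)
{
    if (size < 4096 || (size & (size - 1)) != 0)
        return -1;
    if (base & (size - 1))
        return -1;

    uint64_t addr_mask = (phys_bits >= 64) ? ~0ull : ((1ull << phys_bits) - 1);

    out->base_msr = 0x200 + 2 * n;
    out->mask_msr = 0x201 + 2 * n;
    out->base_val = (base & addr_mask) | MTRR_TYPE_WC;
    out->mask_val = (~(size - 1) & addr_mask) | MTRR_MASK_VALID;
    return 0;
}
```

For a hypothetical 64 MB framebuffer at 0xE0000000 on a CPU with 36 physical address bits, this yields 0xE0000001 for the base register and 0xFFC000800 for the mask register. Actually loading them needs `wrmsr` in ring 0 plus the cache-flush sequence from the CPU manual, and the CPU has to report MTRR and write-combining support first; none of that is shown here.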
It's also possible to have a double buffer and a bitmap, where each bit corresponds to a dword in the double buffer (which would be one pixel in 32 bpp, 4 pixels in 256 colour, etc). When a pixel is changed (note changed != set) in the double buffer, you set the corresponding bit for the changed dword in the bitmap. When the double buffer is blitted to display memory, the bitmap is used to make sure only changed dwords are sent. For 1024 * 768 * 16 bpp you'd need 1536 KB for the double buffer and another 48 KB for the bitmap.
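A minimal C sketch of that double buffer plus bitmap scheme could look like this. The names (`struct shadow`, `shadow_write`, `shadow_flush`) are invented for the example, and the caller is assumed to allocate both arrays and keep indices in range.

```c
#include <stddef.h>
#include <stdint.h>

struct shadow {
    uint32_t *back;    /* double buffer in main memory, one entry per dword */
    uint32_t *dirty;   /* bitmap: one bit per dword of `back` */
    size_t    dwords;  /* number of dwords in `back` */
};

/* Store a dword; mark it dirty only if the value actually changed. */
static void shadow_write(struct shadow *s, size_t i, uint32_t value)
{
    if (s->back[i] != value) {
        s->back[i] = value;
        s->dirty[i / 32] |= 1u << (i % 32);
    }
}

/* Send only the changed dwords to display memory; returns how many were sent. */
static size_t shadow_flush(struct shadow *s, volatile uint32_t *fb)
{
    size_t sent = 0;
    size_t words = (s->dwords + 31) / 32;

    for (size_t w = 0; w < words; w++) {
        uint32_t bits = s->dirty[w];
        if (bits == 0)
            continue;                       /* 32 unchanged dwords skipped at once */
        for (unsigned b = 0; b < 32; b++) {
            size_t i = w * 32 + b;
            if (((bits >> b) & 1u) && i < s->dwords) {
                fb[i] = s->back[i];
                sent++;
            }
        }
        s->dirty[w] = 0;                    /* clean again after the blit */
    }
    return sent;
}
```

Skipping a whole bitmap word when it is zero is what makes this cheap for a mostly static desktop: reads stay in main memory and only real changes cross the bus.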
For 32 bpp it's better to use the highest bit of each pixel instead of a bitmap, as this halves the amount of memory fetches the CPU does when blitting. The same can be done for 15 bpp, where the highest bit of every second pixel is used.
Then it's good to optimize the code that does the drawing. For example, I've seen people write code to draw horizontal lines that repeatedly calls "setPixel(x, y, colour)", where setPixel calculates the offset from x & y each time. Instead it's better to have a different routine for each colour depth so that "rep stos" can be used.
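In C terms, the difference might look something like the sketch below: the offset is computed once per line and the inner loop is a plain store that a compiler can turn into "rep stos" or similar. The function names, the `pitch` parameter (bytes per scanline) and the lack of clipping are all assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Horizontal line for 32 bpp modes: one address calculation, then a fill. */
static void hline32(uint8_t *fb, size_t pitch,
                    unsigned x, unsigned y, unsigned len, uint32_t colour)
{
    uint32_t *p = (uint32_t *)(fb + (size_t)y * pitch) + x;
    while (len--)
        *p++ = colour;
}

/* Same idea for 16 bpp modes, with its own pixel type. */
static void hline16(uint8_t *fb, size_t pitch,
                    unsigned x, unsigned y, unsigned len, uint16_t colour)
{
    uint16_t *p = (uint16_t *)(fb + (size_t)y * pitch) + x;
    while (len--)
        *p++ = colour;
}
```

Picking the right routine once, when the video mode is set (for instance through a function pointer), keeps the per-pixel path free of colour depth checks.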
Cheers,
Brendan