vesa. vga, graphics in general
Re: vesa. vga, graphics in general
Interesting. Would that mean that other newer cards also support this? What about the GF8800? Is there a database somewhere?
Re: vesa. vga, graphics in general
Jeko wrote: NVIDIA GeForce 6600
So we can then safely assume other nVidia cards have it as well. Now, if ATi would also support it (anyone care to check? :)), it could be a useable alternative to plain VESA VBE.
JAL
Re: vesa. vga, graphics in general
jal wrote: So we can then safely assume other nVidia cards have it as well. Now, if ATi would also support it (anyone care to check? :)), it could be a useable alternative to plain VESA VBE.
There is a small problem. My NVIDIA card says that it supports VBE/AF (version 1 or 1.1, I don't remember), but I haven't tried it, so there may be problems with the implementation.
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling
http://sourceforge.net/projects/jeko - Jeko Operating System
Re: vesa. vga, graphics in general
Jeko wrote: There is a small problem. My NVIDIA card says that it supports VBE/AF (version 1 or 1.1, I don't remember), but I haven't tried it, so there may be problems with the implementation.
Yeah, I was too lazy to mention that in my previous post, but that thought did cross my mind. So, what are you waiting for, go try it out! *cough* :)
JAL
Re: vesa. vga, graphics in general
I would advise devs to forget about VBE/AF; I have looked into it and there are too many problems.
It would be much better to have the best speed-optimized graphics functions, and for this we would be best off working together to chip away at the functions.
Re: vesa. vga, graphics in general
Dex wrote: for this we would be best working together
I wish...
Re: vesa. vga, graphics in general
Here is the test program. It runs on DexOS; to use it, get DexOS from my site and run the program in the zip below (FpsTest.dex).
When you run it you should get the FPS for dumping an 800*600*4 (32bpp) buffer to the screen.
This is what you need to improve (so keep a note of it). In the source code you will find an include file called "FPS.inc"; this is the code you need to optimize.
Note that there are 24-bit and 32-bit functions, because some old cards use 24-bit, so you need to have both implemented for VESA, but keep your off-screen buffer 32bpp.
E.g. this code:
Code:
;----------------------------------------------------;
; BuffToScreen.  ;Puts what's in the buffer to screen;
;----------------------------------------------------;
BuffToScreen:
    cmp [ModeInfo_BitsPerPixel],24
    jne Try32
    call BuffToScreen24
    jmp wehavedone24
Try32:
    cmp [ModeInfo_BitsPerPixel],32
    jne wehavedone24
    call BuffToScreen32
wehavedone24:
    ret
;----------------------------------------------------;
; BuffToScreen32                               32bpp ;
;----------------------------------------------------;
BuffToScreen32:
    pushad
    push es
    mov ax,8h
    mov es,ax
    mov edi,[ModeInfo_PhysBasePtr]
    mov esi,VesaBuffer
    xor eax,eax
    mov ecx,eax
    mov ax,[ModeInfo_XResolution]
    mov cx,[ModeInfo_YResolution]
    mul ecx
    mov ecx,eax
    cld
    cli
    rep movsd
    sti
    pop es
    popad
    ret
;----------------------------------------------------;
; BuffToScreen24                               24bpp ;
;----------------------------------------------------;
BuffToScreen24:
    pushad
    push es
    mov ax,8h
    mov es,ax
    xor eax,eax
    mov ecx,eax
    mov ebx,eax ;ccc
    mov ax,[ModeInfo_YResolution]
    mov ebp,eax
    lea eax,[ebp*2+ebp]
    mov edi,[ModeInfo_PhysBasePtr]
    mov esi,VesaBuffer
    cld
.l1:
    mov cx,[ModeInfo_XResolution]
    shr ecx,2
.l2:
    mov eax,[esi] ;eax = -- R1 G1 B1
    mov ebx,[esi+4] ;ebx = -- R2 G2 B2
    shl eax,8 ;eax = R1 G1 B1 --
    shrd eax,ebx,8 ;eax = B2 R1 G1 B1
    stosd
    mov ax,[esi+8] ;eax = -- -- G3 B3
    shr ebx,8 ;ebx = -- -- R2 G2
    shl eax,16 ;eax = G3 B3 -- --
    or eax,ebx ;eax = G3 B3 R2 G2
    stosd
    mov bl,[esi+10] ;ebx = -- -- -- R3
    mov eax,[esi+12] ;eax = -- R4 G4 B4
    shl eax,8 ;eax = R4 G4 B4 --
    mov al,bl ;eax = R4 G4 B4 R3
    stosd
    add esi,16
    loop .l2
    sub ebp,1
    ja .l1
    pop es
    popad
    ret
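For anyone who finds the 24bpp routine hard to follow, here is a rough C sketch of the same packing idea (C rather than fasm, purely for readability). It assumes each 32bpp pixel is stored as 0x00RRGGBB, which is what the register comments above suggest, and the function name `pack_line_32_to_24` is made up for this example; it is not part of FPS.inc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one scanline of 32bpp pixels (assumed 0x00RRGGBB) into 24bpp.
 * Each output pixel is three bytes in memory order: blue, green, red.
 * The unused top byte of every source pixel is dropped. */
static void pack_line_32_to_24(const uint32_t *src, uint8_t *dst, size_t pixels)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixels; i++) {
        uint32_t p = src[i];
        dst[3 * i + 0] = (uint8_t)(p & 0xFF);          /* blue  */
        dst[3 * i + 1] = (uint8_t)((p >> 8) & 0xFF);   /* green */
        dst[3 * i + 2] = (uint8_t)((p >> 16) & 0xFF);  /* red   */
    }
}
```

The assembly version does the same job four pixels at a time, so that it can store three whole dwords per iteration instead of twelve single bytes.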
Any questions, just ask.
PS: You will need to get DexOS to use the program. Also, if your card does not support 800*600 32bpp, you can modify the code for any resolution.
Also note that anything goes to improve the FPS.
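As a rough sanity check on whatever FPS figure the test reports, the arithmetic behind a full-screen copy can be sketched as below. The helper names are invented for this example, and the bus figure used in the note afterwards (about 133 MB/s, the theoretical peak of a 32-bit, 33 MHz PCI bus) is only an assumption for illustration; AGP and PCI Express cards will behave differently.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes copied per frame when the whole screen is blitted. */
static uint32_t frame_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return width * height * (bits_per_pixel / 8);
}

/* Upper bound on full-screen blits per second for a given bus throughput. */
static uint32_t max_full_frames_per_second(uint32_t bus_bytes_per_second,
                                           uint32_t bytes_per_frame)
{
    assert(bytes_per_frame != 0);
    return bus_bytes_per_second / bytes_per_frame;
}
```

With 800*600 at 32bpp, `frame_bytes` gives 1,920,000 bytes, so a bus that moves 133,000,000 bytes per second could not manage more than about 69 full-screen copies per second however well the copy loop is tuned.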
Attachments
- Fps.zip (52.59 KiB): The program and source code (fasm)
Re: vesa. vga, graphics in general
Hi,
Dex wrote: Here is the test program, it runs on DexOS to use it get DexOS from my site and use the program in the below zip (FpsTest.dex).
This code has a bug - there's no guarantee that all pixels are visible. For example, the video display memory might be arranged as 1024 pixels wide where only the first 800 pixels are sent to the monitor. This is why the VBE mode information structure tells you the number of bytes per scanline.
You'd need to do something like (for 32-bpp):
Code:
    movzx ebx, word [ModeInfo_YResolution]
    mov edx, [ModeInfo_PhysBasePtr]
    movzx ebp, word [ModeInfo_XResolution]
    movzx eax, word [ModeInfo_LinBytesPerScanLine]
    mov esi, VesaBuffer
.nextLine:
    mov ecx, ebp ;ecx = visible dwords per line
    mov edi, edx ;edi = start of this line in video memory
    rep movsd
    add edx, eax ;step by bytes per scanline, not by visible width
    sub ebx, 1
    ja .nextLine
Next, I'd add these lines of code to your "BuffToScreen" routine:
Code:
BuffToScreen:
    cmp byte [bufferChangedFlag], 0 ;Did the buffer change since last time?
    je .done ; no, skip it
    ; ** other stuff **
    mov byte [bufferChangedFlag], 0
.done:
    ret
With this small change you'll probably get many million frames per second. Of course if there was something actually modifying the data you'd need to implement support for dirty rectangles or something.
Using dirty rectangles can make a huge difference, but it can be complex to implement. I cheat. Instead I just have a flag for each horizontal line of pixels. E.g. for 800 * 600 mode you'd need 600 flags, or 75 bytes (one bit per flag). In this case, for 32-bpp you'd do something like:
Code:
    xor ebx, ebx ;ebx = current screen line
    mov edx, [ModeInfo_PhysBasePtr]
    movzx ebp, word [ModeInfo_XResolution]
    movzx eax, word [ModeInfo_LinBytesPerScanLine]
    mov esi, VesaBuffer
.nextLine:
    btr dword [lineModifiedFlags], ebx ;Was the screen line modified?
    jnc .skipLine ; no, skip it completely
    mov ecx, ebp
    mov edi, edx
    rep movsd
    jmp .lineDone
.skipLine:
    lea esi, [esi + ebp*4] ;Keep the source pointer in step with the line
.lineDone:
    add edx, eax
    inc ebx
    movzx ecx, word [ModeInfo_YResolution] ;ecx is free here; cmp needs equal operand sizes
    cmp ebx, ecx
    jb .nextLine
Of course there's no real reason that you couldn't use a flag like this for smaller areas. E.g. you could have one flag for each pixel, or one flag for each group of 4 pixels, or something else.
Next, I get performance improvements by using an additional buffer. The basic idea is that for each dword you do something like:
Code:
    mov eax, [sourceBuffer + edx]
    cmp [changeBuffer + edx],eax ;Has this dword changed?
    je .skip ; no, don't need to update video display memory
    mov [changeBuffer + edx],eax ;Update the change buffer
    mov [videoDisplayMemory + edx],eax ;Update the video display memory
.skip:
    add edx,4
This improves performance because accessing RAM is a lot faster than accessing video display memory (there's no PCI bus bandwidth limitation for RAM accesses). Of course you can do more than one dword at a time to hide some of the RAM access latencies, e.g.:
Code:
    mov eax, [sourceBuffer + edx]
    mov ebx, [sourceBuffer + edx + 4]
    mov ecx, [sourceBuffer + edx + 8]
    mov ebp, [sourceBuffer + edx + 12]
    cmp [changeBuffer + edx],eax ;Has this dword changed?
    je .skip1 ; no, don't need to update video display memory
    mov [changeBuffer + edx],eax ;Update the change buffer
    mov [videoDisplayMemory + edx],eax ;Update the video display memory
.skip1:
    cmp [changeBuffer + edx + 4],ebx ;Has this dword changed?
    je .skip2 ; no, don't need to update video display memory
    mov [changeBuffer + edx + 4],ebx ;Update the change buffer
    mov [videoDisplayMemory + edx + 4],ebx ;Update the video display memory
.skip2:
    cmp [changeBuffer + edx + 8],ecx ;Has this dword changed?
    je .skip3 ; no, don't need to update video display memory
    mov [changeBuffer + edx + 8],ecx ;Update the change buffer
    mov [videoDisplayMemory + edx + 8],ecx ;Update the video display memory
.skip3:
    cmp [changeBuffer + edx + 12],ebp ;Has this dword changed?
    je .skip4 ; no, don't need to update video display memory
    mov [changeBuffer + edx + 12],ebp ;Update the change buffer
    mov [videoDisplayMemory + edx + 12],ebp ;Update the video display memory
.skip4:
    add edx,16
Notes: This technique avoids some video display memory accesses that can't be avoided using any other technique, e.g. if you do something that sets a pixel to the same colour it used to be. I normally use this for 16 colour modes, where the colour of each pixel is spread across 4 different planes (one bit per pixel per plane), so that (e.g.) if a cyan pixel is changed to white you only need to update 2 planes (the red and intensity planes) instead of 4 planes (because the blue and green planes don't change).
This does cost more memory though, so you might want to use the normal method on systems with low RAM and this method when there's plenty of RAM; or work out which method to use based on video buffer size and amount of RAM, e.g.:
Code:
/* Plenty of RAM relative to the video buffer: the change buffer is affordable */
if(videoBufferSize * 32 < totalRamSize) useChangeBufferMethod();
else useNormalMethod();
Basically, I'd end up with several different routines for each BPP (with/without change buffer, with/without SSE, etc), where initialization code selects which routine to use and code does "call [blitRoutine]". Then I'd have several versions of each routine that does the drawing, e.g. several "drawRectangle()" routines (with/without change buffer, with/without SSE, etc) where initialization code would select which routine to use and code does "call [drawRectangleRoutine]"; and several "drawCircle" routines, several "drawFont" routines, several "drawLine" routines, several "drawBMP" routines, several "drawIcon" routines, etc.
Mostly, what I'm trying to say here is that you can make huge improvements to the performance of the blit routines by doing a little more work when drawing. You can't just look at the performance of the blit routine alone, you need to look at the performance of the drawing routines and the blit routines together. In other words, if you always blit everything regardless of whether it changed or not, then your code will always give bad performance, and there's no point measuring how bad the performance is when you know the code needs to be ripped out and replaced.
Cheers,
Brendan
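To see how the ideas in the post above fit together (stepping by bytes per scanline, one modified flag per line, and a change buffer), here is a minimal C sketch. It is only an illustration under assumptions, not Brendan's or DexOS's actual code: the tiny 8x4 "mode", the `fake_vram` array standing in for video memory, and all function and variable names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny stand-in for a real mode: 8 visible pixels per line,
 * but 10 pixels (40 bytes) of video memory per scanline. */
enum { WIDTH = 8, HEIGHT = 4, PITCH_PIXELS = 10 };

static uint32_t back_buffer[HEIGHT * WIDTH];      /* off-screen 32bpp buffer  */
static uint32_t change_buffer[HEIGHT * WIDTH];    /* copy of what is on screen */
static uint32_t fake_vram[HEIGHT * PITCH_PIXELS]; /* pretend video memory     */
static uint8_t  line_dirty[(HEIGHT + 7) / 8];     /* one bit per scanline     */

static void mark_line_dirty(unsigned y)
{
    line_dirty[y / 8] |= (uint8_t)(1u << (y % 8));
}

/* Drawing code does a little extra work: it records which line it touched. */
static void put_pixel(unsigned x, unsigned y, uint32_t colour)
{
    back_buffer[y * WIDTH + x] = colour;
    mark_line_dirty(y);
}

/* Copy only what changed. Returns the number of dwords written to video
 * memory, so the saving can be observed. */
static unsigned blit(volatile uint32_t *vram, unsigned pitch_bytes)
{
    unsigned writes = 0;
    assert(pitch_bytes % 4u == 0 && pitch_bytes >= WIDTH * 4u);

    for (unsigned y = 0; y < HEIGHT; y++) {
        uint8_t mask = (uint8_t)(1u << (y % 8));
        if (!(line_dirty[y / 8] & mask))
            continue;                          /* line untouched: skip it */
        line_dirty[y / 8] &= (uint8_t)~mask;   /* clear the flag */

        /* Destination advances by the scanline pitch, not the visible width. */
        volatile uint32_t *dst = vram + (size_t)y * (pitch_bytes / 4u);
        const uint32_t *src = &back_buffer[y * WIDTH];
        uint32_t *seen = &change_buffer[y * WIDTH];

        for (unsigned x = 0; x < WIDTH; x++) {
            if (seen[x] != src[x]) {           /* compare in RAM first */
                seen[x] = src[x];
                dst[x] = src[x];               /* touch video memory only here */
                writes++;
            }
        }
    }
    return writes;
}
```

Calling `put_pixel(1, 2, colour)` and then `blit(fake_vram, PITCH_PIXELS * 4)` writes a single dword to video memory; calling `blit` again with nothing drawn in between writes none. Redrawing the same pixel with the same colour marks the line as modified, but the change buffer still prevents any write to video memory.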
Re: vesa. vga, graphics in general
Thanks Brendan, lots of helpful info (as usual). I agree with all your points, and I will add some of your suggestions to the FPS test code.
Thanks again.