vesa. vga, graphics in general


Post by albeva »

Interesting. Would that mean that other, newer cards also support this? What about the GF8800? Is there a database somewhere?

Post by jal »

Jeko wrote:NVIDIA GeForce 6600
So we can then safely assume other nVidia cards have it as well. Now, if ATi would also support it (anyone care to check? :)), it could be a useable alternative to plain VESA VBE.


JAL

Post by Jeko »

jal wrote:
Jeko wrote:NVIDIA GeForce 6600
So we can then safely assume other nVidia cards have it as well. Now, if ATi would also support it (anyone care to check? :)), it could be a useable alternative to plain VESA VBE.


JAL
There is a small problem. My NVIDIA card says that it supports VBE/AF (version 1 or 1.1, I don't remember), but I haven't tried it, so there could be problems with the implementation.
Rewriting virtual memory manager - Working on ELF support - Working on Device Drivers Handling

http://sourceforge.net/projects/jeko - Jeko Operating System

Post by jal »

Jeko wrote:There is a small problem. My NVIDIA card says that it supports VBE/AF (version 1 or 1.1, I don't remember), but I haven't tried it, so there could be problems with the implementation.
Yeah, I was too lazy to mention that in my previous post, but that thought did cross my mind. So, what are you waiting for, go try it out! *cough* :)


JAL

Post by Dex »

I would advise devs to forget about VBE/AF, as I have looked into it and there are too many problems.
Much better is to have the best speed-optimized graphics functions; for this we would be best working together, to chip away at the functions.

Post by Jeko »

Dex wrote:for this we would be best working together
I wish...

Post by Dex »

Here is the test program. It runs on DexOS, so to use it, get DexOS from my site and run the program in the zip below (FpsTest.dex).
When you run it you should get the FPS for dumping an 800*600 buffer at 4 bytes per pixel (32bpp) to the screen.
This is the number you need to improve (so keep a note of it). In the source code you will find an include file called "FPS.inc"; this is the code you need to optimize.
E.g. this code:

Code:

 ;----------------------------------------------------;
 ; BuffToScreen.  ;Puts what's in the buffer to screen;
 ;----------------------------------------------------;
BuffToScreen:
	cmp   [ModeInfo_BitsPerPixel],24
	jne   Try32
	call  BuffToScreen24
	jmp   wehavedone24
Try32:
	cmp   [ModeInfo_BitsPerPixel],32
	jne   wehavedone24
	call  BuffToScreen32 
wehavedone24:
	ret

 ;----------------------------------------------------;
 ; BuffToScreen32                               32bpp ;
 ;----------------------------------------------------;
BuffToScreen32:
	 pushad
	 push  es
	 mov   ax,8h
	 mov   es,ax
	 mov   edi,[ModeInfo_PhysBasePtr]
	 mov   esi,VesaBuffer 
	 xor   eax,eax
	 mov   ecx,eax	
	 mov   ax,[ModeInfo_XResolution]
	 mov   cx,[ModeInfo_YResolution]
	 mul   ecx
	 mov   ecx,eax	
	 cld
	 cli
	 rep   movsd
	 sti
	 pop   es
	 popad
	 ret

 ;----------------------------------------------------;
 ; BuffToScreen24                               24bpp ;
 ;----------------------------------------------------;
BuffToScreen24:
	 pushad
	 push  es
	 mov   ax,8h
	 mov   es,ax
	 xor   eax,eax
	 mov   ecx,eax
	 mov   ebx,eax ;ccc
	 mov   ax,[ModeInfo_YResolution]
	 mov   ebp,eax
	 lea   eax,[ebp*2+ebp]
	 mov   edi,[ModeInfo_PhysBasePtr]
	 mov   esi,VesaBuffer  
	 cld
.l1:
	 mov   cx,[ModeInfo_XResolution]
	 shr   ecx,2	     
.l2:
	 mov   eax,[esi]	 ;eax = -- R1 G1 B1
	 mov   ebx,[esi+4]	 ;ebx = -- R2 G2 B2
	 shl   eax,8		 ;eax = R1 G1 B1 --
	 shrd  eax,ebx,8	 ;eax = B2 R1 G1 B1
	 stosd

	 mov   ax,[esi+8]	 ;eax = -- -- G3 B3
	 shr   ebx,8		 ;ebx = -- xx R2 G2 (xx = pixel 2's unused byte)
	 and   ebx,0FFFFh	 ;ebx = -- -- R2 G2 (drop xx in case it is not zero)
	 shl   eax,16		 ;eax = G3 B3 -- --
	 or    eax,ebx		 ;eax = G3 B3 R2 G2
	 stosd

	 mov   bl,[esi+10]	 ;ebx = -- -- -- R3
	 mov   eax,[esi+12]	 ;eax = -- R4 G4 B4
	 shl   eax,8		 ;eax = R4 G4 B4 --
	 mov   al,bl		 ;eax = R4 G4 B4 R3
	 stosd

	 add esi,16
	 loop  .l2

	 sub ebp,1
	 ja .l1

	 pop   es
	 popad
	 ret	
Note that there are both 24-bit and 32-bit functions because some old cards use 24-bit, so you need to have both implemented for VESA, but keep your off-screen buffer 32bpp.
Any questions, just ask.

PS: You will need to get DexOS to use the program. Also, if your card does not support 800*600 32bpp, you can modify the code for any resolution.
Also note that anything goes to improve the FPS.
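
For readers who want to check the 24-bit packing without reading the shift tricks, here is a minimal C sketch of the same conversion, one pixel at a time. The function name is made up for illustration, and it assumes each 32bpp pixel is stored in memory as blue, green, red, unused (which is what the register comments in the assembly suggest). It is a reference for correctness, not a fast path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack one line of 32bpp pixels into 24bpp.
 * Each source dword is read as 0xXXRRGGBB, so the low byte is blue.
 * The unused top byte is dropped whatever its value is. */
void pack_line_32_to_24(uint8_t *dst, const uint32_t *src, size_t width)
{
    for (size_t x = 0; x < width; x++) {
        uint32_t p = src[x];
        dst[0] = (uint8_t)(p);        /* blue  */
        dst[1] = (uint8_t)(p >> 8);   /* green */
        dst[2] = (uint8_t)(p >> 16);  /* red   */
        dst += 3;                     /* 24bpp pixels are 3 bytes apart */
    }
}
```

The assembly version produces the same byte stream but handles four pixels per iteration, so that it can write three whole dwords instead of twelve single bytes.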
Attachments
Fps.zip
The program and source code (fasm)
(52.59 KiB) Downloaded 126 times

Post by Brendan »

Hi,
Dex wrote:Here is the test program, it runs on DexOS to use it get DexOS from my site and use the program in the below zip (FpsTest.dex).
This code has a bug - there's no guarantee that all pixels are visible. For example, the video display memory might be arranged as 1024 pixels wide where only the first 800 pixels are sent to the monitor. This is why the VBE mode information structure tells you the number of bytes per scanline.

You'd need to do something like (for 32-bpp):

Code:

    movzx ebx, word [ModeInfo_YResolution]
    mov edx, [ModeInfo_PhysBasePtr]
    movzx ebp,word [ModeInfo_XResolution]
    movzx eax, word [ModeInfo_LinBytesPerScanLine]
    mov esi, VesaBuffer 
.nextLine:
    mov ecx, ebp
    mov edi, edx
    rep movsd
    add edx, eax
    sub ebx, 1
    ja .nextLine
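
The same per-scanline copy can be sketched in C for comparison. This is only an illustration under assumptions: the function and parameter names are invented here, `pitch` stands for the bytes-per-scanline value from the VBE mode information, and the source buffer is assumed to be tightly packed at `width` pixels per line.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy a packed 32-bpp buffer to a framebuffer whose
 * lines start `pitch` bytes apart (pitch may be larger than width * 4). */
void blit32_pitch(uint8_t *fb, size_t pitch,
                  const uint32_t *src, size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        /* The destination steps by pitch, the source by width pixels. */
        memcpy(fb + y * pitch, src + y * width, width * sizeof(uint32_t));
    }
}
```

Any bytes between the end of the visible pixels and the start of the next scanline are left untouched.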
Next, I'd add these lines of code to your "BuffToScreen" routine:

Code:

BuffToScreen:
    cmp byte [bufferChangedFlag], 0     ;Did the buffer change since last time?
    je .done                            ; no, skip it

   ; ** other stuff **

    mov byte [bufferChangedFlag], 0
.done:
    ret
With this small change you'll probably get many million frames per second. Of course if there was something actually modifying the data you'd need to implement support for dirty rectangles or something.

Using dirty rectangles can make a huge difference, but it can be complex to implement. I cheat. Instead I just have a flag for each horizontal line of pixels. E.g. for 800 * 600 mode you'd need 600 flags, or 75 bytes (one bit per flag). In this case, for 32-bpp you'd do something like:

Code:

    xor ebx, ebx                  ;ebx = current screen line
    mov edx, [ModeInfo_PhysBasePtr]
    movzx ebp,word [ModeInfo_XResolution]
    movzx eax, word [ModeInfo_LinBytesPerScanLine]
    mov esi, VesaBuffer 
.nextLine:
    btr dword [lineModifiedFlags], ebx  ;Was the screen line modified?
    jnc .skipLine                       ; no, skip it completely
    mov ecx, ebp
    mov edi, edx
    rep movsd
    jmp .lineDone
.skipLine:
    lea esi, [esi + ebp*4]              ;Skipped line: still advance the source by one line
.lineDone:
    add edx, eax
    inc ebx
    cmp bx, [ModeInfo_YResolution]      ;YResolution is a word, so compare 16 bits
    jb .nextLine
Of course there's no real reason that you couldn't use a flag like this for smaller areas. For e.g. you could have one flag for each pixel, or one flag for each group of 4 pixels, or something else.
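
As a rough C sketch of the one-flag-per-line idea (all names here are hypothetical, and `MAX_LINES` is an arbitrary upper bound chosen for the example): drawing code marks the lines it touches, and the blit copies only those lines and clears their flags. Because both the source and destination offsets are computed from the line number, skipping a line cannot leave the two pointers out of step.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LINES 2048                       /* assumed limit for this sketch */
static uint8_t line_dirty[MAX_LINES / 8];    /* one bit per screen line */

/* Called by drawing code whenever it changes line y of the back buffer. */
void mark_line_dirty(size_t y)
{
    line_dirty[y / 8] |= (uint8_t)(1u << (y % 8));
}

/* Copies only the modified lines; returns how many lines were copied. */
size_t blit32_dirty_lines(uint8_t *fb, size_t pitch,
                          const uint32_t *src, size_t width, size_t height)
{
    size_t copied = 0;
    for (size_t y = 0; y < height; y++) {
        uint8_t mask = (uint8_t)(1u << (y % 8));
        if (!(line_dirty[y / 8] & mask))
            continue;                        /* unchanged line, skip it */
        line_dirty[y / 8] &= (uint8_t)~mask; /* clear the flag */
        memcpy(fb + y * pitch, src + y * width, width * sizeof(uint32_t));
        copied++;
    }
    return copied;
}
```

For 600 lines the flag array only needs 75 bytes, so the bookkeeping cost is tiny compared with the copies it avoids.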

Next, I get performance improvements by using an additional buffer. The basic idea is that for each dword you do something like:

Code:

    mov eax, [sourceBuffer + edx]
    cmp [changeBuffer + edx],eax       ;Has this dword changed?
    je .skip                           ; no, don't need to update video display memory
    mov [changeBuffer + edx],eax       ;Update the change buffer
    mov [videoDisplayMemory + edx],eax ;Update the video display memory
.skip:
    add edx,4
This improves performance because accessing RAM is a lot faster than accessing video display memory (there's no PCI bus bandwidth limitation for RAM accesses). Of course you can do more than one dword at a time to hide some of the RAM access latencies - for e.g.:

Code:

    mov eax, [sourceBuffer + edx]
    mov ebx, [sourceBuffer + edx + 4]
    mov ecx, [sourceBuffer + edx + 8]
    mov ebp, [sourceBuffer + edx + 12]
    cmp [changeBuffer + edx],eax           ;Has this dword changed?
    je .skip1                              ; no, don't need to update video display memory
    mov [changeBuffer + edx],eax           ;Update the change buffer
    mov [videoDisplayMemory + edx],eax     ;Update the video display memory
.skip1:
    cmp [changeBuffer + edx + 4],ebx       ;Has this dword changed?
    je .skip2                              ; no, don't need to update video display memory
    mov [changeBuffer + edx + 4],ebx       ;Update the change buffer
    mov [videoDisplayMemory + edx + 4],ebx ;Update the video display memory
.skip2:
    cmp [changeBuffer + edx + 8],ecx       ;Has this dword changed?
    je .skip3                              ; no, don't need to update video display memory
    mov [changeBuffer + edx + 8],ecx       ;Update the change buffer
    mov [videoDisplayMemory + edx + 8],ecx ;Update the video display memory
.skip3:
    cmp [changeBuffer + edx + 12],ebp       ;Has this dword changed?
    je .skip4                               ; no, don't need to update video display memory
    mov [changeBuffer + edx + 12],ebp       ;Update the change buffer
    mov [videoDisplayMemory + edx + 12],ebp ;Update the video display memory
.skip4:
    add edx,16
Notes: This technique avoids some video display memory accesses that can't be avoided using any other technique - e.g. if you do something that sets a pixel to the same colour it used to be. I normally use this for 16 colour modes, where the colour of each pixel is spread across 4 different planes (one bit per pixel per plane), so that (for e.g.) if a cyan pixel is changed to white you only need to update 2 planes (the red and intensity planes) instead of 4 planes (because the blue and green planes don't change).
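
A minimal C sketch of the change-buffer comparison for a linear 32-bpp mode might look like the following. The names are invented for the example, and it assumes the change buffer (called `shadow` here) starts out holding the same contents as video display memory, otherwise the first comparisons would be meaningless.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: write a dword to video memory only when it differs
 * from the copy kept in ordinary RAM. Returns the number of video writes. */
size_t blit32_changed(volatile uint32_t *vram, uint32_t *shadow,
                      const uint32_t *src, size_t count)
{
    size_t writes = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t v = src[i];
        if (shadow[i] != v) {   /* has this dword changed? */
            shadow[i] = v;      /* update the change buffer */
            vram[i] = v;        /* update video display memory */
            writes++;
        }
    }
    return writes;
}
```

The return value is there only to make the saving visible: calling it twice with the same source performs no video memory writes the second time.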

This does cost more memory though, so you might want to use the normal method on systems with low RAM and this method when there's plenty of RAM; or work out which method to use based on video buffer size and amount of RAM, for e.g.:

Code:

    if(videoBufferSize * 32 < totalRamSize) useChangeBufferMethod();
    else useNormalMethod();
Basically, I'd end up with several different routines for each BPP (with/without change buffer, with/without SSE, etc), where initialization code selects which routine to use and code does "call [blitRoutine]". Then I'd have several versions of each routine that does the drawing - for e.g. several "drawRectangle()" routines (with/without change buffer, with/without SSE, etc) where initialization code would select which routine to use and code does "call [drawRectangleRoutine]"; and several "drawCircle" routines, several "drawFont" routines, several "drawLine" routines, several "drawBMP" routines, several "drawIcon" routines, etc.
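
The "initialization code selects which routine to use" idea maps onto a function pointer in C. The sketch below is an assumption-laden illustration, not the actual code described above: the routine names are hypothetical, only two variants are shown, and the selection follows the rule stated in the text (use the change buffer only when RAM is plentiful relative to the video buffer).

```c
#include <stddef.h>
#include <stdint.h>

typedef size_t (*blit_fn)(volatile uint32_t *vram, uint32_t *shadow,
                          const uint32_t *src, size_t count);

/* Plain copy: writes every dword, ignores the change buffer. */
static size_t blit_plain(volatile uint32_t *vram, uint32_t *shadow,
                         const uint32_t *src, size_t count)
{
    (void)shadow;
    for (size_t i = 0; i < count; i++)
        vram[i] = src[i];
    return count;
}

/* Change-buffer copy: writes only dwords that differ from the shadow copy. */
static size_t blit_with_change_buffer(volatile uint32_t *vram, uint32_t *shadow,
                                      const uint32_t *src, size_t count)
{
    size_t writes = 0;
    for (size_t i = 0; i < count; i++) {
        if (shadow[i] != src[i]) {
            shadow[i] = src[i];
            vram[i] = src[i];
            writes++;
        }
    }
    return writes;
}

/* The C equivalent of "call [blitRoutine]" is calling through this pointer. */
static blit_fn blit_routine = blit_plain;

/* Run once at initialization; sizes are in bytes. */
void select_blit_routine(uint64_t video_buffer_size, uint64_t total_ram_size)
{
    if (video_buffer_size * 32 < total_ram_size)
        blit_routine = blit_with_change_buffer;  /* plenty of RAM */
    else
        blit_routine = blit_plain;               /* low RAM */
}
```

After initialization the rest of the code just calls `blit_routine(...)` and never tests the configuration again, which keeps the per-frame path free of branches on machine capabilities.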

Mostly, what I'm trying to say here is that you can make huge improvements to the performance of the blit routines by doing a little more work when drawing. You can't just look at the performance of the blit routine alone, you need to look at the performance of the drawing routines and the blit routines together. In other words, if you always blit everything regardless of whether it changed or not, then your code will always give bad performance, and there's no point measuring how bad the performance is when you know the code needs to be ripped out and replaced.


Cheers,

Brendan

Post by Dex »

Thanks Brendan, lots of helpful info (as usual). I agree with all your points, and I will add some of your suggestions to the FPS test code.
Thanks again.