Is it easy to do hardware bitblts?

Posted: Sun Apr 29, 2007 9:26 am
by blackoil
My graphics card is an nVidia GeForce2 MX 400. I want to move a block of data directly within the LFB. Where can I find related references? Thanks!

Posted: Sun Apr 29, 2007 9:32 am
by xsix
Um, you can use VESA if your VBE version is 2.0 or higher. It can give you the LFB of your card's buffer. But if you want to program your hardware directly with INs and OUTs, then hm =\

Posted: Sun Apr 29, 2007 9:41 am
by blackoil
I can scroll scanline by scanline in software: "mov esi,line2; mov edi,line1; mov ecx,scanline*fontheight; rep movsd" (I am in 32bpp).
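For reference, the same software scroll can be sketched in C. This is only a minimal sketch: the function name and parameters (scroll_up, pitch_bytes, rows) are made up for illustration, and it assumes the framebuffer is already mapped and that each scanline occupies pitch_bytes bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: scroll a linear framebuffer up by `rows` scanlines.
 * fb          - address of the first visible scanline (assumed mapped)
 * pitch_bytes - bytes per scanline (width * 4 in 32bpp, if there is no padding)
 * height      - number of visible scanlines
 * rows        - scanlines to scroll by, e.g. the font height */
void scroll_up(uint8_t *fb, size_t pitch_bytes, size_t height, size_t rows)
{
    assert(fb != NULL);

    if (rows >= height) {
        /* Everything scrolls off screen: just clear. */
        memset(fb, 0, pitch_bytes * height);
        return;
    }

    size_t keep = (height - rows) * pitch_bytes;

    /* Source and destination overlap, so memmove (not memcpy) is required. */
    memmove(fb, fb + rows * pitch_bytes, keep);

    /* Blank the scanlines that were uncovered at the bottom. */
    memset(fb + keep, 0, rows * pitch_bytes);
}
```

Calling scroll_up(lfb, pitch, height, fontheight) would move the whole screen up by one text row. It still copies every visible pixel through the CPU, which is exactly the work a hardware blit would take over.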

But it's still slow on a real machine. I tried the CRTC register at index 0x8 (scanline scroll), but it seems to affect only the scanline position on the monitor; it can't change the content of the LFB.

I want this to be done in hardware. I can't find any related documents on Google.

Posted: Sun Apr 29, 2007 9:49 am
by Kevin McGuire
I think the VBE stuff in the post above supports some SVGA functionality, which IIRC has hardware bitblt support in some cases or on some cards?

Anyway. I am just trying to help throw some information out on it.

Posted: Sun Apr 29, 2007 11:58 am
by mystran
I'd say hoping for generic VBE BitBlit is probably a dream. I could be wrong of course...

Anyway, without having any idea about NVidia cards, I know there's a driver for them in Xorg, so you could fetch the mess-of-a-source-tree and look there for pointers maybe?

Posted: Sun Apr 29, 2007 1:13 pm
by Brendan
Hi,
mystran wrote:I'd say hoping for generic VBE BitBlit is probably a dream. I could be wrong of course...
The normal video standard people talk of is the "VBE core" standard. There's another one called the "VBE AF" (Accelerated Functions?) that covers hardware bitblits, etc.

The problem was that people had to pay to get "VBE AF", and it was never free. This meant that no programmers used it, and (AFAIK) most video cards never bothered supporting it.

I'd also assume that the "VBE AF" standard itself doesn't cover the very wide variety of different things that modern video cards are capable of (but I'm only guessing - like most people interested in video, I've never seen the "VBE AF" standard).

In any case, the important thing is that your "video driver programming interface" can handle hardware bitblits so that eventually (maybe) one day, it'd be possible for someone to write a video driver that supports hardware bitblits.

The same thing applies to "stereoscopic" and refresh rates (both supported by "VBE core"), and SLI, polygons, shaders, hardware cursors, gamma correction, monitor capability detection, etc - design everything so it's technically possible for a device driver writer to support it in hardware, but then just write a generic video driver that emulates things in software.


Cheers,

Brendan

Posted: Sun Apr 29, 2007 3:11 pm
by urxae
Brendan wrote:The normal video standard people talk of is the "VBE core" standard. There's another one called the "VBE AF" (Accelerated Functions?) that covers hardware bitblits, etc.

The problem was that people had to pay to get "VBE AF", and it was never free. This meant that no programmers used it, and (AFAIK) most video cards never bothered supporting it.
I've got a VBE/AF document from VESA. It's labeled VBE/AF 1.0 (revision 0.7, adoption date August 18, 1996). How to get it:
* go to http://vesa.org/Store/buystandards.htm.
* Click the link in "A list of Free Standards is available here"
* Fill in the registration form. You can fill in total nonsense if you prefer (though the email address needs to look valid).
* Follow instructions
* (Make sure javascript is enabled)
* Click VBE (on the left)
* Click "VBE-AF07.pdf"

I've also seen mention of a 2.0 (draft?) standard. Can't find the actual document though.
A few free drivers can apparently be downloaded here. Most are marked as "Uses BIOS" (for video mode initialization) though, which according to the page means they only run in DOS (but they may mean real mode, and emulation may work...)
I'd also assume that the "VBE AF" standard itself doesn't cover the very wide variety of different things that modern video cards are capable of (but I'm only guessing - like most people interested in video, I've never seen the "VBE AF" standard).
It mentions it's supposed to be a hardware-independent device interface supporting Accelerator Functions for efficiently hardware-implementable and commonly used 2D operations.
The fact that it's over 10 years old (or at least the doc I found is) may mean the line of "efficiently hardware-implementable" functionality has shifted a bit since then, though...


(Not that I'm anywhere near needing this stuff myself...)

Posted: Sun Apr 29, 2007 11:55 pm
by blackoil
I checked some files mentioned above, especially the XF86 driver for nVidia. It seems hardware acceleration is done with MMIO DMA transfer. But I can't find a clear definition list for IO ports.

I also checked the S3 chip driver source in FreeBE; there are only 2 files there. It uses I/O ports.

void blitvram ( int srcx, int srcy, int dstx, int dsty, int dx, int dy, int dir )
{
  do ; while ( inportb(0x9AE8) & 0xFF );
  outportw(0xBAE8, 0x67);
  outportw(0xBEE8, 0xA000);
  outportw(0x86E8, srcx);
  outportw(0x82E8, srcy);
  outportw(0x8EE8, dstx);
  outportw(0x8AE8, dsty);

  outportw(0x96E8, dx-1);
  outportw(0xBEE8, dy-1);
  do ; while ( inportb(0x9AE8) & 0xFF ); /* 0xc013 = 14,15,7,0,2 bits set */
  if ( dir ) outportw(0x9AE8, 0xC013); else outportw(0x9AE8, 0xC0F3);
                                         /* 0xc0f3 = 14,15,8,5,1,0 bits set */
};

Posted: Sun Apr 29, 2007 11:59 pm
by Brendan
Hi,
urxae wrote:I've got a VBE/AF document from VESA. It's labeled VBE/AF 1.0 (revision 0.7, adoption date August 18, 1996). How to get it:
Thanks! I've downloaded it and quickly read through it - seems like it would've been nice...
urxae wrote:A few free drivers can apparently be downloaded here. Most are marked as "Uses BIOS" (for video mode initialization) though, which according to the page means they only run in DOS (but they may mean real mode, and emulation may work...)
This page makes me think VBE AF was never included in any video card's ROM, and the only way to get VBE AF to work is to obtain a "VBE AF driver" for your video card. The FreeBE/AF project is probably the only way to obtain a "VBE AF driver" (I doubt SciTech are still distributing theirs, and I doubt their licensing is suitable for inclusion in an OS project's distribution files if they are).

Unfortunately there are a few problems with the FreeBE/AF source. For example, the I/O ports used by each video card are hard-coded (it'll only work for the first video card in the computer), some of the video drivers have only partial support (i.e. they don't hardware accelerate VBE/AF functions that the video card can hardware accelerate), most of them require the BIOS for setting video modes (a problem for me, as I have no intention of supporting virtual 80x86, especially for the 64-bit version of my OS), etc.

However, the FreeBE/AF project has full source code and uses a "free" copyright ("sources may be distributed and modified without restriction").

Therefore, I'd be very tempted to forget about VBE AF and use the FreeBE/AF source code as reference information (in conjunction with information from other places) while implementing native video device drivers for your OS.


Cheers,

Brendan

Posted: Mon Apr 30, 2007 1:16 am
by Brendan
Hi,
blackoil wrote:I checked some files mentioned above, especially the XF86 driver for nVidia. It seems hardware acceleration is done with MMIO DMA transfer. But I can't find a clear definition list for IO ports.

I also checked the S3 chip driver source in FreeBE; there are only 2 files there. It uses I/O ports.
I can't help with NVidia, but I do have some information on S3 video cards...

/* BitBlt:
 *  Blits from one part of video memory to another, using the specified
 *  mix operation. This must correctly handle the case where the two
 *  regions overlap.
 */
void blitvram ( int srcx, int srcy, int dstx, int dsty, int dx, int dy, int dir )
{
  do ; while ( inportb(0x9AE8) & 0xFF );
  outportw(0xBAE8, 0x67);
  outportw(0xBEE8, 0xA000);
  outportw(0x86E8, srcx);
  outportw(0x82E8, srcy);
  outportw(0x8EE8, dstx);
  outportw(0x8AE8, dsty);

  outportw(0x96E8, dx-1);
  outportw(0xBEE8, dy-1);
  do ; while ( inportb(0x9AE8) & 0xFF ); /* 0xc013 = 14,15,7,0,2 bits set */
  if ( dir ) outportw(0x9AE8, 0xC013); else outportw(0x9AE8, 0xC0F3);
					 /* 0xc0f3 = 14,15,8,5,1,0 bits set */
};
The card has a FIFO queue for commands and data, where software can keep stuffing new commands and data into it as long as the FIFO queue isn't full. It's only a small queue (8 entries) though.

The first line "do ; while ( inportb(0x9AE8) & 0xFF );" waits until all 8 entries in the FIFO queue are empty.
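The wait loop above spins forever if the card never drains its queue. A bounded variant might look like the sketch below. It only assumes what the listing shows (poll port 0x9AE8 until the low 8 bits read as zero); the names wait_fifo_empty, port_read8_fn and fake_inportb are hypothetical, and the port read is passed in as a function pointer so the logic can be tried without real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define S3_STATUS_PORT 0x9AE8u  /* the port polled in the listing above */

/* Hypothetical abstraction over inportb(), so a stand-in can be plugged in. */
typedef uint8_t (*port_read8_fn)(uint16_t port);

/* Poll until the low 8 bits read as zero, as the listing does.
 * Returns the number of polls used, or -1 if still busy after max_polls. */
int wait_fifo_empty(port_read8_fn read8, int max_polls)
{
    assert(read8 != 0 && max_polls > 0);

    for (int polls = 1; polls <= max_polls; polls++) {
        if ((read8(S3_STATUS_PORT) & 0xFF) == 0)
            return polls;
    }
    return -1;
}

/* Stand-in for inportb(), for trying the loop without an S3 card:
 * each read reports one fewer busy bit than the previous read. */
uint8_t fake_fifo_bits = 0x07;

uint8_t fake_inportb(uint16_t port)
{
    (void)port;                 /* the stand-in models only the one port */
    uint8_t now = fake_fifo_bits;
    fake_fifo_bits >>= 1;
    return now;
}
```

In a real driver read8 would simply wrap inportb(), and a -1 result could be treated as "the accelerator is stuck, fall back to a software copy".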

The next line "outportw(0xBAE8, 0x67);" sets the "Foreground Mix Register" so that the source data replaces the existing data (and doesn't mix the source and destination data using boolean operations like "dest = source NOT dest" or "dest = source AND dest").

The next line "outportw(0xBEE8, 0xA000);" sets the "Multifunction Control Register" with "index = 0xA" and "data = 0x000". Index 0xA corresponds to a Pixel Control Register, where "data = 0x000" means don't pack the data (source data is in same format as destination data) and Foreground Mix is used.
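The index/data split described above can be written out as a small helper. This is a sketch only: mfc_word is a hypothetical name, and it assumes nothing beyond what the paragraph states (index in the top 4 bits, data in the low 12 bits of the 16-bit word written to 0xBEE8).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the 16-bit word for the register at 0xBEE8.
 * index - 4-bit register selector (0xA selects the Pixel Control Register)
 * data  - 12-bit value for the selected register */
uint16_t mfc_word(unsigned index, unsigned data)
{
    assert(index <= 0xF);
    assert(data <= 0xFFF);
    return (uint16_t)((index << 12) | data);
}
```

So outportw(0xBEE8, 0xA000) is the same as outportw(0xBEE8, mfc_word(0xA, 0x000)). It also shows why the later outportw(0xBEE8, dy-1) lands in index 0: any value below 0x1000 leaves the top 4 bits clear.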

The next 4 lines set the starting X and Y co-ordinates for the source data (I/O ports 0x86E8 and 0x82E8), and for the destination data (I/O ports 0x8EE8 and 0x8AE8).

The next 2 lines set the width of the rectangle (I/O port 0x96E8) and the height of the rectangle (I/O port 0xBEE8, which is the Multifunction Control Register again, this time with index 0).

The next line "do ; while ( inportb(0x9AE8) & 0xFF );" waits until all 8 entries in the FIFO queue are empty again. I don't know why this is done - my information suggests this isn't necessary.

The last line sends the "go" command. Be careful here, as the comments in the source saying which bits are set are wrong. Bits 13 to 15 determine the command type (BitBlit), bits 5 and 7 determine the orientation of the rectangle (if it's flipped horizontally or vertically), bit 2 determines if the last pixel of each line is drawn or not, bit 1 determines the pixel mode and bit 0 enables/disables writes.
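To check which bits the two command words really set, a throwaway helper like the one below is enough. The names set_bits and command_type are hypothetical; the only thing taken from the text above is that bits 13 to 15 hold the command type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: list the set bits of a 16-bit word, highest first.
 * Fills bits[] and returns how many entries were written. */
int set_bits(uint16_t value, int bits[16])
{
    assert(bits != 0);

    int count = 0;
    for (int bit = 15; bit >= 0; bit--) {
        if (value & (1u << bit))
            bits[count++] = bit;
    }
    return count;
}

/* Bits 13 to 15 of the command word: the command type field. */
unsigned command_type(uint16_t value)
{
    return (value >> 13) & 0x7u;
}
```

For 0xC013 this gives bits 15, 14, 4, 1 and 0, and for 0xC0F3 it gives bits 15, 14, 7, 6, 5, 4, 1 and 0. Both words have the same command type field (6), and they differ only in bits 5 to 7 (0xC013 ^ 0xC0F3 == 0x00E0), which fits the "dir" parameter changing nothing but the orientation of the blit.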

I should also point out that all of the registers used here are "S3 Enhanced Registers" that need to be unlocked before they can be used. These registers can also be memory mapped to improve performance (so that you can write to physical addresses between 0xA0000 and 0xAFFFF instead of using I/O ports).

Please note that the information I used to figure this out comes from VGADOC4b.ZIP and from a book I've got called "Programmer's Guide to the EGA, VGA, and Super VGA Cards".


Cheers,

Brendan

Posted: Mon Apr 30, 2007 2:49 am
by blackoil
Thanks for the detailed information, Brendan. I still have a Pentium 133 with an S3 ViRGE/GX2. I will try it tomorrow night. Do you think nVidia will release an I/O port reference document?

Posted: Mon Apr 30, 2007 2:51 am
by blackoil
BTW, is there a PDF version of "Programmer's Guide to the EGA, VGA, and Super VGA Cards"?

Posted: Mon Apr 30, 2007 5:55 am
by Combuster
blackoil wrote:Thanks for the detailed information, Brendan. I still have a Pentium 133 with an S3 ViRGE/GX2. I will try it tomorrow night. Do you think nVidia will release an I/O port reference document?
AFAIK NVidia keeps all information proprietary. Nevertheless there are some sites that document 2D accelerator functions. If you scrape enough bits and pieces together you might be able to collect enough to write a driver. I don't know any URLs but a few were posted on this board - you might want to look for those.
blackoil wrote:BTW, is there a PDF version of "Programmer's Guide to the EGA, VGA, and Super VGA Cards"?
Given that the link points to amazon.com, chances are pretty much nonexistent that there is a free and legal PDF version.

P.S. The only publicly available book on graphics hardware I know of is "Michael Abrash - Graphics Programming Black Book". It only contains info about the VGA though.

Posted: Sat May 05, 2007 7:10 am
by blackoil
After reading some source code (xfree86, freebe, openbe), I found that the procedure is something like this: find the nVidia device in PCI configuration space, then map its PCI I/O registers into host memory and initialize them. Hardware acceleration is then performed through a FIFO command/data queue. Is that correct?
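For the first step (finding the device in PCI configuration space), here is a minimal sketch. It assumes the legacy PCI configuration mechanism #1, where a 32-bit address is written to port 0xCF8 and the data is read from port 0xCFC. The helper names pci_config_address and is_nvidia are made up; 0x10DE is NVIDIA's PCI vendor ID. The actual port I/O is left out so the address arithmetic can be checked on its own.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_VENDOR_NVIDIA 0x10DEu   /* NVIDIA's PCI vendor ID */

/* Hypothetical helper: build the dword written to port 0xCF8
 * (configuration mechanism #1) to select a configuration register. */
uint32_t pci_config_address(unsigned bus, unsigned dev,
                            unsigned func, unsigned offset)
{
    assert(bus < 256 && dev < 32 && func < 8 && offset < 256);

    return 0x80000000u                 /* bit 31: enable configuration access */
         | ((uint32_t)bus  << 16)
         | ((uint32_t)dev  << 11)
         | ((uint32_t)func << 8)
         | (offset & 0xFCu);           /* registers are dword aligned */
}

/* The dword at offset 0 holds the vendor ID in its low 16 bits
 * and the device ID in its high 16 bits. */
int is_nvidia(uint32_t id_dword)
{
    return (id_dword & 0xFFFFu) == PCI_VENDOR_NVIDIA;
}
```

A scan would loop over bus, device and function, write pci_config_address(bus, dev, func, 0) to 0xCF8, read the dword back from 0xCFC and test it with is_nvidia(). The base address registers start at offset 0x10 of the same configuration header and give the physical addresses that need to be mapped.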