SVGA in 32-bit protected mode?

AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

Oh, so Bus Mastering acts like DMA :? Maybe this is the right way.
You also need to program the latency timer to a lower value.


/* ---
masks for status and command register bits
--- */
#define AGP_2_1x 0x00000001 /* AGP Revision 2.0 1x speed transfer mode */
#define AGP_2_2x 0x00000002 /* AGP Revision 2.0 2x speed transfer mode */
#define AGP_2_4x 0x00000004 /* AGP Revision 2.0 4x speed transfer mode */
#define AGP_3_4x 0x00000001 /* AGP Revision 3.0 4x speed transfer mode */
#define AGP_3_8x 0x00000002 /* AGP Revision 3.0 8x speed transfer mode */
#define AGP_rates 0x00000007 /* mask for supported rates info */
#define AGP_rate_rev 0x00000008 /* 0 if AGP Revision 2.0 or earlier rate scheme, 1 if AGP Revision 3.0 rate scheme */
#define AGP_FW 0x00000010 /* 1 if fastwrite transfers supported */
#define AGP_4G 0x00000020 /* 1 if addresses above 4G bytes supported */
#define AGP_SBA 0x00000200 /* 1 if sideband addressing supported */
#define AGP_RQ 0xff000000 /* max. number of enqueued AGP command requests supported, minus one */
#define AGP_RQ_shift 24
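As a rough illustration of how these masks could be applied, the sketch below decodes a raw AGP status register value. The helper names `agp_max_rate` and `agp_queue_depth` are made up for this example; only the mask values come from the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Masks copied from the listing above. */
#define AGP_rates     0x00000007u
#define AGP_rate_rev  0x00000008u
#define AGP_RQ        0xff000000u
#define AGP_RQ_shift  24

/* Hypothetical helper: highest transfer multiplier (1, 2, 4 or 8)
 * advertised in a status register value, or 0 if no rate bit is set. */
unsigned agp_max_rate(uint32_t status)
{
    uint32_t rates = status & AGP_rates;
    if (status & AGP_rate_rev) {        /* AGP 3.0 rate scheme */
        if (rates & 0x2u) return 8;
        if (rates & 0x1u) return 4;
        return 0;
    }
    if (rates & 0x4u) return 4;         /* AGP 2.0 rate scheme */
    if (rates & 0x2u) return 2;
    if (rates & 0x1u) return 1;
    return 0;
}

/* Hypothetical helper: request queue depth.
 * The register field stores the depth minus one. */
unsigned agp_queue_depth(uint32_t status)
{
    return ((status & AGP_RQ) >> AGP_RQ_shift) + 1u;
}
```

Reading the status register itself (through PCI configuration space) is left out here; the helpers only interpret a value that has already been read.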


http://web.inter.nl.net/users/be-hold/BeOS/NVdriver/download.html
http://web.inter.nl.net/users/be-hold/BeOS/Downloads/agp_driver_V2.00.zip


This is it.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

AirFlight wrote:Thank you,
http://bos.asmhackers.net/docs/vga_without_bios/
I have been hunting for it my whole life. :D
Now it is time to study this code.
Concerning VGA programming, you might want to read these as well:
http://www.osdever.net/FreeVGA/vga/vga.htm - most complete VGA reference around (and rather educational as well)
http://home.worldonline.dk/finth/ - for details on many other graphics cards
http://dimensionalrift.homelinux.net/co ... vga_io.bas - sample vga mode setting code from my own OS

And for AGP:
http://www.osdev.org/osfaq2/index.php/AGP%20information
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

Thanks!
When I find the simplest way to activate bus mastering, I will post the code.
MagicalTux
Posts: 22
Joined: Mon Dec 04, 2006 5:34 pm

Post by MagicalTux »

Wow, I'm happy my post got so many answers.

Well, for my OS, I decided to use the "easy way". At boot, I list the PCI devices and detect the QEMU card. Card 1234:1111 is the "dummy VGA card with Bochs VESA extensions", and when it is found the kernel loads the appropriate gfx module.
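A minimal sketch of the detection step, assuming the standard PCI configuration mechanism #1 (address written to port 0xCF8, data read from 0xCFC). The function names are invented for illustration, and the actual port I/O is left out so that the logic can be checked on its own.

```c
#include <assert.h>
#include <stdint.h>

#define BOCHS_VGA_VENDOR 0x1234u   /* IDs quoted in the post above */
#define BOCHS_VGA_DEVICE 0x1111u

/* Build the value written to the PCI CONFIG_ADDRESS port (0xCF8). */
uint32_t pci_config_address(unsigned bus, unsigned dev,
                            unsigned func, unsigned reg)
{
    assert(bus < 256 && dev < 32 && func < 8 && reg < 256);
    return 0x80000000u                 /* enable bit */
         | ((uint32_t)bus  << 16)
         | ((uint32_t)dev  << 11)
         | ((uint32_t)func << 8)
         | (reg & 0xFCu);              /* dword-aligned register offset */
}

/* Register 0 holds the vendor ID in the low word and the device ID
 * in the high word. Returns 1 for the Bochs/QEMU VGA card. */
int is_bochs_vga(uint32_t id_dword)
{
    return (id_dword & 0xFFFFu) == BOCHS_VGA_VENDOR
        && (id_dword >> 16)     == BOCHS_VGA_DEVICE;
}
```

A boot-time scan would loop over bus/device/function, read register 0 for each, and call `is_bochs_vga` on the result.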

This module was built thanks to core's code and provides a generic interface for gfx init (specific to my os).
When the OS asks for a gfx mode (e.g. 800x600x32), the appropriate function is called, requests the gfx mode, and checks whether it was set successfully. The module also includes a dummy function that just returns a pointer to 0xe0000000, and provides some functions for gfx operations (like drawing text or displaying an image).
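For readers who have not seen the Bochs VBE interface, here is a hedged sketch of what "requesting the gfx mode" might look like. It is not the module described above. The port numbers and register indexes are the Bochs "DISPI" constants as far as I know them; `struct dispi_write` and `bochs_mode_writes` are invented names, and the function only builds the list of register writes instead of touching the ports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bochs VBE ("DISPI") interface constants. */
#define VBE_DISPI_IOPORT_INDEX 0x01CE
#define VBE_DISPI_IOPORT_DATA  0x01CF
#define VBE_DISPI_INDEX_XRES   1
#define VBE_DISPI_INDEX_YRES   2
#define VBE_DISPI_INDEX_BPP    3
#define VBE_DISPI_INDEX_ENABLE 4
#define VBE_DISPI_DISABLED     0x00
#define VBE_DISPI_ENABLED      0x01
#define VBE_DISPI_LFB_ENABLED  0x40

struct dispi_write { uint16_t index; uint16_t value; };

/* Fill out[] with the register writes for one mode and return how many
 * entries were written. A real driver would, for each entry, write
 * `index` to the INDEX port and `value` to the DATA port. */
size_t bochs_mode_writes(uint16_t width, uint16_t height, uint16_t bpp,
                         struct dispi_write out[5])
{
    assert(out != NULL);
    out[0] = (struct dispi_write){ VBE_DISPI_INDEX_ENABLE, VBE_DISPI_DISABLED };
    out[1] = (struct dispi_write){ VBE_DISPI_INDEX_XRES, width };
    out[2] = (struct dispi_write){ VBE_DISPI_INDEX_YRES, height };
    out[3] = (struct dispi_write){ VBE_DISPI_INDEX_BPP, bpp };
    out[4] = (struct dispi_write){ VBE_DISPI_INDEX_ENABLE,
                                   VBE_DISPI_ENABLED | VBE_DISPI_LFB_ENABLED };
    return 5;
}
```

Checking that the mode "was set successfully" could then be done by reading the same registers back and comparing them with the requested values.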

So, basically, my kernel is able to detect QEMU (and soon vmware too), init it, and tell us if everything's OK.

Now we're working on mouse code, while some other people work on making gfx drivers for some other kinds of gfx cards~ I'd say everything is going well :)
AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

I wanted my OS to run at a smooth frame rate. At first I thought I had to activate the AGP features, but then I remembered that I have a Voodoo 3 PCI. I put it in a PCI slot and it worked perfectly fine on XP; everything was smooth. It is not AGP, and when I looked at its capabilities I found that it does not even support bus mastering.


So my conclusion is that if you want good 2D performance, a PCI slot is perfectly adequate; even without bus mastering it can be made to work fine.
PCI speed 133 MB/s ≈ 30 FPS * 4 bytes * 1024 * 1024 pixels
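The estimate above can be checked with a few lines of integer arithmetic. This is only a back-of-the-envelope sketch: the function names are made up, and it assumes the whole theoretical bus bandwidth is available for pixel data.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one full frame. */
uint64_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Upper bound on whole frames per second for a given bus bandwidth,
 * ignoring arbitration and addressing overhead. */
uint32_t max_fps(uint64_t bus_bytes_per_sec, uint64_t bytes_per_frame)
{
    assert(bytes_per_frame > 0);
    return (uint32_t)(bus_bytes_per_sec / bytes_per_frame);
}
```

With 133,000,000 bytes/s, a 1024x1024x32 frame gives at most 31 full frames per second and a 1024x768x32 frame at most 42.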

When I ran XP in safe mode, the display was very slow.
When I ran XP in normal mode, it was fine.
So it is not because of PCI vs. AGP.


My OS in VESA 1024x768x32 is terribly slow.
Maybe XP uses some tricks and does not put images into the frame buffer directly. Maybe it uploads many images to video memory beforehand and then just sends pointers to the GPU, which writes them to a second frame buffer at a speed of 30 GB/s.


Control Panel->Display->Settings->Advanced->TroubleShoot
It has been present since Win95. Here you can disable or enable hardware acceleration (cursor drawing, bitmap drawing, shape drawing), which is controlled by the drivers. If you disable it, the display behaves just like Safe Mode or my OS in VESA 1024x768x32 mode.
So it is not all down to the PCI or AGP bus.

1. A video memory manager should be built to keep track of uploaded image bitmaps.
2. Somehow I need to send the GPU an image transfer command with the parameters Source, Destination, Width, Height.
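The two points above could be modelled roughly as follows. This is a generic sketch and not the command format of any real card: `struct blit_cmd`, `struct vram_pool` and both functions are hypothetical, and the allocator is a simple bump allocator that never frees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical blit request; offsets are bytes from the start of video memory. */
struct blit_cmd {
    uint32_t src_offset;
    uint32_t dst_offset;
    uint16_t width;
    uint16_t height;
};

/* Bump allocator over the off-screen part of video memory. */
struct vram_pool {
    uint32_t next;   /* first free byte */
    uint32_t end;    /* one past the last usable byte */
};

#define VRAM_ALLOC_FAILED 0xFFFFFFFFu

/* Reserve `size` bytes for an uploaded bitmap; returns its offset
 * or VRAM_ALLOC_FAILED when the pool is exhausted. */
uint32_t vram_alloc(struct vram_pool *pool, uint32_t size)
{
    assert(pool != 0 && pool->next <= pool->end);
    uint32_t aligned = (size + 15u) & ~15u;   /* 16-byte granularity, arbitrary */
    if (aligned < size || aligned > pool->end - pool->next)
        return VRAM_ALLOC_FAILED;
    uint32_t offset = pool->next;
    pool->next += aligned;
    return offset;
}

/* Package the four parameters named in the post into one command. */
struct blit_cmd make_blit(uint32_t src, uint32_t dst, uint16_t w, uint16_t h)
{
    struct blit_cmd cmd = { src, dst, w, h };
    return cmd;
}
```

How such a command is actually handed to the GPU depends entirely on the card and has to come from its documentation.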


Some help please.
Just do it
Dex
Member
Posts: 1444
Joined: Fri Jan 27, 2006 12:00 am

Post by Dex »

This may help with your card: http://homepage.swissonline.ch/tinyasm/v3.htm
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
AirFlight wrote:I wanted my OS to run at a smooth frame rate. At first I thought I had to activate the AGP features, but then I remembered that I have a Voodoo 3 PCI. I put it in a PCI slot and it worked perfectly fine on XP; everything was smooth. It is not AGP, and when I looked at its capabilities I found that it does not even support bus mastering.
For Voodoo3, fortunately the documentation is available online - try here if you haven't already got the documentation (I'm thinking you probably already have it and that's why you chose Voodoo3, but just in case)...

I've only had a brief look at it, but it seems to me that there's a lot of things that can be done to improve performance, including memory mapping the card's registers (faster access than using I/O ports), uploading your textures into the video card, building lists of commands for the video card to process (including "screen to screen" bit-blits, which I guess would actually do "video memory to video memory" bit-blits), page flipping, hardware (mouse) cursor, etc.

I'd also assume the card has a command FIFO buffer, where an IRQ is generated when the FIFO is empty (or close to empty), so that the driver can send some commands initially and then keep sending more commands each time an IRQ is received (so that the GPU is constantly fed with work until the entire list of commands is completed).
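To make the "keep feeding the FIFO on each IRQ" idea concrete, here is a small sketch of a software command queue. It is a generic model and not the Voodoo3 interface: the queue size, the names, and the `hw_fifo` array standing in for the card's FIFO are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 256u   /* power of two, chosen arbitrarily */

struct cmd_queue {
    uint32_t cmds[QUEUE_SIZE];
    uint32_t head;   /* next command to hand to the card */
    uint32_t tail;   /* next free slot for the driver */
};

/* Driver side: queue one command. Returns 0 if the queue is full. */
int queue_push(struct cmd_queue *q, uint32_t cmd)
{
    assert(q != 0);
    if (q->tail - q->head == QUEUE_SIZE)
        return 0;
    q->cmds[q->tail % QUEUE_SIZE] = cmd;
    q->tail++;
    return 1;
}

/* IRQ side: move up to `fifo_free` queued commands into hw_fifo[]
 * (a stand-in for the card's FIFO). Returns how many were moved. */
uint32_t queue_feed(struct cmd_queue *q, uint32_t *hw_fifo, uint32_t fifo_free)
{
    assert(q != 0 && hw_fifo != 0);
    uint32_t n = 0;
    while (n < fifo_free && q->head != q->tail) {
        hw_fifo[n++] = q->cmds[q->head % QUEUE_SIZE];
        q->head++;
    }
    return n;
}
```

The driver would call `queue_feed` once to start the card and then again from the "FIFO nearly empty" interrupt handler until it returns 0.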

To me this means you'd have up to 3 frames at various stages (one that is being seen by the user, one that the video card is drawing and one that your driver is preparing), and you'd use steps that go something like this:
    a) Build a list of actions and a list of things to upload
    b) Upload any new data
    c) Start sending the list of commands to the video card
    d) Build a new list of actions and a new list of things to upload (while sending any remaining commands from the previous list of commands each IRQ)
    e) Upload any new data (while sending any remaining commands from the previous list of commands each IRQ)
    f) Wait for the previous list of commands to be complete
    g) Do a page flip (so the results of the previous list of commands can be seen)
    h) Free any video memory that is no longer being used
    i) Start sending the new list of commands to the video card
    j) Go back to step d
Hmm, forget the above. It looks like the card has a pair of independent command FIFOs, so it can process 2 separate lists of commands at the same time!

Ok, I don't know enough about this video card and I'm only guessing. I'd probably spend several months trying to understand the documentation and experimenting before I started writing a device driver for it.... ;)


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,

Just another quick note - you might want to consider setting up a "dual display" computer specifically for video device driver experiments. The idea is that you use one video card to see what you're doing while you mess about with the other video card. That way you could interactively/manually view and/or modify the video card's registers to see what happens (using some sort of temporary command line tool perhaps). :)


Cheers,

Brendan
AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

Thanks for the info. My goal, for now, is simple generic 2D commands.
Fortunately my main card is a GeForce 5200, so I will look for asm commands which are the same for both.


Cheers!
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Post by Candy »

AirFlight wrote:So my conclusion is that if you want good 2D performance, a PCI slot is perfectly adequate; even without bus mastering it can be made to work fine.
PCI speed 133 MB/s ≈ 30 FPS * 4 bytes * 1024 * 1024 pixels

When I ran XP in safe mode, the display was very slow.
When I ran XP in normal mode, it was fine.
So it is not because of PCI vs. AGP.

My OS in VESA 1024x768x32 is terribly slow.
PCI and AGP are both PCI, and PCI has a number of different modes of operation. First off, the video card is a PCI device that has a number of bytes of data and 4 shared interrupt lines (3 of which are commonly not used). PCI data transmissions are either byte/word/dword transmissions or burst transmissions. The transmissions can be started from either side: the processor (normal transaction) or the card itself (so-called bus mastering).

The transmissions happen as follows:
Bus arbitration occurs, the initiator contacts the other side, sends information about the type of transaction, starts the transaction and ends it.

If this happens for every byte (normally in your hobby OS or in VESA mode), you have an overhead of 10 PCI cycles or such per byte. Each PCI cycle is 30ns, so that's 300 nanoseconds per byte. That's a top speed of 3.3MB/s, assuming that the rest of the PCI cards are idle.

If you do this once per cache line of data, you have the 9 cycles of overhead from above (minus the one for transmitting a byte of data), plus 16 cycles of data submission. That's 25 cycles for 64 bytes of data. Same calculation as above: 25 cycles at 30 ns is 750 ns per 64 bytes, or roughly 85 MB/s, still assuming the rest is idle.
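The two cases can be worked through in code. This sketch just repeats the arithmetic with the cycle counts and the 30 ns cycle time given in this post; the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CYCLE_NS 30u   /* cycle time used in the post (33 MHz bus) */

/* Bytes per second when every transaction costs `overhead_cycles` plus
 * `data_cycles` bus cycles and moves `bytes_per_txn` bytes. */
uint64_t pci_bytes_per_sec(uint32_t overhead_cycles, uint32_t data_cycles,
                           uint32_t bytes_per_txn)
{
    uint64_t ns = (uint64_t)(overhead_cycles + data_cycles) * PCI_CYCLE_NS;
    assert(ns > 0);
    return (uint64_t)bytes_per_txn * 1000000000u / ns;
}
```

`pci_bytes_per_sec(9, 1, 1)` gives about 3.3 million bytes per second for single-byte writes, and `pci_bytes_per_sec(9, 16, 64)` gives about 85 million bytes per second for 64-byte bursts.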

This assumes that your cache system / pci controller are smart enough to use PCI burst transactions (and believe me, if they weren't your computer would be lying in a gutter somewhere due to performance problems). So, the main trick is to enable caching that doesn't write to video memory directly but only when the cache line is full, and using few cache lines at the same time. So, write from left to right, line per line from top to bottom and set the caching to, for example, write-back. That should give you proper speed for the operations.
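A sketch of the access pattern described here: strictly sequential writes, left to right and top to bottom, one dword per store. The function name and parameters are assumptions for illustration. Whether the writes are actually combined into bursts depends on how the framebuffer memory is configured for caching, which this code does not set up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a 32 bpp image into a framebuffer row by row.
 * Both pitches are in bytes and must be multiples of 4. */
void blit_rows(volatile uint32_t *fb, size_t fb_pitch,
               const uint32_t *src, size_t src_pitch,
               size_t width, size_t height)
{
    assert(fb_pitch % 4 == 0 && src_pitch % 4 == 0);
    for (size_t y = 0; y < height; y++) {
        volatile uint32_t *dst_row = fb + y * (fb_pitch / 4);
        const uint32_t *src_row = src + y * (src_pitch / 4);
        for (size_t x = 0; x < width; x++)
            dst_row[x] = src_row[x];   /* ascending addresses within the row */
    }
}
```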
Brendan wrote:Just another quick note - you might want to consider setting up a "dual display" computer specifically for video device driver experiments. The idea is that you use one video card to see what you're doing while you mess about with the other video card. That way you could interactively/manually view and/or modify the video card's registers to see what happens (using some sort of temporary command line tool perhaps). :)
I was going for a serial terminal, but a second monitor is of course equally usable.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
Candy wrote:
Brendan wrote:Just another quick note - you might want to consider setting up a "dual display" computer specifically for video device driver experiments. The idea is that you use one video card to see what you're doing while you mess about with the other video card. That way you could interactively/manually view and/or modify the video card's registers to see what happens (using some sort of temporary command line tool perhaps). :)
I was going for a serial terminal, but a second monitor is of course equally usable.
Equally usable, however using a second monitor has the additional advantage of allowing a developer to be sure that their code initialises the video card completely (i.e. doesn't rely on any prior initialization done during boot by the video card's ROM) and doesn't assume anything about memory ranges or I/O port ranges. Once you know a video card doesn't rely on any of these things you can be (almost) sure that it will work when there's only one video card in the computer, or when there's N completely different video cards... ;)


Cheers,

Brendan
AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

I started AGP 4x with sideband addressing and fast writes by writing:
Intel 850 command register PCI[800000a8h], 314h
GeForce 5200 command register PCI[8001004ch], 1f000314h
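For anyone trying to read those numbers, the sketch below splits the PCI configuration addresses into bus/device/function/register and rebuilds the command values from individual bits. The bit positions for AGP enable (0x100) and the other command bits are my assumption based on the AGP 2.0 register layout; they are not taken from this thread, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

struct pci_location { unsigned bus, dev, func, reg; };

/* Split a PCI CONFIG_ADDRESS value (mechanism #1) into its fields. */
struct pci_location pci_decode_address(uint32_t addr)
{
    assert(addr & 0x80000000u);   /* enable bit must be set */
    struct pci_location loc;
    loc.bus  = (addr >> 16) & 0xFFu;
    loc.dev  = (addr >> 11) & 0x1Fu;
    loc.func = (addr >> 8)  & 0x07u;
    loc.reg  = addr & 0xFCu;
    return loc;
}

/* AGP command register bits (assumed AGP 2.0 layout). */
#define AGP_CMD_RATE_MASK 0x007u
#define AGP_CMD_FW        0x010u
#define AGP_CMD_ENABLE    0x100u
#define AGP_CMD_SBA       0x200u

/* Compose a command value with AGP enabled. `rate_bits` is the raw
 * rate field (0x4 = 4x, 0x2 = 2x in the AGP 2.0 scheme). */
uint32_t agp_make_command(uint32_t rate_bits, int fast_write, int sideband,
                          uint32_t rq_minus_one)
{
    assert((rate_bits & ~AGP_CMD_RATE_MASK) == 0 && rq_minus_one <= 0xFFu);
    uint32_t cmd = AGP_CMD_ENABLE | rate_bits | (rq_minus_one << 24);
    if (fast_write) cmd |= AGP_CMD_FW;
    if (sideband)   cmd |= AGP_CMD_SBA;
    return cmd;
}
```

Under these assumptions, 800000a8h is register A8h of bus 0, device 0 and 8001004ch is register 4Ch of bus 1, device 0, while 314h means "AGP enabled, 4x, fast writes, sideband addressing".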


Is AGP now enabled?
I wasn't sure that it was enabled, so in Win XP I tested
Intel 850 command register out 800000a8h, 312h
GeForce 5200 command register out 8001004ch, 1f000312h
And when I started the Nvidia display panel it said that the GeForce is running
AGP 2x, so it is certainly making some changes.


So when my OS test boots, it does these steps:
1. Turn on 1024x768x32 VESA
2. Enable AGP 4x FW SB
3. Start an infinite loop that writes a 1024x768x24 bitmap directly to the GeForce prefetchable memory framebuffer at 0d0000000h; the bitmap moves from left to right (like a billboard).
4. All interrupts are off
I measured the time the bitmap needs for one pass from the left edge of the screen to the right edge, and it is 60 sec. Because the horizontal resolution is 1024, that many bitmaps are sent to the card in that time, so it is
1024/60 = 17 FPS. And that is 17 FPS * 1024 * 768 * 4 = 51 MB/s.
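The measurement can be turned into a throughput figure without rounding the frame rate first. A minimal sketch with a made-up function name, assuming 4 bytes per pixel as in the calculation above:

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes per second, given how many full frames were written
 * in `seconds` seconds. */
uint64_t measured_bytes_per_sec(uint32_t frames, uint32_t seconds,
                                uint64_t bytes_per_frame)
{
    assert(seconds > 0);
    return (uint64_t)frames * bytes_per_frame / seconds;
}
```

For 1024 frames of 1024 * 768 * 4 bytes in 60 seconds this gives 53,687,091 bytes/s, which is about 51 MB/s when a megabyte is counted as 1024 * 1024 bytes.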
I just do not know how to make it faster.
When I disabled AGP, the drawing time did not change.
One thing that I discovered is that
using MMX is twice as fast as using eax, which is twice as fast as using ax, which is twice as fast as using al.
When I tried to write to the framebuffer with 128-bit SSE there was no response.


What is the cache line size, and is it enabled?
All those 2D acceleration features are good, but if you want to watch a movie which is 1024x768x32 and has to be sent directly in real time, all that 2D acceleration is useless.
As far as I found, an AGP card is a bus master and it should start transactions by addressing the AGP port, which is the target. If you try to move data to the framebuffer with the CPU, then the GeForce acts like a PCI target, so moving to address 0d0000000h is maybe wrong. Maybe I should move the frame data with the CPU to AGP aperture memory and then the card will load it itself.
I have several addresses and I tried to write the framebuffer at all of them after turning on the AGP port.
d0000000h - SLOW
d8000000h - NOTHING
dc000000h - NOTHING
e0000000h - CARD CRASHES AND MONITOR TURNS OFF
Does anyone have any idea what I should try next to increase the frame rate, and how to use the benefits of AGP operations?
AirFlight
Member
Posts: 26
Joined: Wed Dec 06, 2006 6:34 am

Post by AirFlight »

1. I just found that it is true that only the GPU itself can use AGP. Everything else, like CPU writes to the framebuffer, is just a PCI 66 MHz transaction. After enabling AGP and the other fast features, all that is left is to program the GPU as a master to fill its framebuffer from system memory and to raise an interrupt after that. The AGP target is, as I found, the northbridge, which is also like a PCI device, and it has the GART just below its Status and Command registers.
So my frame buffer could be at address 200000h, and I can map it through the GART and program the GPU to copy it to d0000000h.
Is that true?
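As a way of picturing the mapping idea in point 1, here is a deliberately simplified software model of a GART: a table that translates addresses inside the aperture to physical pages. The table layout, page size, entry count and all names are assumptions for illustration; real chipsets define their own entry format and registers.

```c
#include <assert.h>
#include <stdint.h>

#define GART_PAGE_SIZE 4096u
#define GART_ENTRIES   1024u          /* 4 MiB aperture, arbitrary */
#define GART_UNMAPPED  0xFFFFFFFFu

struct gart_model {
    uint32_t aperture_base;            /* bus address where the aperture starts */
    uint32_t page_phys[GART_ENTRIES];  /* physical page address per aperture page */
};

void gart_init(struct gart_model *g, uint32_t aperture_base)
{
    assert(g != 0);
    g->aperture_base = aperture_base;
    for (uint32_t i = 0; i < GART_ENTRIES; i++)
        g->page_phys[i] = GART_UNMAPPED;
}

/* Point aperture page `index` at the physical page starting at `phys`. */
void gart_map(struct gart_model *g, uint32_t index, uint32_t phys)
{
    assert(index < GART_ENTRIES && phys % GART_PAGE_SIZE == 0);
    g->page_phys[index] = phys;
}

/* Translate an aperture address to a physical address, or GART_UNMAPPED. */
uint32_t gart_translate(const struct gart_model *g, uint32_t addr)
{
    if (addr < g->aperture_base)
        return GART_UNMAPPED;
    uint32_t off = addr - g->aperture_base;
    uint32_t index = off / GART_PAGE_SIZE;
    if (index >= GART_ENTRIES || g->page_phys[index] == GART_UNMAPPED)
        return GART_UNMAPPED;
    return g->page_phys[index] + off % GART_PAGE_SIZE;
}
```

In this model a buffer at 200000h would be made visible to the card by mapping its pages into consecutive aperture slots; the card then reads through the aperture instead of the scattered physical pages.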


2. Another thing that I noticed is that if you write directly to the framebuffer in VESA window mode, speed increases 2x when you enable AGP. If you write in VESA linear framebuffer mode, there is no change in speed with AGP on or off.
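For context on the difference between the two modes: in window (banked) mode only a small window of video memory is visible at a time, so every pixel address has to be split into a bank number and an offset inside the window. A minimal sketch, with made-up names; the granularity would normally come from the VBE mode information, and 64 KiB in the example is only a common value.

```c
#include <assert.h>
#include <stdint.h>

struct bank_pos { uint32_t bank; uint32_t offset; };

/* Bank number and offset inside the window for pixel (x, y).
 * `pitch` is bytes per scanline, `granularity` is bytes per bank step. */
struct bank_pos vbe_bank_for_pixel(uint32_t x, uint32_t y, uint32_t pitch,
                                   uint32_t bytes_per_pixel,
                                   uint32_t granularity)
{
    assert(granularity > 0);
    uint32_t linear = y * pitch + x * bytes_per_pixel;
    struct bank_pos p = { linear / granularity, linear % granularity };
    return p;
}
```

In linear framebuffer mode the `linear` value is used directly as an offset from the framebuffer base, so no bank switching is needed.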


3. Did any of you succeed in attaching this SNAP Graphics driver to your OS?
http://www.scitechsoft.com/products/product_download.html
Well, if nobody knows how to program the GPU to fill its framebuffer by itself, or if every graphics card is programmed differently for that simple transaction, I guess that this SNAP driver is the future. I see it as the new VESA.
SciTech says that it supports:


3dfx Voodoo Banshee, Voodoo3, Voodoo4, Voodoo5

3DLabs Permedia, Permedia 2, Permedia 2V, Permedia 3

Alliance ProMotion 6422, ProMotion AT24, ProMotion AT3D

AMD Geode GX2

ARK 2000PV, 2000MT, 2000MI (Quadro64), 2000MI+ (Quadro64)

ATI Mach64 GX, Mach64 CX, Mach64 CT, Mach64 VT, 3D Rage, Mach64 VTB, 3D Rage II, 3D Rage II+, Mach64 VT4, 3D Rage IIC, 3D Rage Pro, 3D Rage LT Pro, Rage Mobility, Rage XL, Rage 128, Rage 128 Pro, Rage 128 Ultra, Rage Mobility 128, Rage Mobility 128-D4x, ES1000, Radeon 7200, Radeon 7000, Radeon IGP 320M / 340M, Mobility Radeon, Mobility Radeon 7000 IGP, Radeon 7500, Mobility Radeon 7500, Radeon 8500, Radeon 8500DV, Mobility Radeon 9000, Mobility Radeon 9000 IGP, Radeon 9000 Series, Radeon 9100 Pro IGP, Radeon 9100, Mobility Radeon 9200, Radeon 9200 Series, Radeon 9500, Radeon 9500 Pro, Radeon 9550, Mobility Radeon 9550, Mobility Radeon 9600, Radeon 9600 Series, Radeon 9600 XT, Radeon 9700 Pro, Mobility Radeon 9800, Radeon 9800, Radeon 9800 Pro, Radeon 9800 XT, Mobility Radeon X300, Mobility Radeon X600, Mobility Radeon XPress 200, Radeon XPress 200, Radeon X300 Series, FireMV 2200, Radeon X550, Radeon X600 Series, Radeon X700 Series, Mobility Radeon X800, Radeon X800 Series, Radeon X850 Series

Chips & Technologies 65548, 65550, 65554, 65555, 69000

Cirrus Logic CL-GD7543 LCD, CL-GD5434, CL-GD5440, CL-GD5436, CL-GD5446, CL-GD7555 LCD, Laguna 5462, Laguna 5464, Laguna 5465

Cyrix MediaGX

IBM VGA Compatible

InteGraphics CyberPro 2000, CyberPro 2010

Intel i740, i740 PCI, i810, i810/DC100, i810e, i815, i845G/GL/GV, i852/i855 GM/GME, i865G/GL/GV, i915G/GV, i915GM/GMS, i945G, i945GM

Matrox MGA Millennium, MGA Millennium II, MGA Mystique, MGA Mystique 220, MGA-G100, MGA-G200, MGA-G400, MGA-G450, MGA-G550, Parhelia, MGA-P750, MGA-P650

NeoMagic MagicGraph 128, MagicGraph 128ZV, MagicGraph 128XD, MagicGraph 256AV, MagicMedia 256AV+, MagicMedia 256ZX, MagicMedia 256XL+

Number Nine Imagine 128, Imagine 128 II, Imagine 128 II VRAM, Imagine 128 II DRAM, Ticket 2 Ride WRAM, Ticket 2 Ride SGRAM, Ticket 2 Ride IV

NVIDIA RIVA-128, RIVA-128ZX, RIVA-TNT, RIVA-TNT2, RIVA-TNT2 M64, RIVA-TNT2 Vanta, RIVA-TNT2 Ultra, GeForce 256, GeForce DDR, Quadro, GeForce2 Integrated GPU, GeForce2 Ti, GeForce2 GTS, GeForce2 MX 100/200, GeForce2 MX/MX 400, GeForce2 Ultra, GeForce4 MX 420, GeForce3, Quadro2, GeForce4 MX 440, GeForce4 MX 440 8X, GeForce4 MX 460, GeForce4 MX 4000, GeForce4 Integrated GPU, Quadro4 NVS, GeForce4 Ti 4200, GeForce4 Ti 4200 8X, GeForce4 Ti 4400, GeForce4 Ti 4600, GeForce4 Ti 4800, Quadro4 XGL, GeForce FX 5200, GeForce FX 5500, GeForce FX 5600 Series, GeForce FX 5700 Series, GeForce PCX 5300, GeForce PCX 5750, Quadro FX, GeForce 6200 Series, GeForce 6600 Series

OAK Spitfire 64107, Spitfire 64111, Eon 64017, Eon 64217, Warp 5

Philips 9710

Rendition Verite V1000, Verite V2200

S3 Vision 864, Vision 964, Vision 868, Vision 968, Trio32, Trio64, Trio64V+, Trio64UV+, Trio64V2/DX, Virge, Virge/DX/GX, Virge/VX, Virge/GX2, Virge/MX, Trio3D, Trio3D/2X, Savage3D, Savage4, ProSavage (VIA PM133), ProSavage (VIA KM133), ProSavage (VIA PN133), ProSavageDDR (VIA PM266), ProSavageDDR (VIA KM266), Savage/MX/IX, SuperSavage/IXC, Savage2000

Sigma Designs RealMagic 64 GX

Silicon Motion LynxEM, Lynx3DM

SiS 6202, 6205, 6215, 5597/5598, 6326, 300, 305, 630, 315, 730, 5595/530, 5595/620

Trident TGUI9440, TGUI9440-R2, TGUI9680, ProVidia 9682, Cyber9385 LCD, ProVidia 9685, 3DImage 975, Cyber9397 LCD, 3DImage 985, Blade 3D, Blade 3D (VIA VT8501), Blade 3D (VIA VT8601)

Tseng Labs ET4000/W32p, ET6000, ET6100

VESA VBE 1.2, VBE 2.0, VBE 3.0

VIA CLE266, P4M800/VN800/CN800

Weitek P9000, P9100


That is quite enough for me. :)
The download is a single file of about 18 MB.