
Graphics driver interface design

Posted: Sun Dec 09, 2007 7:42 pm
by mystran
So... I wrote a minimal VGA 320x200x8 (rrrgggbb truecolor) driver yesterday, and while I'll probably have to figure out how to get VBE modes or something as fall-back modes when no special driver exists, this driver has a dual purpose: being the generic driver on which to base other drivers.

I thought I'd post about it here, in order to open a discussion. The current driver interface looks like this:

Code:

typedef struct vga_display_S vga_display;
struct vga_display_S {

    // mode data: width/height for clipping, p/d for access..
    // to plot a byte at (x,y), do data[x*d + y*p] = value
    struct {
        unsigned short w; // width
        unsigned short h; // height
        unsigned short p; // pitch = number of bytes in memory between lines
        unsigned short d; // depth = number of bytes per pixel

        struct { char r,g,b,a; } bits;  // number of bits for each channel
        struct { char r,g,b,a; } shift; // how many bits each channel is shifted
    } mode;

    void * buffer; // linear framebuffer address (default: 0xA0000)
    void * driver; // driver internal data, always 0 in default implementation

    // lock() the driver before accessing 'mode' or 'buffer' directly
    // unlock() the driver before calling any other driver functions
    //
    // The implementation need not be recursive, and lock should be blocking.
    //
    // A client should not rely on these, and it's fine to provide no-ops
    // where hardware doesn't mandate locking.
    //
    // The default implementation uses no-ops.
    //
    void   (*lock)(vga_display *);
    void (*unlock)(vga_display *);

    // setmode() sets a given graphics mode
    // FIXME: need a protocol for detection/selection of available modes
    //
    // The default implementation only supports mode=0 (320x200 with RRRGGGBB)
    //
    // Returns 0 on success, non-zero on failure.
    int (*setmode)(vga_display *, int mode);

    // flip() updates the screen when the driver is using hardware double
    // buffering or a software emulation of a linear frame-buffer.
    //
    // Software double-buffering should NOT be done at driver level.
    //
    // The default implementation is a no-op.
    //
    void   (*flip)(vga_display *);

    // blit() - copies an image from one address to another
    //
    // Implementations that provide acceleration should detect whether
    // either of the pointers points to video memory in order to choose
    // the correct strategy.
    //
    // When source data contains alpha-values, those are expected to be
    // copied as is to the destination (no alpha blending is done).
    //
    // Notice that the client is expected to do clipping.
    //
    // Returns 0 on success, non-zero on failure.
    //
    // The default implementation does a manual byte-by-byte copy.
    //
    int (*blit)(vga_display *,
            void * src_data, unsigned src_pitch,
            void * dest_data, unsigned dest_pitch,
            unsigned width, unsigned height);

};

vga_display * vga_init(void);

Ok, a bit longish with all the comments, but you get the point. The idea is that any driver init will call vga_init() and then override the setmode pointer, and, if necessary or desired, other pointers as well. The base driver happens to work such that this is enough.
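To make that concrete, here's a rough sketch of what a hypothetical accelerated driver's init might look like; the mydrv_* names are made up for illustration and aren't part of any real driver:

Code:

/* Hypothetical example: building on the base driver by overriding
 * selected function pointers. All mydrv_* names are placeholders,
 * defined elsewhere in the (imaginary) driver. */

int mydrv_setmode(vga_display * d, int mode);
int mydrv_blit(vga_display * d,
        void * src_data, unsigned src_pitch,
        void * dest_data, unsigned dest_pitch,
        unsigned width, unsigned height);

vga_display * mydrv_init(void)
{
    vga_display * d = vga_init();   /* start from the generic driver */
    if (!d) return 0;

    d->setmode = mydrv_setmode;     /* expose the card's own modes   */
    d->blit    = mydrv_blit;        /* hardware-accelerated copies   */
    /* lock/unlock/flip keep the default no-op implementations */

    return d;
}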

The driver model I have in mind is designed for truecolor only (bitdepth can vary, but I have no intention of supporting palettes). The above is still lacking mode-enumeration, not to mention alpha-blits and device-independent bitmaps (which should probably be taken into consideration at the driver level, as they'll likely be worth accelerating).
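If alpha-blits were added, the entry point could presumably just mirror blit(). As a sketch only:

Code:

/* Sketch only: a possible alpha-blit entry point mirroring blit().
 * Source pixels carry alpha (per mode.bits/mode.shift) and are blended
 * onto the destination instead of copied verbatim. Clipping is still
 * the client's job. */
int (*alphablit)(vga_display *,
        void * src_data, unsigned src_pitch,
        void * dest_data, unsigned dest_pitch,
        unsigned width, unsigned height);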

Anyway, I have two things in mind here... first of all, does the above look sensible to those (if any) who've written real accelerated drivers? Also, if somebody else has a generic graphics driver model already in place, does it look similar? If not, how does it work?

Finally, if we find that there are several people with similar ideas about how graphics drivers should work, would it be a totally stupid idea to attempt to design a common interface, in order to share code?

Re: Graphics driver interface design

Posted: Mon Dec 10, 2007 3:03 am
by AJ
mystran wrote:would it be a totally stupid idea to attempt to design a common interface, in order to share code?
Absolutely not - I would be up for this. The only problem from your point of view is that you are ready for this now (and so would be left designing it), whereas I'm not going to be in that situation for another 6 months.

Supposing a common interface was designed (and yes, I'm aware of UDI and EDI), it would be possible to provide a facility where people could write drivers for known devices and, once tested, upload their code to a common web site / SVN repository or whatever. Even if your OS was going to have its own driver interfaces, if it could support the common interface as an extension, at least there would be a wealth of graphics boards it could support from the outset.

With regards to your other question, I haven't written accelerated drivers in the past (just VBE), but the interface looked scarily similar to yours (except it didn't have a setmode function). I say this is scary because I have never looked at another graphics driver and it's nice to know that others think in the same way!

If you get positive feedback and decide to try designing a common interface, I'll certainly try to help where possible.

Cheers,
Adam

Posted: Mon Dec 10, 2007 3:33 am
by liquid.silver
Would this common interface be strictly for graphics drivers or a general driver interface? I think it should probably be just for graphics. My reasoning is that it will be much simpler to design and could therefore possibly materialise. Later it could possibly be extended to other devices. I'm also far away from graphics at the moment, but I'm for the idea. Even if it isn't completed, much could be learnt from the process.

Posted: Mon Dec 10, 2007 3:57 am
by JamesM
As you ask for comments...

The locking mechanism is internal to the driver - it should be up to the driver's internal functions to ensure they lock correctly, not the caller. Why can't every driver function start with lock() and end with unlock()? Then you only have a few places to probe if locking fails, instead of everywhere the driver is called!

I would be up for a common interface, but bear in mind that I (and others) use C++ classes. I would bend to using C only for this though ;)

Posted: Mon Dec 10, 2007 6:14 am
by Combuster
One of my concerns is that many cards have few actual limitations on resolutions. Only through VBE is a limited set exposed. A real VGA can be configured for a screen any multiple of 8 pixels wide, and any vertical resolution (up to a certain limit).

If you have tried Windows programming you'll notice that mode-changing calls require four parameters: resolution x, y, bpp (color format), and refresh rate. The problem with mode numbers is whether you can even fit these into a 32-bit number. On the other hand, if you want to enumerate all supported modes, you get the same problems.

Besides, the monitor is the actual limit. Not the video card :wink:

Posted: Mon Dec 10, 2007 6:39 am
by mystran
JamesM wrote:The locking mechanism is internal to the driver - it should be up to the driver's internal functions to ensure they lock correctly, not the caller. Why can't every driver function start with lock() and end with unlock()? Then you only have a few places to probe if locking fails, instead of everywhere the driver is called!
Read the comments again. The design is that the driver must be in an "unlocked" state when any of its functions is called. That is to say, it must do the locking internally. The reason such functions exist in the interface is for direct buffer access: in order to provide generic software fall-backs for driver functions, one needs direct buffer access. If the hardware requires locking for such direct access (say it doesn't like direct poking while a bitblt is running), the locking has to be provided by the driver. IMHO it's better in such a case to export the locking than it is to require all software fall-backs to be reimplemented for such drivers.
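For example, a generic software fall-back that pokes the buffer directly would do something roughly like this (sketch only):

Code:

/* Sketch of a generic software fall-back: it touches 'mode' and 'buffer'
 * directly, so it brackets the access with lock()/unlock(). Driver entry
 * points themselves are always called with the display unlocked. */
static void sw_plot(vga_display * d, unsigned x, unsigned y, unsigned char value)
{
    d->lock(d);
    {
        unsigned char * fb = (unsigned char *) d->buffer;
        fb[y * d->mode.p + x * d->mode.d] = value;
    }
    d->unlock(d);
}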

JamesM wrote:I would be up for a common interface, but bear in mind that I (and others) use C++ classes. I would bend to using C only for this though ;)
C interfaces are trivial to wrap, and easier to define properly. I use C++ for most of my own application development, yet I normally use Win32 C interfaces (on Windows anyway) because in any C++ library there are just too many design decisions that can cause trouble.

Posted: Mon Dec 10, 2007 6:42 am
by mystran
liquid.silver wrote:Would this common interface be strictly for graphics drivers or a general driver interface? I think it should probably be just for graphics. My reasoning is that it will be much simpler to design and could therefore possibly materialise. Later it could possibly be extended to other devices. I'm also far away from graphics at the moment, but I'm for the idea. Even if it isn't completed, much could be learnt from the process.
Just for 2D graphics was my idea. Fully generic driver interfaces never materialize, and many other classes of devices can depend quite heavily on OS design decisions. Basic 2D graphics is a simple domain in the interface sense, because mostly it's just a set of functions that are implemented in hardware (instead of software). The worst case is having to catch an interrupt.

Posted: Mon Dec 10, 2007 6:47 am
by mystran
Combuster wrote:One of my concerns is that many cards have few actual limitations on resolutions. Only through VBE is a limited set exposed. A real VGA can be configured for a screen any multiple of 8 pixels wide, and any vertical resolution (up to a certain limit).

[...]

Besides, the monitor is the actual limit. Not the video card :wink:
Yeah, well, there are basically two sensible options. One is to have some enumeration of modes, and the other is having a generic timing-calculation framework with the driver providing the available limits.

My idea was that drivers either export a set of sensible (normal) modes, or query the monitor by DDC (where available) and return that list to the user.

But honestly, that's not designed properly yet. I had to hack in something to get a graphics mode up. ;)
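Just to make the mode-enumeration idea a bit more concrete, a possible shape would be something like this (pure sketch, none of it is final):

Code:

/* Rough sketch of a mode-enumeration hook; nothing here is designed yet.
 * The driver fills 'list' with up to 'max' entries (from a built-in
 * table or a DDC query of the monitor) and returns the total count;
 * setmode() would then take an index into that list. */
typedef struct {
    unsigned short w, h;      /* resolution in pixels               */
    unsigned char  bpp;       /* bits per pixel                     */
    unsigned char  refresh;   /* refresh rate in Hz, 0 = don't care */
} vga_modeinfo;

int (*getmodes)(vga_display *, vga_modeinfo * list, int max);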

Posted: Mon Dec 10, 2007 10:17 am
by Brendan
Hi,
mystran wrote:Yeah, well, there are basically two sensible options. One is to have some enumeration of modes, and the other is having a generic timing-calculation framework with the driver providing the available limits.
Make that 3 (hopefully sensible) options...

The other idea is to have a "resolution independent" video driver interface, e.g. where co-ordinates are specified as floating point values ranging from (-1, -1) to (1, 1). Then the video driver can automatically select and adjust the video mode using some sort of intelligent algorithm (e.g. based on the average time taken to generate previous frames), and/or the video mode could be directly controlled by the user, without applications, etc. knowing or caring which video mode is being used.
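The coordinate mapping itself is the easy part; as a quick sketch, the driver would just scale the virtual coordinates into whatever mode it happens to have selected:

Code:

/* Sketch: mapping a virtual coordinate in [-1.0, 1.0] onto a pixel
 * coordinate in whatever mode the driver has selected. */
static int virtual_to_pixel(double v, int pixels)
{
    return (int)((v + 1.0) * 0.5 * (pixels - 1) + 0.5);
}
/* e.g. x_px = virtual_to_pixel(vx, mode_width);
 *      y_px = virtual_to_pixel(vy, mode_height); */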


Cheers,

Brendan

Posted: Mon Dec 10, 2007 10:57 am
by JamesM
Brendan: That sounds sensible, but it would mean I'd have to handle saving floating point registers on task switches, which I *ahem* haven't quite got round to yet... :oops:

Posted: Mon Dec 10, 2007 1:12 pm
by mystran
Brendan wrote:The other idea is to have a "resolution independent" video driver interface, e.g. where co-ordinates are specified as floating point values ranging from (-1, -1) to (1, 1). Then the video driver can automatically select and adjust the video mode using some sort of intelligent algorithm (e.g. based on the average time taken to generate previous frames), and/or the video mode could be directly controlled by the user, without applications, etc. knowing or caring which video mode is being used.
Come on. 120 dpi (or so) on a typical display is hardly enough to start playing the "pixels aren't important" game. Try editing bitmapped images on a TFT screen in a non-native resolution and you'll see what I mean. And unless the real world turns into vector graphics, there remain valid reasons for sampled snapshots of it (also known as 'photos').

Granted, TFT internal interpolation might not be at the theoretical limit, but the better you want to make it, the more processor power you'll be wasting on it. Good-quality 2D convolution isn't exactly cheap, and even if the processor power were available, with typical display resolutions (and I really mean the 'dpi' values here) you'd have to choose between a blurred image, aliasing artifacts, or visible ringing from the anti-aliasing filter.

Oh, and even if we accepted that the system was useless for the purpose of editing photos, many types of graphics with a digital origin still benefit greatly from the bitmap nature of images, because things like continuous-time convolution aren't exactly easy on discrete computers. And convolution is, after all, the basis of a huge number of standard image-manipulation algorithms (or Photoshop filters, if you will).

I do think scalable vectorized GUIs are a good idea, but nonetheless my position is that applications MUST be able to deal with raw pixels where that makes sense. You could provide the convolution engine in the graphics system if you wanted to prevent access to the pixel data, but even then you'd have to expose the actual resolution (in pixels per unit length) in order to enable an application to come up with a suitable filter kernel.

Finally, IMHO none of this belongs at the driver level, as the hardware really doesn't care. Current hardware can't even draw properly antialiased lines (typical 3D multisampling counts are a joke for quality still graphics), so it's not like you can offload any of it to the hardware anyway.

Posted: Mon Dec 10, 2007 10:37 pm
by Brendan
Hi,
mystran wrote:
Brendan wrote:The other idea is to have a "resolution independent" video driver interface, e.g. where co-ordinates are specified as floating point values ranging from (-1, -1) to (1, 1). Then the video driver can automatically select and adjust the video mode using some sort of intelligent algorithm (e.g. based on the average time taken to generate previous frames), and/or the video mode could be directly controlled by the user, without applications, etc. knowing or caring which video mode is being used.
Come on. 120 dpi (or so) on a typical display is hardly enough to start playing the "pixels aren't important" game. Try editing bitmapped images on a TFT screen in a non-native resolution and you'll see what I mean. And unless the real world turns into vector graphics, there remain valid reasons for sampled snapshots of it (also known as 'photos').
Let's make some assumptions first....

Let's assume you've got a font engine that can generate variable-sized fonts. For example, you tell the font engine to convert the string "foo" into bitmap data that is N pixels high and M pixels wide, and it returns an N * M mask that says how transparent each pixel is. Something like the FreeType project's font engine would probably work nicely.

Let's also assume that *somewhere* you've got code to load bitmap data from a file on disk, scale the bitmap data and convert from one pixel format to another (all web browsers have code to do this).

Lastly, assume that *somewhere* you've got code to draw rectangles (including lines), and code to draw font data from the font engine.
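Purely for illustration, the assumed building blocks might have signatures along these lines (none of these names come from a real library):

Code:

/* Illustrative signatures only; all names are hypothetical. */
int  font_render(const char * str, const char * font, int w, int h,
                 unsigned char * mask);              /* w*h transparency values */
int  bitmap_load(const char * path, int w, int h,
                 void * dest, unsigned dest_pitch);  /* load + scale + convert  */
void draw_rect(void * dest, unsigned dest_pitch,
               int x0, int y0, int x1, int y1, unsigned colour);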

Now (except for the font engine, which should be a separate service IMHO) put all of this code into the video driver (instead of reproducing most of it in each application) and let applications tell the video driver what to do using scripts.

For example, a simple application might tell the video driver:
  - draw a blue rectangle from (-1, -1) to (1, -0.9)
  - draw a white rectangle from (-1, -0.9) to (1, 1)
  - draw the string "foo" from (-1, -1) to (-0.8, -0.9) using the font "bar"
  - draw the bitmap "/myApp/myIcon.bmp" from (-1, 0) to (-0.8, 0.2)
The video driver would convert the virtual coordinates into screen coordinates, draw the rectangles, ask the font engine to generate font data for "foo" and draw the resulting font data, load the bitmap from disk, scale the bitmap, convert the bitmap's pixel data and draw the bitmap.
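As a rough illustration of the shape such a script might take (the exact encoding is beside the point):

Code:

/* Rough illustration only; the exact encoding doesn't matter. */
typedef enum { CMD_RECT, CMD_TEXT, CMD_BITMAP } cmd_type;

typedef struct {
    cmd_type type;
    double   x0, y0, x1, y1;                       /* virtual coords, -1.0 .. 1.0      */
    union {
        unsigned colour;                           /* CMD_RECT                         */
        struct { const char * str, * font; } text; /* CMD_TEXT                         */
        const char * path;                         /* CMD_BITMAP, loaded by the driver */
    } u;
} cmd;

/* The example above would then simply be an array of four cmd entries. */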

For a modern video card some of this can be done by the graphics accelerator - the rectangles, and the bitmap scaling and colour conversion.

A good video driver would also cache the processed bitmap data so that it doesn't need to be loaded, scaled or converted to a different pixel format next time. Hopefully the preprocessed data would be cached in video memory so the hardware can do a "video to video" blit, instead of using the CPU to do it, and instead of caching the data in RAM (where the PCI bus is a bottleneck for "RAM to video" blits).

The same goes for the font data - maybe that can be cached in video memory and perhaps the video card's hardware can do alpha blending while it blits from video to video (most video cards can do this).

Of course even though this system is capable of handling most applications, even though (with basic hardware acceleration) it'd perform much better than a "framebuffer" interface, and even though you haven't got the same code for graphics operations duplicated in each application, it's still a very primitive interface...

What I'd want is for the video driver to accept scripts that describe a "canvas", where the script for a canvas can include other canvases, and where the video driver can cache the graphics data for any completed canvas (and the script used to generate that canvas).

For example, imagine a GUI with 3 windows open. Each application would send a script describing its main canvas (and any of the canvases included by the application's main canvas) to the GUI. The GUI would include the main canvases from these applications (and some of its own canvases for things like the task bar, etc.) into a script that describes the GUI's main canvas, and then send this large script (containing a hierarchy of canvas descriptions) to the video driver. The video driver might look at the scripts for all these canvases and determine which canvases have changed, and then get the data for unchanged canvases from its cache and only rebuild canvases that did change.
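A sketch of what a cached canvas node might carry, just to illustrate the idea (all names are made up):

Code:

/* Sketch only: each canvas keeps the script that generated it plus a
 * cached result, and is rebuilt only when its script (or a child's)
 * has changed. */
typedef struct canvas canvas;
struct canvas {
    const void * script;       /* command list describing this canvas    */
    unsigned     script_len;
    unsigned     script_hash;  /* cheap way to detect "script unchanged"  */
    canvas    ** children;     /* canvases included by this script        */
    int          nchildren;
    void       * cached;       /* rendered result, ideally in video RAM   */
};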

Of course I'd also want to add 3D to this - e.g. have "containers", and allow canvases to be mapped onto polygons in containers, and allow containers to be projected onto canvases. Then there are other graphics operations I'd want to add eventually (circles, curves, fog, lighting, shadow, etc.), and support for 3D monitors (e.g. layered LCD and stereoscopic).

After all this, I'm going to have windows that bounce back when you click on them as if they're on springs, and desktop icons that can spin in 3D space like they're dangling from thin wires attached to the top of the monitor. You'll be able to rotate a window and look at it from the side and it'll have thickness, and the raised buttons from the application will still look raised. If you close a window it might spin into the distance and disappear over a horizon (while never losing its content until it vanishes).

I want weather themes. I want to see snow fall inside the monitor and build up on top of windows and icons, and I want to see icicles form on the bottom of the windows. When I click on a window I want it to cause vibrations that shake some of the snow and icicles loose so that I can watch them fall to the bottom of the screen, bouncing off other windows and icons and knocking more snow and ice free, until all the snow and ice reaches the bottom of the screen and slowly melts away.

I want to be able to have windows and icons at strange angles everywhere and then I want to fly the mouse pointer like a glider between them. I want to skim across the surface of a status bar, pulling up at the last second before crashing into the raised window border, then quickly flipping over the window's edge and landing peacefully on the back of the window.

When I get bored I want to swap the mouse pointer for a bat and have a steel ball that bounces around and crashes into windows, making them tilt and rotate and collide with other windows. And perhaps, occasionally, a window will shatter sending shards in all directions.

I want a sun that slowly shifts from the left to the top to the right during the day. You'd never actually see the sun but you'd know where it is by the shadows from windows, icons, buttons and the mouse pointer. People would know it's time to go home from work when the mouse pointer starts casting a long shadow stretching across the desktop towards the left of the screen. At night there can be a soft ambient light and windows and the mouse pointer can have their own faint internal glow so that moving a window causes subtle changes to the shadows cast by other windows onto the desktop.

Maybe this is all just a dream, but it's what I want my video drivers to be capable of eventually.

So, what do you want your video drivers to be capable of eventually? A flat 2D framebuffer with no way to make use of hardware acceleration, no way to do any 3D graphics, and no way to handle 3D monitors? That's nice, but even the Commodore 64 added sprites to that... ;)


Cheers,

Brendan

Posted: Tue Dec 11, 2007 7:32 am
by mystran
I'd be perfectly happy with a flat 2D framebuffer, some offscreen memory, and accelerated blits and alpha blending.

3D is a separate issue.

I hold my position (which you didn't comment on) that monitor dot pitches are nowhere near sufficient for resolution independence, and hence applications need access to raw pixels anyway. Or are you going to tell the AD, "but this is the way it should be done"?

Posted: Tue Dec 11, 2007 8:39 am
by Colonel Kernel
@Brendan: I hope you're going to have some focus groups before you implement all that gratuitous eye-candy. ;)
mystran wrote:I hold my position (which you didn't comment on) that monitor dot pitches are nowhere near sufficient for resolution independence, and hence applications need access to raw pixels anyway.
That's sort of a chicken-and-egg problem. Why would manufacturers be motivated to make high-dpi monitors if software doesn't take advantage of it?

I predict that we'll start seeing super-high resolution displays around the time OS X 10.6 is released. :)

Posted: Tue Dec 11, 2007 10:17 am
by Brendan
Hi,
mystran wrote:I'd be perfectly happy with a flat 2D framebuffer, some offscreen memory, and accelerated blits and alpha blending.

3D is a separate issue.
If the application provides a bitmap in RAM, then hardware acceleration can't be used by the application to create that bitmap in RAM, and the video driver has no choice but to do a blit from RAM to display memory. The entire "application provides a bitmap in RAM" idea completely defeats the purpose of having graphics accelerators in the first place.
mystran wrote:I hold my position (which you didn't comment on) that monitor dot pitches are nowhere near sufficient for resolution independence, and hence applications need access to raw pixels anyway. Or are you going to tell the AD, "but this is the way it should be done"?
To be honest, I didn't comment on it because I don't see how it can be a problem.

Fonts are rendered without resolution independence, so you don't get small blurry blobs (unless the font is extremely small). For photo editing, people don't care about pixels. For bitmap editing, people normally zoom in until they're looking at decent-sized rectangles (where each pixel in the bitmap is at least half the size of the mouse pointer) to make the pixels easier to click.

It would suck for a low-resolution display (e.g. an old mobile phone), but for desktop/server use, if you can't at least get 640 * 480 (with 16 colours) then you should probably consider using telnet/ssh anyway... ;)


Cheers,

Brendan