graphics speed?
well, I'm trying to make a GUI, but I have a question: what does the speed of the graphics depend on? is there any speed difference between an implementation which first checks whether a byte is already equal to the one it has to write there, and a simple one which just writes the data?
thanks, eddyb.
- Combuster
Re: graphics speed?
It is a matter of latency - if you read video memory, check if a byte is set, and then write if needed, you have to wait for a full PCI cycle to complete before you can decide, and you'll have another PCI cycle afterwards (which you don't need to wait for).
Only writing halves the number of PCI cycles needed since all the reads are gone, and it causes fewer delays since you can do other things while the PCI bus is busy. The processor might even be able to collapse multiple writes into a single PCI cycle and reduce the number of cycles by an even more significant factor.
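To make the difference concrete, here is a minimal C sketch (the `framebuffer` pointer and function names are my own, not from this thread) contrasting a read-check-write loop with a write-only loop over memory-mapped video RAM:
Code: Select all
#include <stddef.h>
#include <stdint.h>

/* Slow: every pixel costs a read over the bus before (maybe) a write,
 * so the CPU stalls waiting for video memory on each iteration. */
static void fill_check_then_write(volatile uint8_t *framebuffer,
                                  size_t count, uint8_t value)
{
    for (size_t i = 0; i < count; i++) {
        if (framebuffer[i] != value)    /* read from video memory: expensive */
            framebuffer[i] = value;
    }
}

/* Fast: writes only. The CPU can post the writes and keep working, and
 * consecutive writes may be combined into larger bus transactions. */
static void fill_write_only(volatile uint8_t *framebuffer,
                            size_t count, uint8_t value)
{
    for (size_t i = 0; i < count; i++)
        framebuffer[i] = value;
}
The write-only version never stalls on a bus read, which is exactly the latency argument above.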
Re: graphics speed?
well, my pc has onboard video with shared memory, but maybe it's the same PCI thing...
right now I use qemu, and I think it's slower than real video, but the real card has no 24bpp mode and I haven't implemented 32bpp in my GUI.
well, maybe it's because of my redrawing: I draw the desktop (a big 1024*768 green rectangle) and a window to a buffer, then I copy the whole buffer to video memory.
Re: graphics speed?
I'm still working on an UBER-fast PutPixel routine for my OS/games/other projects (in asm), and I mean fast by 1993's standards - around 60 clock ticks (most of my graphics programming resources are from the 90s). I haven't coded anything for a 24-bit mode, though. I suppose you use a stosb and a stosw with some fancy bit shifting, right?
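For reference, here is what a 24bpp PutPixel looks like in C (my own sketch - the thread is talking about an asm version with stosb/stosw, but the 3-bytes-per-pixel layout is the same):
Code: Select all
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 24bpp PutPixel in C: three bytes per pixel, blue first.
 * `fb` is the start of the framebuffer, `pitch` is bytes per scan line. */
static void putpixel24(volatile uint8_t *fb, int pitch, int x, int y, uint32_t rgb)
{
    volatile uint8_t *p = fb + (size_t)y * pitch + (size_t)x * 3;
    p[0] = (uint8_t)(rgb & 0xFF);          /* blue  */
    p[1] = (uint8_t)((rgb >> 8) & 0xFF);   /* green */
    p[2] = (uint8_t)((rgb >> 16) & 0xFF);  /* red   */
}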
Re: graphics speed?
Reading video memory is always extremely slow; writing is always very fast. Checking whether a byte is equal before writing will be slower than just writing, unless you check the byte in a double buffer and then write the double buffer to video memory.
So what you typically do is keep track of changed regions using some form of bookkeeping in your kernel (in system RAM), and only write out what's changed to the video device. Look up dirty rectangles.
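A hedged sketch of that idea (all names are mine): keep a system-RAM "shadow" copy of what the screen currently shows, compare the freshly drawn back buffer against the shadow, and write only the bytes that differ - video memory is never read.
Code: Select all
#include <stddef.h>
#include <stdint.h>

/* `shadow` is a system-RAM copy of what the screen currently shows,
 * `backbuffer` is the freshly drawn frame. All comparisons happen in fast
 * system RAM; the (slow) framebuffer is only ever written, never read. */
static void flush_changed_bytes(volatile uint8_t *framebuffer,
                                uint8_t *shadow,
                                const uint8_t *backbuffer,
                                size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (shadow[i] != backbuffer[i]) {
            framebuffer[i] = backbuffer[i];   /* write-only access to VRAM */
            shadow[i]      = backbuffer[i];   /* keep the shadow in sync   */
        }
    }
}
A real implementation would coalesce runs of changed bytes into block copies rather than writing byte by byte, but the comparison always stays in fast system RAM.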
Re: graphics speed?
You should also look at other methods to speed graphics up, for example setting the MTRR for the framebuffer to write-combining; that gives me a 35% increase in FPS on all my test PCs.
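For what it's worth, here is a heavily simplified sketch of programming a variable-range MTRR for write-combining. The MSR numbers (0x200/0x201 for IA32_MTRR_PHYSBASE0/PHYSMASK0) and type 0x01 (WC) come from the Intel SDM; the choice of MTRR pair 0, the 36-bit physical address width and the omission of the SDM's full "disable caches, update, re-enable" sequence are my own simplifications, so treat it as illustration only:
Code: Select all
#include <stdint.h>

#define IA32_MTRR_PHYSBASE0 0x200u
#define IA32_MTRR_PHYSMASK0 0x201u
#define MTRR_TYPE_WC        0x01u          /* write-combining memory type */

static inline void wrmsr(uint32_t msr, uint64_t value)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);
    __asm__ volatile ("wrmsr" : : "c"(msr), "a"(lo), "d"(hi));
}

/* Mark the LFB as write-combining. `base` and `size` must be 4 KiB aligned
 * and `size` must be a power of two. Assumes a 36-bit physical address width
 * and that variable MTRR pair 0 is free; the SDM's full update sequence
 * (disable caches/MTRRs, flush, re-enable) is deliberately left out here. */
static void set_framebuffer_wc(uint64_t base, uint64_t size)
{
    uint64_t mask = (((uint64_t)1 << 36) - 1) & ~(size - 1);

    wrmsr(IA32_MTRR_PHYSBASE0, base | MTRR_TYPE_WC);
    wrmsr(IA32_MTRR_PHYSMASK0, mask | ((uint64_t)1 << 11)); /* bit 11 = valid */
}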
Re: graphics speed?
Also, you can look into the older resources (such as those TheDragon is referring to), because back then optimizing graphics was something people were awfully good at. Sure, we have faster CPUs and dedicated GPUs for all this (I assume you're not using the GPU directly for hardware acceleration though, just VESA?), but when working with graphics you want to squeeze out every little bit of performance.
shiner wrote: well, maybe it's because of my redrawing: I draw the desktop (a big 1024*768 green rectangle) and a window to a buffer, then I copy the whole buffer to video memory.
Good - using a double buffer is the popular way of handling graphics to stop tearing. However, you don't want to rewrite the entire buffer at each redraw. For your example, 16-bit 1024x768 is 1,572,864 bytes - 1.5 MB! As I said before, dirty rectangles are what I would suggest, in addition to the other advice here. Basically, you keep a list of all the regions that changed since the last redraw, and only write those regions to video memory. You *still* write to the double buffer as before (because that's in RAM and it's extremely fast to read/write), but you use this list of regions to choose what to copy to the screen.
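As a rough illustration of that bookkeeping, here is a hedged C sketch (the names, the 64-entry limit and the 16bpp/1024x768 assumption are mine):
Code: Select all
#include <stdint.h>
#include <string.h>

#define MAX_DIRTY 64

typedef struct { int x, y, w, h; } rect_t;

static rect_t dirty[MAX_DIRTY];
static int dirty_count;

/* Record a changed region (call this from your drawing routines). */
static void mark_dirty(int x, int y, int w, int h)
{
    if (dirty_count < MAX_DIRTY) {
        dirty[dirty_count++] = (rect_t){ x, y, w, h };
    } else {
        /* List full: fall back to "everything changed". */
        dirty[0] = (rect_t){ 0, 0, 1024, 768 };
        dirty_count = 1;
    }
}

/* Copy only the dirty regions from the back buffer to the framebuffer.
 * Assumes 16bpp (2 bytes per pixel); `pitch` is bytes per scan line. */
static void flush_dirty(uint8_t *framebuffer, const uint8_t *backbuffer, int pitch)
{
    for (int i = 0; i < dirty_count; i++) {
        rect_t r = dirty[i];
        for (int row = 0; row < r.h; row++) {
            size_t off = (size_t)(r.y + row) * (size_t)pitch + (size_t)r.x * 2;
            memcpy(framebuffer + off, backbuffer + off, (size_t)r.w * 2);
        }
    }
    dirty_count = 0;   /* the screen now matches the back buffer */
}
If the list overflows, falling back to one full-screen rectangle keeps the logic simple; overlapping rectangles only cost some redundant copying.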
Re: graphics speed?
I wanted simpler code, so I update everything and use 640x480x4 bpp. The worst case for everyone is full-screen video games, which change everything each frame - there's no benefit to keeping track of rectangles over updating the whole screen when it comes to games like FPSs. So I just update everything every frame. Obviously, mine will always be slower than hardware-accelerated graphics and I don't even attempt to compete with that, which is why I stick to 640x480x4 bpp.
I managed to apply other cores to the job of rendering by having a layer for each core that gets merged. If you want, you can divide the objects on the screen to be rendered by different cores. The depth buffer is not shared, though, so you kind of have to know how the image can be divided. I did a multicore flight simulator you can see a video of. To make the video, I made a task capture and store the frames to disk, so the load is higher than when running without capturing, and I used Windows Movie Maker, which has a limit of 8 frames/second. It's still a little choppy when running normally, but not as bad as in the video.
Here's the video: http://www.losethos.com/flightsim.html
The gaps are just because I was too lazy to generate panels to cover the mountain topology very thoroughly.
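Here is a hedged user-space sketch (my own code, not LoseThos') of the "one layer per core, merge afterwards" idea: each worker renders its share of the scene into a private layer with its own depth buffer, and the main thread merges the layers by keeping, for each pixel, the layer with the nearest depth:
Code: Select all
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define W 640
#define H 480
#define CORES 4

/* One layer per core: private color and depth buffers that get merged later. */
typedef struct {
    int      core;            /* which worker owns this layer          */
    uint32_t color[W * H];
    float    depth[W * H];    /* private depth buffer, not shared      */
} layer_t;

static layer_t layers[CORES];

/* Placeholder renderer: a real system would rasterize the objects assigned
 * to this core; this just fills a flat test pattern so the sketch compiles. */
static void render_objects(layer_t *l)
{
    for (int i = 0; i < W * H; i++) {
        l->color[i] = 0x00200020u * (uint32_t)(l->core + 1);
        l->depth[i] = (float)(l->core + 1);
    }
}

static void *worker(void *arg)
{
    layer_t *l = arg;
    for (int i = 0; i < W * H; i++) l->depth[i] = 1e30f;  /* "infinitely far" */
    memset(l->color, 0, sizeof l->color);
    render_objects(l);
    return NULL;
}

/* Render a frame: run the workers in parallel, then merge their layers by
 * keeping, for every pixel, the layer with the nearest depth value. */
static void render_frame(uint32_t *backbuffer)
{
    pthread_t threads[CORES];
    for (int c = 0; c < CORES; c++) {
        layers[c].core = c;
        pthread_create(&threads[c], NULL, worker, &layers[c]);
    }
    for (int c = 0; c < CORES; c++)
        pthread_join(threads[c], NULL);

    for (int i = 0; i < W * H; i++) {
        float best = 1e30f;
        uint32_t color = 0;
        for (int c = 0; c < CORES; c++) {
            if (layers[c].depth[i] < best) {
                best  = layers[c].depth[i];
                color = layers[c].color[i];
            }
        }
        backbuffer[i] = color;
    }
}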
Re: graphics speed?
I am in the process of reviving my operating system, and the means by which I've chosen to do that is by working on graphics (I enjoy that a lot).
It shouldn't be too hard to get fast graphics rendering if we take a specialised approach.
I had a post on implementing OpenGL, and I am going to get my graphics layer working as fast as possible using ALL the SPECIAL PROPERTIES PCs have to offer, e.g. MMX, SSE, 3DNow!, etc.
The aim of the graphics layer is to get 16-, 24- and 32-bit LFB video modes running at or close to 30/60 frames per second.
I'll share what I learn when I have that done; you could get ideas from that.
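As one example of what those instruction set extensions buy you, here is a hedged SSE2 sketch (my own, with the alignment and size assumptions noted in the comments) that copies a back buffer to the framebuffer with non-temporal stores, which bypass the cache in much the same spirit as write-combining:
Code: Select all
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Copies `bytes` from the RAM back buffer to the framebuffer using
 * non-temporal (streaming) stores. Assumes both pointers are 16-byte
 * aligned and `bytes` is a multiple of 16. */
static void blit_sse2(uint8_t *framebuffer, const uint8_t *backbuffer, size_t bytes)
{
    for (size_t i = 0; i < bytes; i += 16) {
        __m128i v = _mm_load_si128((const __m128i *)(backbuffer + i));
        _mm_stream_si128((__m128i *)(framebuffer + i), v);  /* bypasses the cache */
    }
    _mm_sfence();  /* make the streaming stores globally visible */
}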
Re: graphics speed?
it's terribly slow. the mouse moves at 1 px/s (one pixel per second).
here's how it works:
it has the video memory and a double buffer.
in a loop, it draws the entire desktop (a screen-sized bitmap), the windows and whatever else it has to draw (some text, the mouse cursor, etc.) on the double buffer.
then, it copies the double buffer to video memory.
how can this be optimized?
I've heard of dirty lines, but what are they?
thanks, eddyb.
- Combuster
Re: graphics speed?
Several alternatives have been mentioned in this thread and in other (recent) threads, and your proposal doesn't look like it has been googled either. Not worth an answer.
Re: graphics speed?
"Dirty lines" is the name of a technique. Basically, it works by recording which scan lines were updated in the double buffer, and only those lines are then transferred to the video memory. The idea is that if you move the mouse only, only a few lines will be updated (like 30 instead of 600 or more), thus you can update the screen faster.shiner wrote:I've heard of dirty lines, but what are they?
thanks, eddyb.
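A hedged sketch of that technique (the names and the 1024x768x32bpp assumption are mine): one flag per scan line, set by every drawing primitive for the lines it touches, and a flush that copies only the flagged lines:
Code: Select all
#include <stdint.h>
#include <string.h>

#define HEIGHT 768
#define PITCH  (1024 * 4)          /* bytes per scan line at 1024x768x32bpp */

static uint8_t line_dirty[HEIGHT]; /* one flag per scan line */

/* Call from every drawing primitive for the range of lines it touched. */
static void mark_lines_dirty(int y0, int y1)
{
    for (int y = y0; y <= y1; y++)
        if (y >= 0 && y < HEIGHT)
            line_dirty[y] = 1;
}

/* Copy only the flagged lines from the double buffer to video memory. */
static void flush_dirty_lines(uint8_t *framebuffer, const uint8_t *backbuffer)
{
    for (int y = 0; y < HEIGHT; y++) {
        if (!line_dirty[y])
            continue;                        /* unchanged: skip the slow copy */
        memcpy(framebuffer + (size_t)y * PITCH,
               backbuffer  + (size_t)y * PITCH, PITCH);
        line_dirty[y] = 0;
    }
}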
Re: graphics speed?
Hi,
shiner wrote: it's terribly slow. the mouse moves at 1 px/s (one pixel per second).
here's how it works:
it has the video memory and a double buffer.
in a loop, it draws the entire desktop (a screen-sized bitmap), the windows and whatever else it has to draw (some text, the mouse cursor, etc.) on the double buffer.
then, it copies the double buffer to video memory.
how can this be optimized?
I've heard of dirty lines, but what are they?
thanks, eddyb.
A graphics system has 3 basic states:
- Waiting for something to change
- Redrawing as little as possible (e.g. depending on what changed)
- Updating the least amount of video display memory necessary (e.g. depending on which areas were redrawn)
Eventually, you should end up with an image for the background and an image for each "thing" (e.g. window) that's in front of the background; and eventually you'll need to combine all these separate images into a single image (e.g. your double buffer). For 3D pipelines there's a technique called "scan line conversion" that would work well here. For each horizontal line, you build a sorted linked list that describes the contents of the horizontal line. For example, if the horizontal line should look like this (where '.' is the background, '1' is pixel data from the first window and '2' is pixel data from the second window):
Code: Select all
........111111112222222222.......
Then the linked list for this line would have one entry for the first piece of background, one entry for the left edge of the first window, one entry for the left edge of the second window, and one entry for the second piece of background; where each entry contains the source of the run of pixels (which bitmap they come from), the starting (x, y) position of the run of pixels within the source bitmap, and how many pixels are in the run. Once you've got the linked lists you'd use them (eventually) to update the double buffer, which completely avoids updating the same pixel (in the double buffer) more than once.
For a GUI, these linked lists don't change unless windows are moved or resized, so you only need to update them occasionally (not for every frame) and most of the time you'd reuse the previous linked list. Also, you don't need any pixel data to generate these linked lists - you only need to know where each window is (e.g. the (x, y) of the top left corner of the window and the width and height of the window). While you're generating these linked lists, you can give each window and the background a counter that keeps track of how many linked list entries refer to that window (and the background). If the counter for a window (or the background) is zero, then that window (or the background) isn't visible.
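To make the linked-list idea concrete, here is a hedged C sketch (struct and function names are my own) of one scan line's run list and of composing that line into the double buffer:
Code: Select all
#include <stddef.h>
#include <stdint.h>

/* One source image (the background or a window). */
struct bitmap {
    int width, height;
    uint32_t *pixels;            /* 32bpp pixel data */
};

/* One run of pixels on a scan line: "the next `length` pixels of this line
 * come from `source`, starting at (src_x, src_y) inside it". */
typedef struct span {
    struct bitmap *source;
    int src_x, src_y;
    int length;
    struct span *next;           /* next run on the same line, left to right */
} span_t;

/* Compose one scan line of the double buffer from its run list; every
 * destination pixel is written exactly once. */
static void compose_line(uint32_t *dest_line, const span_t *list)
{
    int x = 0;
    for (const span_t *s = list; s != NULL; s = s->next) {
        const uint32_t *src = s->source->pixels
                            + (size_t)s->src_y * (size_t)s->source->width
                            + (size_t)s->src_x;
        for (int i = 0; i < s->length; i++)
            dest_line[x++] = src[i];
    }
}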
Once you've done this, the next step is to redraw the bitmaps for the background and the bitmaps for each window, but you'd skip everything that isn't visible and recycle previous bitmaps if they didn't change.
Now you use those linked lists (and the bitmaps) to generate the double buffer. While doing this you can generate a set of "line changed" flags by comparing the old data in the double buffer with the new data you're putting in the double buffer. Finally, you copy the double buffer to video display memory, while skipping any lines that were the same as last time.
Also, it's a good idea to use these "line changed" flags much earlier. When you first start redrawing things, clear all the line changed flags. If any window is moved or resized, or if any window is updated, then set the corresponding line changed flags. Then you can skip entire lines when you're updating the double buffer, and for lines that aren't skipped you'd check if the pixel data changed and clear the corresponding line changed flag if the line didn't change.
Lastly, there's the mouse pointer. Before you copy the double buffer to display memory you copy any pixels that are underneath the mouse pointer somewhere else, then draw the mouse pointer in the double buffer (and set the corresponding "line changed" flags), then copy the double buffer to display memory, then restore the pixels under the mouse pointer. Of course if the only thing that changed is the mouse pointer, then you could skip everything else and only do this part.
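A hedged sketch of that save-under trick (the cursor size, names and 32bpp assumption are mine; clipping at the screen edges is omitted):
Code: Select all
#include <stdint.h>
#include <string.h>

#define CUR_W 16
#define CUR_H 16

static uint32_t saved_under[CUR_W * CUR_H];

/* Save the pixels under the cursor, then draw the cursor into the double
 * buffer. `pitch` is in pixels. */
static void draw_cursor(uint32_t *dbuf, int pitch, int mx, int my,
                        const uint32_t *cursor_image)
{
    for (int y = 0; y < CUR_H; y++) {
        memcpy(&saved_under[y * CUR_W], &dbuf[(my + y) * pitch + mx],
               CUR_W * sizeof(uint32_t));                  /* 1. save        */
        memcpy(&dbuf[(my + y) * pitch + mx], &cursor_image[y * CUR_W],
               CUR_W * sizeof(uint32_t));                  /* 2. draw cursor */
    }
    /* ...now copy the double buffer (or just the changed lines) to the screen... */
}

/* 3. After the copy, restore the saved pixels so the double buffer again
 * matches the scene without the cursor. */
static void undraw_cursor(uint32_t *dbuf, int pitch, int mx, int my)
{
    for (int y = 0; y < CUR_H; y++)
        memcpy(&dbuf[(my + y) * pitch + mx], &saved_under[y * CUR_W],
               CUR_W * sizeof(uint32_t));
}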
Note 1: The "line changed" flags could be something else - for e.g. you could have one flag for each horizontal line, but you could have one flag for each dword, or one flag for each 64-byte area, or something else.
Note 2: This could be extended by taking a little more from 3D pipelines, to allow windows to be rotated about the Z axis and scaled. For this you'd need to do clipping and translation, then do the scan line conversion (build those linked lists that describe the contents of each horizontal line); and you'd need to add more details to the linked list entries so you can calculate where the next pixel comes from (e.g. X and Y step). You could probably extend this further by allowing windows to be rotated about the X axis and Y axis, but this complicates things a lot because Z isn't constant anymore. There's probably other effects you could add that aren't as complicated though (e.g. transparency or alpha blending, fog/darkness).
Note 3: It'd be wise to forget about windows and think in terms of "canvases" or "bitmaps" or something. It'd be nice if most of the code to update the double buffer is also used to draw each window. For example, have an "update destination bitmap from source data" function.
Note 4: Don't forget to implement all this with hardware accelerated video in mind, so that if someone (eventually, maybe, one day) writes a video driver that supports hardware acceleration then the graphics subsystem can use the video driver's hardware acceleration. This mostly means that most of the above needs to be done by the video driver on behalf of the applications, GUI, etc; and not done by applications, GUI, etc themselves. Also, it'd be very nice if your video driver is multi-threaded, so that on a multi-CPU system you get multiple CPUs sharing all the work, especially if there's no hardware acceleration.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: graphics speed?
first, thanks to brendan
I've got an idea for an optimizing system, except it's more "dirty pixels" than dirty lines:
let's say for each widget I have a unique number, a handle.
I have a WIDTHxHEIGHT pixel-handle table, with a handle for each pixel (px-hnd-table[pixel] is the handle of the widget that the pixel belongs to).
when a new widget is created, or a widget is moved, it adjusts that px-hnd-table, and it also writes the pixels that were modified to a list.
to redraw, it just renews the pixels from the dirty pixels list in the double buffer, and then on the real screen (anyway, it seems there is no need for a double buffer now that I only change what needs changing; I think you know how, on Windows, the redrawing can sometimes be seen).
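A hedged sketch of that scheme (all names, sizes and the placeholder colour lookup are mine): a per-pixel table mapping each screen pixel to the handle of the widget that owns it, plus a list of the pixels that changed since the last redraw:
Code: Select all
#include <stdint.h>

#define WIDTH  1024
#define HEIGHT 768
#define MAX_DIRTY_PIXELS (WIDTH * HEIGHT)

static uint16_t px_hnd_table[WIDTH * HEIGHT];   /* owning widget handle per pixel */
static uint32_t dirty_list[MAX_DIRTY_PIXELS];   /* indices of changed pixels      */
static int dirty_pixel_count;

/* Claim a rectangle for a widget and remember every pixel whose owner changed. */
static void assign_rect_to_widget(uint16_t handle, int x, int y, int w, int h)
{
    for (int row = y; row < y + h; row++) {
        for (int col = x; col < x + w; col++) {
            uint32_t idx = (uint32_t)row * WIDTH + (uint32_t)col;
            if (px_hnd_table[idx] != handle) {
                px_hnd_table[idx] = handle;
                if (dirty_pixel_count < MAX_DIRTY_PIXELS)
                    dirty_list[dirty_pixel_count++] = idx;
            }
        }
    }
}

/* Placeholder: a real GUI would fetch the pixel from the owning widget's bitmap. */
static uint32_t widget_pixel_color(uint16_t handle, uint32_t idx)
{
    (void)idx;
    return 0xFF000000u | handle;
}

/* Repaint only the dirty pixels, asking each owning widget for its colour. */
static void redraw_dirty(uint32_t *screen /* 32bpp, WIDTH*HEIGHT pixels */)
{
    for (int i = 0; i < dirty_pixel_count; i++) {
        uint32_t idx = dirty_list[i];
        screen[idx] = widget_pixel_color(px_hnd_table[idx], idx);
    }
    dirty_pixel_count = 0;
}
Note that a per-pixel dirty list can easily grow larger than the regions it describes; the per-line or per-rectangle bookkeeping suggested earlier in the thread usually scales better when large areas change.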