
Re: VESA mode

Posted: Fri Sep 21, 2012 6:13 am
by freecrac
Brendan wrote:Hi,
freecrac wrote:
Brendan wrote:Writing native video drivers is a nightmare - best to leave that for "version 2" of your OS... ;)
Yes, especially if there is no documentation about how to do it.
Samples of a naked skeleton of source code for a future DOS driver exist and can be found on the web.

But where can we find deeper information about our hardware and how to use all of its functions without involving a (closed-source) driver?
You can find some information for some cards (mostly Intel and AMD), and Linux does have some source code for almost every video card (e.g. based on reverse engineering where information isn't available).
Maybe I am not good enough at understanding C code.
I have looked many times, for many hours, but I could not find information in the Linux sources about the secondary display device.
Are you sure that this information exists in the Linux sources?
However, if you're considering this then you probably fail to grasp the complexities involved. For example, implementing a modern video driver typically includes implementing some sort of "shader language" compiler for that video card.
But additionally there must be a routine to handle mode switching on both display devices.
Looking at the Windows drivers from ATI and NVIDIA in the past, one driver can be used for many card generations, not just for one card.
So I think the same mode-switching procedure probably works for many cards from the same manufacturer too, because I cannot believe that they build a totally new implementation for every new card model.
freecrac wrote:And why does that old information about bare 2D functions not exist for public use? How could it be a problem for those big manufacturers to publish this information?
I honestly don't know why companies like NVidia fail to provide any information. They say "trade secrets" but I don't believe them, and it's more likely that they're just too lazy to spend ages documenting things properly (and handling mistakes in the documentation, updates/revisions, etc).
But nobody expects them to spend a lot of time and money on that; I think if each manufacturer assigned only one staff member to write this documentation, it would not be very expensive and would not raise their running costs much.
And I think their so-called "trade secrets" are also under patent protection, so if somebody tried to sell those techniques, they would face a lawsuit to stop that product on the market. Beyond that, there is no market big enough to become a dangerous situation for the few big global players in this sector.
freecrac wrote:And if the manufacturers no longer want to provide all that old real-mode stuff, why do they put a VBE BIOS on their modern cards?
For backward compatibility; and because the BIOS requires something it can use during POST to display errors, etc to the user.
For backward compatibility we only need an older VGA BIOS. And the VBE versions are not really compatible among themselves either.
VBE 1 uses only the older, outdated mode list/mode numbers, and newer cards with a VBE 2 or VBE 3 BIOS may come with different mode numbers.
Example: these older VBE 1 mode numbers do not exist on my newest Sapphire 7950 (ATI GPU) card with a VBE 3 BIOS:
112h - 640x480 16.8M (8:8:8)
115h - 800x600 16.8M (8:8:8)
118h - 1024x768 16.8M (8:8:8)
11Bh - 1280x1024 16.8M (8:8:8)
And there are no other VBE modes with 24 bits per pixel either; there are only modes with 8, 16 and 32 bits per pixel.
The resolutions 640x480, 800x600, 1024x768 and 1280x1024 are also there, but with other mode numbers.
(Not really a problem for me, because I do not like to use 24 bits per pixel, and I cannot understand why those modes still exist today on cards with more than enough video RAM to use 32 bits per pixel at all of the card's resolutions.
24 bits per pixel was only useful a long time ago, for example with my ET4000 card with only 1 MB of video RAM, to use 640x480x24 when true-colour modes were first being programmed.)
Eventually the BIOS will die, and video card manufacturers will put "UEFI byte-code drivers" in their card's ROM instead.
I look at this with mixed feelings, but I am not familiar with it. However, I read that there is a compatibility mode for starting in 16-bit real mode.
freecrac wrote:Edit: I searched for the VBE mode list inside the BIOS starting at C000:0, but I could not find it there. Maybe the mode list is stored in a different form.
I have no idea why you'd want to do that in the first place - just use the correct VBE function correctly.
It was only idle curiosity on my part, also looking for the manufacturer's name and the model number to identify a card, thinking of a situation where somebody does not know it.
But I have never tried to scan the PCI bus to compare its contents against a (never-ending?) list of devices, or to control them individually through it.

Dirk

Re: VESA mode

Posted: Fri Sep 21, 2012 6:46 am
by Brendan
Hi,
freecrac wrote:
Brendan wrote:You can find some information for some cards (mostly Intel and AMD), and Linux does have some source code for almost every video card (e.g. based on reverse engineering where information isn't available).
Maybe I am not good enough at understanding C code.
I have looked many times, for many hours, but I could not find information in the Linux sources about the secondary display device.
Check the "src/linux/drivers/video" directory.

Be warned that video on Linux is an ugly mess. For a specific video card there isn't one "video driver" - instead the kernel has mode setting and not much else, and things like OpenGL are implemented in other places. For a reasonable overview, see this article about the Linux graphics stack.
freecrac wrote:Example: these older VBE 1 mode numbers do not exist on my newest Sapphire 7950 (ATI GPU) card with a VBE 3 BIOS:
112h - 640x480 16.8M (8:8:8)
115h - 800x600 16.8M (8:8:8)
118h - 1024x768 16.8M (8:8:8)
11Bh - 1280x1024 16.8M (8:8:8)
Nobody ever said that a video card has to support all these modes. For example, (in theory) a video card could only support one strange mode (e.g. "1234*987 with 48-bits per pixel") and nothing else, and still comply with any version of VBE.

Also, for VBE 2 and later, the "standard" mode numbers are deprecated (e.g. you can't/shouldn't assume that "mode 0x112" is 640*480; it could be anything, or nothing).


Cheers,

Brendan

Re: VESA mode

Posted: Fri Sep 21, 2012 2:17 pm
by rdos
Brendan wrote:It's also probably the wrong way to approach the problem. Normally you'd want to get a list of all video modes and filter out any modes that the OS doesn't support (and then determine the "best" mode from whatever is left); instead of asking for one specific mode and then giving up if that one mode isn't supported.
I agree. I filter out all palette modes (because they only have historical meaning) and all bank-switched modes (because you don't want to call V86 mode regularly from the video code). I only keep LFB-addressable modes with 15, 16, 24 and 32 bits per pixel.
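
A minimal sketch of that kind of filter in C, using the attribute bits and memory-model values from the VBE specification (the surrounding mode-enumeration code, and how the mode information was fetched, are assumed):

Code:

#include <stdbool.h>
#include <stdint.h>

/* VBE ModeAttributes bits (from the VBE specification) */
#define VBE_ATTR_SUPPORTED   (1u << 0)   /* mode supported by present hardware */
#define VBE_ATTR_GRAPHICS    (1u << 4)   /* graphics mode (not text) */
#define VBE_ATTR_LFB         (1u << 7)   /* linear framebuffer available */

/* MemoryModel value from the mode information block */
#define VBE_MODEL_DIRECT     6           /* direct colour (15/16/24/32 bpp) */

/* true for the modes kept above: LFB-addressable direct-colour modes with
   15, 16, 24 or 32 bits per pixel. (Some BIOSes report 15-bpp modes as
   16 bpp with a reserved bit; a fuller filter would check the mask sizes.) */
static bool keep_mode(uint16_t attributes, uint8_t memory_model, uint8_t bpp)
{
    if (!(attributes & VBE_ATTR_SUPPORTED)) return false;
    if (!(attributes & VBE_ATTR_GRAPHICS))  return false;
    if (!(attributes & VBE_ATTR_LFB))       return false;   /* no bank switching */
    if (memory_model != VBE_MODEL_DIRECT)   return false;   /* no palette modes */
    return bpp == 15 || bpp == 16 || bpp == 24 || bpp == 32;
}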

Re: VESA mode

Posted: Fri Sep 21, 2012 3:07 pm
by Brendan
Hi,
rdos wrote:
Brendan wrote:It's also probably the wrong way to approach the problem. Normally you'd want to get a list of all video modes and filter out any modes that the OS doesn't support (and then determine the "best" mode from whatever is left); instead of asking for one specific mode and then giving up if that one mode isn't supported.
I agree. I filter out all palette modes (because they only have historical meaning) and all bank-switched modes (because you don't want to call V86 mode regularly from the video code). I only keep LFB-addressable modes with 15, 16, 24 and 32 bits per pixel.
I keep 256 colour modes (but set the palette once during boot rather than dynamically), and 4 colour "planar" modes if VBE says it's "VGA compatible".

I also keep bank switched modes if everything fits in one bank, or if everything fits in 2 banks and the "display windows" can be configured as contiguous (e.g. so that two 64 KiB windows looks like one 128 KiB area).

Without LFB; with one bank this gives me 320*200 with 256 colours and (up to) 800*600 with 4 colours; and with 2 contiguous banks it can go up to 1024 * 768 with 4 colours.

Of course this is mostly intended for very old video cards that don't support LFB (e.g. "VBE 1"). ;)


Cheers,

Brendan

Re: VESA mode

Posted: Fri Sep 21, 2012 5:52 pm
by ~
Brendan wrote:I keep 256 colour modes (but set the palette once during boot rather than dynamically), and 4 colour "planar" modes if VBE says it's "VGA compatible".
Why do it that way? It looks much easier to set the palette dynamically via the registers (one of the easiest things to do on a VGA). A decent system should make it possible to know whether that is currently possible (i.e. whether the video hardware is in a legacy VGA-compatible mode).

Otherwise, wouldn't it under-use the (legacy yet still needed) graphics capability of the OS in the long run, and wouldn't it be impractical if one has only limited, non-accelerated use of VESA but very complete "native" VGA capabilities? At least Windows seems to be able to do this dynamically, entering and leaving "VGA mode" as needed.

In the long run, VESA should still be usable, at least via full x86 code emulation, to enter and leave VGA-compatible modes dynamically if the system lacks proper video drivers, shouldn't it?


Brendan wrote:I honestly don't know why companies like NVidia fail to provide any information. They say "trade secrets" but I don't believe them, and it's more likely that they're just too lazy to spend ages documenting things properly (and handling mistakes in the documentation, updates/revisions, etc).
They might "forcefully make it true", but it wouldn't be an issue if they made their internal documentation, source code and overall development fully transparent on the Internet, and then let the developer community sort out usable documentation for it and debug everything (while warning everyone that they aren't responsible for the results of using "unofficial fork versions"). Of course, it won't happen any time soon, so it isn't very helpful or practical for us to expect it, although nobody would lose anything by doing so (things including performance would tend to standardize, and normal users who don't know what any of it is wouldn't have their opinions influenced by open sourcing in a way that would dramatically alter the current market share).


freecrac wrote:Yes, especially if there is no documentation about how to do it.
Samples of a naked skeleton of source code for a future DOS driver exist and can be found on the web.

But where can we find deeper information about our hardware and how to use all of its functions without involving a (closed-source) driver?
And why does that old information about bare 2D functions not exist for public use? How could it be a problem for those big manufacturers to publish this information?
And if the manufacturers no longer want to provide all that old real-mode stuff, why do they put a VBE BIOS on their modern cards?

Edit: I searched for the VBE mode list inside the BIOS starting at C000:0, but I could not find it there. Maybe the mode list is stored in a different form.

Dirk
Just think about the fact that major, formal operating systems like Linux have been written by thousands of people, that they have required a huge amount of effort, and that their developers had to know just about every specification, standards document, algorithm and piece of know-how there is for them to work.

It means that we would need to dedicate the same amount of time and development effort, follow the same process, and read and understand just about every relevant bit of documentation, standard or not, as well as the actual source code in existence, in order to have the required knowledge and be able to offer something as universally useful (as Linux).

The more you read and understand, the closer you will get to the things you are asking about now.

From the few attempts I have made so far that resulted in real success in learning and implementing things in OS development, I can say that you must never rely primarily on guesswork.

Even if it seems more time consuming (which it actually is NOT), if you don't know enough there is no other choice than to read technical documents and manuals word by word to fill your "spare time", instead of wasting that time merely guessing or writing equally "ignorant" trial-and-error implementations, too many times in a row.

How else do you expect to learn the minute details of, for instance, how every aspect of an x86 CPU behaves and must be programmed?

Asking in a forum or searching with Google is only partially effective for mastering and actually finding out those details.

If you have time and you know a language other than English, an excellent learning technique is to translate the key technical documents into that second language. To translate something you must understand it, and in doing so you have to pay attention to details that are otherwise invisible if you aren't knowledgeable; you can learn practically everything a document contains this way in the first pass (some 10 weeks for a 1000+ page document, if you type and edit fast).


It seems to me that, from a developer's point of view, current hardware is very disposable (because you are only given the end product but not the knowledge that would allow you to perpetuate a standard platform and standard software on your own), and old hardware has much more value than current hardware. You can still learn about 2D primitives in software and the like, but of course, as I said, it would take you working full time, and no matter how knowledgeable you are, if you don't work in a big team in a real company or on something like Linux, you will only achieve a hobby-quality level.

Then it is necessary to think about how to make such a thankless lone development effort worth it, and to make sure you do it and publish it in a way that will be useful and will help ease the current software development world as it is in the wild (and make some fair profit out of it so you can continue your projects).

If you have time, and if you can provide knowledge and/or end-user software, and if you can read/translate all of those technical documents and master them, then that's all there is to it (a process that will take years if not your whole life).
__________________________________________

The last thing I can say about this is:

Development, or technological/mathematical/scientific work, isn't really hard. If you can "learn a trick", one single development trick (an algorithm, etc.), and perform it with excellence and without errors, you have made it. If it took you some 3 or 6 weeks but you now understand it, you are done.

Now, the problem is that there are millions of other tricks you need to learn next, and figuring them out and mastering them takes a lot of time, study, asking, reading, trial and error, coding, etc. Not so much individually, but when combined into a complex system, which for practical purposes is always the case.

What makes it difficult is having such a huge number of tricks to learn (millions in existence), and it would be easier if you simply had a way to make a living while having as much time as needed to work on this permanently, every day.

It is also safe to assume that nobody can master everything, so what one does must be chosen with care for it to make useful sense and allow progress.

Otherwise, it is only possible to learn the tricks required by a job, which is why it is important to find a perspective that allows you to use your OS development knowledge and extend it to other areas that are practical, that demonstrate to yourself that you are doing something important and useful with your time, and that you can usefully offer to others.

Re: VESA mode

Posted: Sat Sep 22, 2012 12:01 am
by Brendan
Hi,
~ wrote:
Brendan wrote:I keep 256 colour modes (but set the palette once during boot rather than dynamically), and 4 colour "planar" modes if VBE says it's "VGA compatible".
Why do it that way? It looks much easier to set the palette dynamically via the registers (one of the easiest things to do on a VGA). A decent system should make it possible to know whether that is currently possible (i.e. whether the video hardware is in a legacy VGA-compatible mode).
My original sentence may have been misleading - I meant "I keep 4 colour "planar" modes if VBE says it's "VGA compatible"; and 256 colour modes (but set the palette once during boot rather than dynamically)" (and not "I keep 256 colour modes (but set the palette once during boot rather than dynamically) and 4 colour "planar" modes; if VBE says it's "VGA compatible".").

Basically, for 256 colour modes I don't check if the mode is VGA compatible and don't assume that the VGA palette registers exist or behave the same as they would in a VGA card if they do exist.

Note that the VBE specification clearly says:
VBE wrote:Check if VGA Compatible Before Directly Programming the DAC
Another area of concern is programming the color palette in 256 color modes. Once again the same problem occurs when programming the palette for NonVGA controllers; the VGA palette registers no longer exist and attempting to program the palette via these registers will simply do nothing. Even worse attempting to synch to the vertical or horizontal retrace will also cause the system to get into an infinite loop.

Hence if you need to program the color palette on a NonVGA controller, you must use the supplied VBE 2.0 and above palette programming routines rather than programming the palette directly. Make sure you check the NonVGA attribute bit as discussed above, and if a NonVGA mode is detected you will have to program the palette via the standard VBE 2.0 and above services.
Because I support "NonVGA controllers" (and because I don't check the "VGA compatible" flag and treat VGA compatible 256 colour modes differently to NonVGA compatible 256 colour modes), I can't touch any of the VGA registers (including the DAC registers).
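
For reference, the "supplied VBE palette programming routine" the spec refers to is function 4F09h. A rough sketch of calling it, hedged: the bios_int10() helper is hypothetical and stands for whatever mechanism issues a real-mode INT 10h (e.g. during boot, before leaving real mode), and the palette table must sit in memory reachable through a real-mode ES:DI pair, in the entry format given in the VBE spec:

Code:

#include <stdint.h>

/* Hypothetical helper: issues a real-mode INT 10h with these register
   values and returns AX. */
uint16_t bios_int10(uint16_t ax, uint16_t bx, uint16_t cx, uint16_t dx,
                    uint16_t es, uint16_t di);

/* Set 'count' palette entries starting at index 'first' using VBE function
   4F09h (BL = 00h, "set palette data"); ES:DI points to the palette table. */
int vbe_set_palette(uint16_t first, uint16_t count, uint16_t es, uint16_t di)
{
    uint16_t ax = bios_int10(0x4F09, 0x0000, count, first, es, di);
    return (ax == 0x004F) ? 0 : -1;    /* AX = 004Fh means success */
}
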
~ wrote:Otherwise, wouldn't it under-use the (legacy yet still needed) graphics capability of the OS in the long run, and wouldn't it be impractical if one has only limited, non-accelerated use of VESA but very complete "native" VGA capabilities? At least Windows seems to be able to do this dynamically, entering and leaving "VGA mode" as needed.
Windows probably uses virtual8086 mode (or an emulator in long mode), and/or the protected mode VBE interface (if available), to access the VBE functions correctly. I have no intention of supporting virtual8086 mode or implementing an emulator, or supporting either of the protected mode VBE interfaces.


Cheers,

Brendan

Re: VESA mode

Posted: Mon Sep 24, 2012 8:18 am
by Dhaann
I tried a lot of things I found on the internet for actually drawing things with VESA/VBE, but it's very difficult... does anyone know a beginner guide for drawing (and also for setting a good video mode)? I just can't manage to find something like that.

Re: VESA mode

Posted: Mon Sep 24, 2012 8:44 am
by Combuster
For beginners, there are two methods:
1: Use DOS
2: Be a real noob and steal code :wink:

Re: VESA mode

Posted: Wed Sep 26, 2012 7:45 am
by freecrac
Dhaann wrote:I tried a lot of things I found on the internet for actually drawing things with VESA/VBE, but it's very difficult... does anyone know a beginner guide for drawing (and also for setting a good video mode)? I just can't manage to find something like that.
If you boot an older DOS, or if you boot your own OS, then it is possible to switch into a VESA/VBE graphics mode, possibly with a high resolution.
For example, with my newest card, a Sapphire 7950 (VBE 3 BIOS; ATI GPU), and with my older Colorfull GTX295 (VBE 3, NVIDIA), I can use the native widescreen resolution of 1920x1200@60 of my 28" LCD from HansG (16:10 aspect ratio) with 32 bits per pixel.

With my older Geforce 4 TI4200 (VBE 3, AGP x4, 64 MB) and two 19" CRT monitors from Samsung and Samtron, capable of 96 kHz/160 Hz and with a preferred resolution of 1280x1024@85 Hz (according to their EDID), I can use a resolution of 1024x768x32 at a 100 Hz refresh rate. (A little DOS demo with source code written for MASM 5 can be downloaded from my homepage at http://www.alice-dsl.net/freecracmaps/Tool/Neutrip.zip)

In the first step we have to get the VBE controller information with function 4F00h into a 512-byte buffer.
If no error occurs, our buffer will be filled with the requested information. Then we can check the major version number of the VBE BIOS.
If we find version 2 or version 3, we can use the BIOS's pointer to the mode list to get each mode number.
(Each number is stored in a word and the end of the list is marked with the number FFFFh.)

In the next step we have to use function 4F01h to get more mode-specific information (for each mode number from the step above) into a second buffer of 256 bytes.
Inside this buffer we can find information about the resolution of the mode and how many bits per pixel it uses;
we can check whether we can use the linear framebuffer (LFB) and whether its address is not 0;
and (VBE 3 only) we can check whether hardware triple buffering is available and/or whether we can supply our own CRT parameter table when we switch into this specific mode.

Hint for the address calculation:
part of the scanline may be outside of the visible view, so the scanline can be longer than the width of the horizontal resolution.
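
A rough sketch of those two buffers and of the address hint in C (the field layouts are taken from the VBE 3.0 specification; how the INT 10h calls are actually issued - real mode, virtual 8086 mode, or an emulator - is left open):

Code:

#include <stdint.h>

#pragma pack(push, 1)
struct vbe_info_block {            /* filled by AX=4F00h, ES:DI -> 512-byte buffer */
    char     signature[4];         /* "VESA" */
    uint16_t version;              /* e.g. 0x0300 for VBE 3 */
    uint32_t oem_string_ptr;
    uint32_t capabilities;
    uint32_t video_mode_ptr;       /* real-mode seg:off of the mode list (words, ends with 0FFFFh) */
    uint16_t total_memory;         /* in 64 KiB units */
    uint8_t  reserved[492];
};

struct mode_info_block {           /* filled by AX=4F01h, CX=mode, ES:DI -> 256-byte buffer */
    uint16_t mode_attributes;      /* bit 7 set = linear framebuffer available */
    uint8_t  window_fields[14];
    uint16_t bytes_per_scanline;   /* the "pitch" from the hint above */
    uint16_t x_resolution;
    uint16_t y_resolution;
    uint8_t  x_char_size, y_char_size;
    uint8_t  number_of_planes;
    uint8_t  bits_per_pixel;
    uint8_t  other_fields[14];
    uint32_t phys_base_ptr;        /* physical address of the LFB (0 = none) */
    uint8_t  reserved[212];
};
#pragma pack(pop)

/* Byte offset of pixel (x, y) inside the LFB: the scanline may be longer
   than x_resolution * bytes per pixel, so always use bytes_per_scanline. */
static uint32_t pixel_offset(const struct mode_info_block *mi, int x, int y)
{
    return (uint32_t)y * mi->bytes_per_scanline
         + (uint32_t)x * (mi->bits_per_pixel / 8);
}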

For more information, search for vbe3.pdf or download it free of charge from the public section of http://www.vesa.org/ (registration/login required).
For more specific problems, ask again and try to describe exactly where your problem is.

Dirk

Re: VESA mode

Posted: Wed Sep 26, 2012 1:03 pm
by Brendan
Hi,
Dhaann wrote:I tried a lot of things I found on the internet for actually drawing things with VESA/VBE, but it's very difficult... does anyone know a beginner guide for drawing (and also for setting a good video mode)? I just can't manage to find something like that.
VBE has a "get mode information" function. This will tell you the address of the LFB, the number of bytes between rows of pixels, the horizontal and vertical resolution, and the format of pixels (bits per pixel, and which pixels are for red, green and blue).

I'd use the information about the format of pixels to filter out strange stuff, so that you only have to deal with 5 or 6 different pixel formats:
  • 4-bpp planar (only if VBE says it is "VGA compatible"; where horizontal resolution is a multiple of 32)
  • 8-bpp with palette (where horizontal resolution is a multiple of 4)
  • 15-bpp "5R:5G:5B" (where horizontal resolution is a multiple of 2)
  • 16-bpp "5R:6G:5B" (where horizontal resolution is a multiple of 2)
  • 24-bpp "8R:8G:8B" (where horizontal resolution is a multiple of 4)
  • 32-bpp "8R:8G:8B"
Then I'd use the horizontal and vertical resolution of the selected video mode to allocate a "32 bits per pixel" buffer in RAM (you'd need "horizontal * vertical * 4 bytes"). I'd do all drawing in this buffer in RAM. This means you only need to care about the video mode's pixel format when you blit data from your buffer in RAM to display memory.

For 32-bpp modes you'd just copy one line at a time to the correct place in display memory, as the video mode uses the same pixel format as your buffer and everything will be aligned.
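
A sketch of that per-line copy in C (variable names are illustrative; `lfb` is the mapped linear framebuffer and `pitch` is the mode's bytes-per-scanline value):

Code:

#include <stdint.h>
#include <string.h>

/* Copy a 32-bpp shadow buffer to a 32-bpp mode, one visible line at a time,
   so the (possibly wider) invisible part of each scanline is never written. */
void blit32(uint8_t *lfb, uint32_t pitch,
            const uint32_t *shadow, int width, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(lfb + (size_t)y * pitch,
               shadow + (size_t)y * width,
               (size_t)width * 4);
    }
}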

For 24-bpp modes a pixel is 3 bytes, which causes alignment problems. To fix that you read 4 pixels from your buffer (12 bytes) and juggle the data a little (using bit shifts), then write those 4 pixels as 3 aligned 32-bit writes. Other than that it's fairly easy.
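
A sketch of that 4-pixel juggle, assuming the RAM buffer holds pixels as 0x00RRGGBB and the mode stores the blue byte first in each 3-byte pixel (little-endian "8R:8G:8B"):

Code:

#include <stdint.h>

/* Convert 4 pixels (0x00RRGGBB each) into 12 bytes of packed 24-bpp data,
   written as three aligned 32-bit stores. */
static void pack4_to_24bpp(volatile uint32_t *dst, const uint32_t *src)
{
    uint32_t s0 = src[0], s1 = src[1], s2 = src[2], s3 = src[3];

    dst[0] = (s0 & 0x00FFFFFF) | (s1 << 24);                       /* B0 G0 R0 B1 */
    dst[1] = ((s1 >> 8) & 0x0000FFFF) | ((s2 & 0x0000FFFF) << 16); /* G1 R1 B2 G2 */
    dst[2] = ((s2 >> 16) & 0x000000FF) | ((s3 & 0x00FFFFFF) << 8); /* R2 B3 G3 R3 */
}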

For 16-bpp modes you need to do a bit more bit twiddling. You'd read 2 pixels from RAM; then use bit shifts to discard the lowest 3 bits of red and blue and discard the lowest 2 bits of green; then use more bit shifts to combine the red, green and blue from both pixels into a single 32-bit value and do an aligned 32-bit write to display memory.
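
A sketch of the 5:6:5 packing for one pair of pixels (same 0x00RRGGBB buffer assumption; the first pixel ends up in the low half of the 32-bit write):

Code:

#include <stdint.h>

/* Convert one 0x00RRGGBB pixel to 5R:6G:5B. */
static inline uint16_t to_565(uint32_t p)
{
    uint16_t r = (p >> 19) & 0x1F;   /* keep the top 5 bits of red   */
    uint16_t g = (p >> 10) & 0x3F;   /* keep the top 6 bits of green */
    uint16_t b = (p >> 3)  & 0x1F;   /* keep the top 5 bits of blue  */
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Two pixels per aligned 32-bit write to display memory. */
static void pack2_to_16bpp(volatile uint32_t *dst, const uint32_t *src)
{
    *dst = (uint32_t)to_565(src[0]) | ((uint32_t)to_565(src[1]) << 16);
}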

For 15-bpp modes it's basically the same as 16-bpp, except you discard the lowest 3-bits of green and the bit shifting is slightly different.

For 8-bpp modes, how it's done depends on what you did with the palette. I normally set the palette to "2-bits of red, 3-bits of green and 2-bits of blue". In this case you can read 4 pixels, discard the lowest bits of the red, green and blue, then write a 32-bit value to display memory (containing a byte for each of the 4 pixels). If you're dynamically reprogramming the palette, then you'd have to find the 256 "best" colours first (this is left as an exercise for the reader!).
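
A sketch of the fixed "2-bits red, 3-bits green, 2-bits blue" case; the exact bit layout of the index is an assumption here, and the palette set during boot must of course use the same layout:

Code:

#include <stdint.h>

/* Map one 0x00RRGGBB pixel to a 7-bit palette index
   (assumed layout: red in bits 6-5, green in bits 4-2, blue in bits 1-0). */
static inline uint8_t to_232(uint32_t p)
{
    uint8_t r = (p >> 22) & 0x03;    /* top 2 bits of red   */
    uint8_t g = (p >> 13) & 0x07;    /* top 3 bits of green */
    uint8_t b = (p >> 6)  & 0x03;    /* top 2 bits of blue  */
    return (uint8_t)((r << 5) | (g << 2) | b);
}

/* Four pixels per aligned 32-bit write (little-endian: first pixel = low byte). */
static void pack4_to_8bpp(volatile uint32_t *dst, const uint32_t *src)
{
    *dst = (uint32_t)to_232(src[0])
         | ((uint32_t)to_232(src[1]) << 8)
         | ((uint32_t)to_232(src[2]) << 16)
         | ((uint32_t)to_232(src[3]) << 24);
}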

For 4-bpp planar modes, display memory is split into 4 planes, where each plane has one bit of each pixel. The first step is to convert the 32-bpp data into "1 bit per pixel, per plane" format. Create 4 buffers in RAM (one for each plane), then, for each pixel find the closest 4-bpp colour (search a 16-entry table for the best match) and put each bit of the result into the correct place in each of the 4 buffers in RAM. Once this is done the rest is simple: for each plane, select the plane using VGA registers, then copy the plane's data one row at a time using 32-bit reads/writes ("rep movsd").
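
A rough sketch of the conversion and the plane selection (nearest_vga16() and outb() are assumed helpers; the Map Mask register at ports 3C4h/3C5h, index 02h, is the standard VGA way to select which planes a write goes to):

Code:

#include <stdint.h>
#include <string.h>

/* Assumed helpers: nearest of the 16 VGA colours for a 0x00RRGGBB pixel,
   and a port-write primitive. */
uint8_t nearest_vga16(uint32_t pixel);
void    outb(uint16_t port, uint8_t value);

/* Convert one row of 32-bpp pixels into 4 "1 bit per pixel" plane buffers.
   'width' must be a multiple of 8; plane[p] needs width/8 bytes. */
static void row_to_planes(const uint32_t *src, int width, uint8_t *plane[4])
{
    for (int p = 0; p < 4; p++)
        memset(plane[p], 0, (size_t)width / 8);

    for (int x = 0; x < width; x++) {
        uint8_t c   = nearest_vga16(src[x]);   /* 4-bit colour index */
        uint8_t bit = 0x80 >> (x & 7);         /* MSB = leftmost pixel in the byte */
        for (int p = 0; p < 4; p++)
            if (c & (1 << p))
                plane[p][x / 8] |= bit;
    }
}

/* Select a plane for writing via the VGA sequencer Map Mask register. */
static void select_plane(int p)
{
    outb(0x3C4, 0x02);                 /* sequencer index: Map Mask */
    outb(0x3C5, (uint8_t)(1 << p));
}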

Once all of the above is working you can start thinking about more advanced stuff, like avoiding writing data to display memory that didn't change, using multiple threads/CPUs to speed it up more (e.g. 4 CPUs doing a quarter of the data each), using MMX/SSE/AVX, dynamically reprogramming the palette in 8-bpp modes, dithering, hue shifting for colour blind people, etc. You can also consider changing the format of pixels in your buffer in RAM to something else (e.g. HDR support, colour space independence, etc).

Drawing things into the buffer in RAM should be easy. I'd start with a routine to draw a rectangle like "do_rectangle(uint32_t colour, int x, int y, int height, int width)", which is little more than a "for each line { rep stosd; }" loop.
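
For example, a sketch of that routine against the 32-bpp buffer in RAM described above (the buffer and its width in pixels are passed in explicitly to keep it self-contained; the inner loop is what "rep stosd" would do):

Code:

#include <stddef.h>
#include <stdint.h>

/* Fill a rectangle in the 32-bpp buffer in RAM. 'buf_width' is the
   horizontal resolution used when the buffer was allocated. */
void do_rectangle(uint32_t *buffer, int buf_width,
                  uint32_t colour, int x, int y, int height, int width)
{
    for (int row = y; row < y + height; row++) {
        uint32_t *line = buffer + (size_t)row * buf_width + x;
        for (int col = 0; col < width; col++)
            line[col] = colour;
    }
}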


Cheers,

Brendan

Re: VESA mode

Posted: Thu Sep 27, 2012 3:42 am
by freecrac
Brendan wrote:For 32-bpp modes you'd just copy one line at a time to the correct place in display memory, ...
And what happens if we copy more than one line at a time from our RAM to the display memory?

I have read that if we use AGP cards, then we only get PCI speed if no AGP acceleration, AGP chipset cache or write combining (WC) is in use.
Do we have to know which mainboard chipset we are using in order to enable full AGP speed, or is there an easier default way for most chipsets?

And what about PCIe cards, are they also limited to PCI speed after booting DOS, or after booting our own OS?

Dirk

Re: VESA mode

Posted: Thu Sep 27, 2012 3:49 am
by Combuster
Does AGP count as a lack of searching effort?

EDIT: reading back the rest of the thread makes me think OS development is not yet a subject for you... I suggest you go read the required knowledge rule and either tackle something less difficult, or if you do meet those prerequisites, try not to show the current amount of laziness on your side.

Re: VESA mode

Posted: Thu Sep 27, 2012 5:55 am
by Brendan
Hi,
freecrac wrote:
Brendan wrote:For 32-bpp modes you'd just copy one line at a time to the correct place in display memory, ...
And what happens if we copy more than one line at a time from our RAM to the display memory?
For some video modes, display memory might look like this, where 'X' is visible and '-' is not:

Code:

XXXXXXXX----
XXXXXXXX----
XXXXXXXX----
XXXXXXXX----
For example, you might have an 800 * 600 mode where display memory is arranged as 1024 * 600 (and the extra 224 pixels on the right are never seen).

If you write more than one line with "rep movsd" or "memcpy()" then you write data to the invisible portion of display memory, which is a waste of time/bandwidth. You could implement a special case (e.g. do "if(bytes_per_line == horizontal_resolution * bytes_per_pixel) { ... } else { ... }"), but this is a waste of time as the bottleneck will be PCI bus bandwidth and not CPU speed. Basically, the special case won't improve performance and will make code maintenance worse.
freecrac wrote:I have read that if we use AGP cards, then we only get PCI speed if no AGP acceleration, AGP chipset cache or write combining (WC) is in use.
Do we have to know which mainboard chipset we are using in order to enable full AGP speed, or is there an easier default way for most chipsets?

And what about PCIe cards, are they also limited to PCI speed after booting DOS, or after booting our own OS?
Doing things like tracking which pixels haven't changed (and avoiding unnecessary processing and/or unnecessary writes), buffering in RAM and ensuring writes are aligned and "large" (e.g. write dwords or qwords rather than writing individual bytes), and general optimisations in the drawing and blitting code can give you a huge speed up. For a simple example, I've seen people implement a "draw_rectangle()" function as a "for each row { for each column { put_pixel(colour, x, y); } }" loop that calculates the address of every individual pixel and writes (with little bytes/words) directly to display memory; and I can guarantee that (for a typical GUI type scenario) code like this can be done several hundred times faster (without using any hardware acceleration).

Messing about with things like AGP speeds and write combining can give a small speed up. Maybe up to 8 times faster for AGP, if you can still find a rare/old AGP card and chipset that actually does support it; and maybe up to 5% faster for write-combining if the code that does blitting is reasonable. This is nearly insignificant compared to the huge speed ups.

I guess what I'm trying to say is that you shouldn't bother wasting time on AGP and write-combining until "version 3" of your OS. Worry about those huge speed ups for "version 1" and "version 2".


Cheers,

Brendan

Re: VESA mode

Posted: Fri Sep 28, 2012 1:06 am
by rdos
Brendan wrote:I'd use the information about the format of pixels to filter out strange stuff, so that you only have to deal with 5 or 6 different pixel formats:
  • 4-bpp planar (only if VBE says it is "VGA compatible"; where horizontal resolution is a multiple of 32)
  • 8-bpp with palette (where horizontal resolution is a multiple of 4)
  • 15-bpp "5R:5G:5B" (where horizontal resolution is a multiple of 2)
  • 16-bpp "5R:6G:5B" (where horizontal resolution is a multiple of 2)
  • 24-bpp "8R:8G:8B" (where horizontal resolution is a multiple of 4)
  • 32-bpp "8R:8G:8B"
Then I'd use the horizontal and vertical resolution of the selected video mode to allocate a "32 bits per pixel" buffer in RAM (you'd need "horizontal * vertical * 4 bytes"). I'd do all drawing in this buffer in RAM. This means you only need to care about the video mode's pixel format when you blit data from your buffer in RAM to display memory.
That's not the way I do it. I have all my graphics code in several variants. Currently I have these:
  • 1-bpp (monochrome bitmap)
  • 15-bpp "5R:5G:5B" (where horizontal resolution is a multiple of 2)
  • 24-bpp "8R:8G:8B" (where horizontal resolution is a multiple of 4)
  • 32-bpp "8R:8G:8B"
These operate on a memory bitmap object. A method table is used to call the procedure relevant for the current bitmap organization. When the destination is LFB, I just copy over the affected bytes from the bitmap after doing the operation on the memory bitmap buffer. That means there are no conversion issues when displaying things.
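
A minimal sketch of that kind of method table in C (the names are illustrative, not rdos's actual code):

Code:

#include <stdint.h>

struct bitmap;

/* One set of operations per pixel organization (1-bpp, 15-bpp, 24-bpp, 32-bpp). */
struct bitmap_ops {
    void (*set_pixel)(struct bitmap *bm, int x, int y, uint32_t colour);
    void (*fill_rect)(struct bitmap *bm, int x, int y, int w, int h, uint32_t colour);
    void (*blit_to_lfb)(struct bitmap *bm, uint8_t *lfb, uint32_t pitch);
};

/* The memory bitmap object the operations work on. */
struct bitmap {
    const struct bitmap_ops *ops;   /* variant matching the bitmap organization */
    void *pixels;
    int   width, height;
};

/* Callers never care which variant they have: */
static inline void fill_rect(struct bitmap *bm, int x, int y, int w, int h, uint32_t c)
{
    bm->ops->fill_rect(bm, x, y, w, h, c);
}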

Re: VESA mode

Posted: Sat Sep 29, 2012 6:40 am
by freecrac
Brendan wrote:Hi,
freecrac wrote:
Brendan wrote:For 32-bpp modes you'd just copy one line at a time to the correct place in display memory, ...
And what happens if we copy more than one line at a time from our RAM to the display memory?
For some video modes, display memory might look like this, where 'X' is visible and '-' is not:

Code:

XXXXXXXX----
XXXXXXXX----
XXXXXXXX----
XXXXXXXX----
For example, you might have an 800 * 600 mode where display memory is arranged as 1024 * 600 (and the extra 224 pixels on the right are never seen).

If you write more than one line with "rep movsd" or "memcpy()" then you write data to the invisible portion of display memory, which is a waste of time/bandwidth. You could implement a special case (e.g. do "if(bytes_per_line == horizontal_resolution * bytes_per_pixel) { ... } else { ... }"), but this is a waste of time as the bottleneck will be PCI bus bandwidth and not CPU speed. Basically, the special case won't improve performance and will make code maintenance worse.
Yes, I know that the scanline can be longer than the width of the horizontal resolution. (I also mentioned it in one of my postings above. Posted: Wed Sep 26, 2012 2:45 pm)

But by writing more than one line at a time I mean using more than one write instruction in the inner loop, not writing across the invisible area with only one write instruction.
Example for writing two lines (with two nested loops), using an 8 bits per pixel mode with a resolution of 800x600 and a scanline of 1024 bytes:

Code:

mov al, Color               ; the 8-bpp colour value to fill with
mov edi, LFB_Address        ; start address of the linear framebuffer
mov ebx, Scanline           ; bytes per scanline (1024 here)
mov edx, Scanline-XRes      ; invisible rest of one line (224 bytes here)
mov esi, LFB_Address
add esi, Scanline*YRes      ; end address of the visible area
P1:
mov ecx, XRes
P2:
mov [edi], al               ; first write instruction: pixel on the odd line (1,3,5,7...)
mov [edi+ebx], al           ; second write instruction: pixel on the even line below it (2,4,6,8...)
lea edi, [edi+1]            ; increase the address
dec ecx
jnz P2                      ; repeat until both lines are filled
lea edi, [edi+edx]          ; skip the invisible rest of the first line
lea edi, [edi+ebx]          ; skip the already-filled second line
cmp edi, esi
jb P1
Note: the example above is only meant to show how to use two write instructions to set two pixels on two different lines within one inner loop, drawing two lines at a time.
It is not an example for clearing the screen or similar. (And I am not sure whether it is sensible to use more than one write instruction per iteration for writing to the framebuffer, or for copying content from a buffer in RAM to it.)
In the same way it is also possible to write more than two lines at a time by using more than two write instructions in the inner loop.
freecrac wrote:I have read that if we use AGP cards, then we only get PCI speed if no AGP acceleration, AGP chipset cache or write combining (WC) is in use.
Do we have to know which mainboard chipset we are using in order to enable full AGP speed, or is there an easier default way for most chipsets?

And what about PCIe cards, are they also limited to PCI speed after booting DOS, or after booting our own OS?
Doing things like tracking which pixels haven't changed (and avoiding unnecessary processing and/or unnecessary writes), buffering in RAM and ensuring writes are aligned and "large" (e.g. write dwords or qwords rather than writing individual bytes), and general optimisations in the drawing and blitting code can give you a huge speed up. For a simple example, I've seen people implement a "draw_rectangle()" function as a "for each row { for each column { put_pixel(colour, x, y); } }" loop that calculates the address of every individual pixel and writes (with little bytes/words) directly to display memory; and I can guarantee that (for a typical GUI type scenario) code like this can be done several hundred times faster (without using any hardware acceleration).
Yes, I like to use buffering in RAM for drawing and for calculating the screen content, and then copy it to the framebuffer in one go when all the calculations are done.
Reading the contents of the framebuffer is especially slow, so we should use a buffer in RAM for that instead, to speed up our routines.
At first I tried it without a buffer in RAM, reading the content of the framebuffer directly.
For example, I wrote a routine to cross-fade one picture into another by moving their colour components towards each other in several blending steps, each step calculating a few percent of the colour components of the target picture.
Messing about with things like AGP speeds and write combining can give a small speed up. Maybe up to 8 times faster for AGP, if you can still find a rare/old AGP card and chipset that actually does support it; and maybe up to 5% faster for write-combining if the code that does blitting is reasonable. This is nearly insignificant compared to the huge speed ups.

I guess what I'm trying to say is that you shouldn't bother wasting time on AGP and write-combining until "version 3" of your OS. Worry about those huge speed ups for "version 1" and "version 2".


Cheers,

Brendan
OK, maybe 8 times faster is only an insignificant small speed up.

Dirk