Re: VESA standart PMode programming

Posted: Wed Mar 07, 2018 4:32 pm
by Brendan
Hi,
Korona wrote:
Brendan wrote:For my Windows machine (Haswell, HD4600) it's 1 GiB - far too large for "a few frame buffers and not much else". I'd assume that part of the reason for this is that a lot of games are optimised for discrete graphics (where there's a fixed amount of video RAM and higher latency/lower bandwidth for system memory accesses and no point using "shared between CPU and GPU" memory).
But you got that value from Windows or from the memory map and/or the BIOS?
I got that value from the BIOS. The settings the BIOS allows are 32 MiB, 64 MiB, 96 MiB, 128 MiB, 192 MiB, 224 MiB, 256 MiB, 320 MiB, 384 MiB, 448 MiB, 512 MiB, 1024 MiB; and it was set to the maximum (1024 MiB).
Korona wrote:I do not think 1 GiB is supported if I read the docs correctly. The value should be in bits 3:7 of PCI register 0x50 on PCI bus 0, device 2, function 0. The value is encoded by some lookup table. I'm not copying the whole table here but the documents mention that 160 MiB (corresponding to value 5) is default and 512 MiB (corresponding to value 16) is max.
I took a look at the manual ("Chapter 12: PCIE Configuration Registers (Haswell)" from 12/18/2013), which (for PCI register 0x50 on PCI bus 0, device 2, function 0) says what each field of the register is for (e.g. "bits 3:7 = GSM/Graphics Mode Select: This field is used to select the amount of main memory that is pre-allocated to support the Internal Graphics Device....."), but doesn't provide any information on any of the values. I couldn't find any useful information for this register/field anywhere else in the manuals either.
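
For anyone who wants to check the field on their own machine, here is a minimal sketch (in Python, just to show the arithmetic) of how the register could be located and how bits 3:7 could be extracted. It assumes the legacy PCI configuration mechanism via I/O ports 0xCF8/0xCFC; the function names are made up for illustration, and the port I/O itself is left out because it has to be done from kernel code.

```python
def pci_config_address(bus: int, device: int, function: int, offset: int) -> int:
    """Build the 32-bit value written to port 0xCF8 to select a config register."""
    return (0x80000000            # enable bit
            | (bus << 16)
            | (device << 11)
            | (function << 8)
            | (offset & 0xFC))    # dword-aligned register offset


def graphics_mode_select(register_value: int) -> int:
    """Extract bits 3:7 (a 5-bit field) from the value read from register 0x50."""
    return (register_value >> 3) & 0x1F


# Bus 0, device 2, function 0, register 0x50 (the location Korona quoted).
address = pci_config_address(0, 2, 0, 0x50)
```

The value read back from port 0xCFC after writing `address` to port 0xCF8 would then be passed to `graphics_mode_select()`; what the resulting number means still depends on the per-generation lookup table.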

Then I started looking at the manuals for other generations. The first one I found that actually had useful information (the fourth one I looked at) was "Intel OpenSource HD Graphics Programmer’s Reference Manual (PRM) Volume 3 Part 2: PCI Registers (Ivy Bridge); For the 2012 Intel Core™ Processor Family" where the table has duplicate values with conflicting information (e.g. "5h = 32 MiB" at the top of the table, and then another "5h = 160 MiB" near the middle of the table). It looks like maybe the upper half of the table is an older draft that should've been deleted but wasn't and the bottom half of the table is correct. With that assumption, "10h = 512 MiB" and all higher values were reserved; so I'm thinking that for Haswell (the successor to Ivy Bridge) they might have defined a value that was reserved in the previous generation (e.g. "11h = reserved for Ivy Bridge but 1024 MiB for Haswell").

I also found a much newer manual from 2015 for "Iris Pro" which uses different registers, but has a similar "bits 15:8 = GSM: This field is used to select the amount of main memory that is pre-allocated to support the Internal Graphics Device....." field where the table includes "20h = 1024 MiB", "30h = 1536 MiB" and "3Fh = 2016 MiB". This at least suggests that the amount of memory that can be pre-allocated in Intel's integrated video might be increasing with each generation. Note: at first glance, "2016 MiB" looked like a typo to me, but then I realised it fit an early "amount of memory in 32 MiB chunks" theory I had (e.g. "3Fh * 32 MiB = 63 * 32 MiB = 2016 MiB").
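
The "32 MiB chunks" theory is easy to check against the values quoted in this thread. The sketch below is only that check: the 32 MiB granularity is an assumption (the theory, not a documented rule), and the names are made up for illustration.

```python
CHUNK_MIB = 32  # assumed granularity; a theory from this thread, not a documented rule


def preallocated_mib(field_value: int) -> int:
    """Size in MiB if the field simply counts 32 MiB chunks."""
    return field_value * CHUNK_MIB


# Values quoted from the manuals in this thread: field value -> documented MiB.
quoted = {0x05: 160, 0x10: 512, 0x20: 1024, 0x30: 1536, 0x3F: 2016}

# Collect any quoted entry that the theory fails to reproduce.
mismatches = {value: mib for value, mib in quoted.items()
              if preallocated_mib(value) != mib}
```

All five quoted values fit, so `mismatches` ends up empty. The "5h = 32 MiB" entry from the top half of the Ivy Bridge table would not fit, which is consistent with that half being a stale draft.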

In any case; after wandering randomly through manuals the only thing I can say without any doubt is that Intel's manuals are horrible/incomplete. ;)


Cheers,

Brendan

Re: VESA standart PMode programming

Posted: Thu Mar 08, 2018 6:54 pm
by MrLolthe1st
So, what about graphics performance? For 1024x768x3 (e.g. 10 operations per pixel) it's about 30 million operations per frame, about 8-9 frames per second, maybe better? How to draw without the CPU?

Re: VESA standart PMode programming

Posted: Thu Mar 08, 2018 8:13 pm
by Brendan
Hi,
MrLolthe1st wrote:So, what about graphics performance? For 1024x768x3 (e.g. 10 operations per pixel) it's about 30 million operations per frame, about 8-9 frames per second, maybe better?
Recommended practice is to:
  • Get a list of video modes that the video card supports from wherever you can (VBE, UEFI UGA/GOP, native video driver, ...)
  • Filter that list to remove any video modes that the OS itself doesn't support
  • Filter that list to remove any video modes that the monitor doesn't support
  • Calculate some kind of rating or score for each remaining video mode
  • Choose the video mode with the best rating or score
The code to calculate a rating/score for a video mode can take many things into account - e.g. how well it matches the end user's preferences, an estimate of how much the video card and monitor might like the video mode (e.g. if it's the monitor's native resolution, or a multiple of that, to avoid/reduce scaling), how much RAM the computer has (and how much RAM might be consumed by software that generates graphics), how fast the CPU/s are (and how much CPU time might be consumed by software that generates graphics), etc.
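
As a rough illustration of the steps above, here is a minimal sketch in Python. Everything in it is hypothetical: the `VideoMode` type, the two "supports" callbacks and the toy scoring function are stand-ins for whatever the OS actually gets from VBE, UEFI or a native driver, and a real score would weigh many more factors.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class VideoMode:
    width: int
    height: int
    bytes_per_pixel: int


def choose_video_mode(card_modes: Iterable[VideoMode],
                      os_supports: Callable[[VideoMode], bool],
                      monitor_supports: Callable[[VideoMode], bool],
                      score: Callable[[VideoMode], object]) -> Optional[VideoMode]:
    """Filter the card's mode list, then return the best-scoring survivor."""
    candidates = [m for m in card_modes if os_supports(m) and monitor_supports(m)]
    if not candidates:
        return None  # caller needs a fallback (e.g. text mode)
    return max(candidates, key=score)


def example_score(mode: VideoMode, native_width: int = 1024, native_height: int = 768):
    """Toy score: closest to the monitor's native pixel count, then deepest colour."""
    pixel_distance = abs(mode.width * mode.height - native_width * native_height)
    return (-pixel_distance, mode.bytes_per_pixel)


modes = [VideoMode(640, 480, 1), VideoMode(800, 600, 3), VideoMode(1024, 768, 2),
         VideoMode(1024, 768, 3), VideoMode(1600, 1200, 3)]
best = choose_video_mode(
    modes,
    os_supports=lambda m: m.bytes_per_pixel >= 2,   # pretend the OS has no 8-bit support
    monitor_supports=lambda m: m.width <= 1280,     # pretend the monitor tops out at 1280 wide
    score=example_score,
)
```

With these made-up inputs, `best` is the 1024x768 mode with 3 bytes per pixel. Keeping the score as a separate function means the filtering code never has to change when new factors (RAM, CPU speed, user preferences) are added.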

For example, if you know that (for 1024x768x3) it's going to cost 30 million cycles of CPU time per frame and that you want 60 frames per second; then that's 1.8 billion cycles of CPU time per second. On an old 80486 (which might only be capable of 0.025 billion cycles per second) you might give the video mode an extremely low score (and end up choosing something with a better score because it has a lower resolution); and on a modern computer (e.g. 8 CPUs running at 3 GHz, with a total of 24 billion cycles per second combined) you might give the video mode a high score (and end up choosing something with a better score because it has a higher resolution and looks better).
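
The arithmetic in that example can be turned into one component of a score. This is only a sketch under stated assumptions: the function names are invented, the "1 minus load" formula is an arbitrary choice, and it assumes the drawing work spreads perfectly across all CPUs (which is what "24 billion cycles per second combined" implies).

```python
def cpu_load_fraction(cycles_per_frame: int, frames_per_second: int,
                      cpu_count: int, cycles_per_second_per_cpu: int) -> float:
    """Fraction of total CPU time the video mode would consume."""
    needed = cycles_per_frame * frames_per_second
    available = cpu_count * cycles_per_second_per_cpu
    return needed / available


def cpu_score(load: float) -> float:
    """Toy score component: 0.0 if the CPUs can't keep up, else the spare fraction."""
    return 0.0 if load >= 1.0 else 1.0 - load


# 30 million cycles per frame at 60 frames per second = 1.8 billion cycles per second.
load_486 = cpu_load_fraction(30_000_000, 60, 1, 25_000_000)         # 72.0
load_modern = cpu_load_fraction(30_000_000, 60, 8, 3_000_000_000)   # 0.075
```

The 80486 would need 72 times more CPU time than it has, so `cpu_score(load_486)` is 0.0; the 8-CPU machine only spends 7.5% of its cycles, leaving a score of about 0.925.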
MrLolthe1st wrote:How to draw without CPU?
That depends on what you're drawing and what the hardware supports. For one example, if you want to draw 2D lines and 2D solid rectangles, then a lot of old video cards (e.g. SVGA cards from the 1990s) had special purpose accelerators specifically for 2D lines and 2D rectangles; but newer video cards no longer have them.

In general, "support for drawing" on modern hardware can be split into 3 categories:
  • bit blitting and conversion (primarily used for 2D graphics)
  • the 3D graphics pipeline (which mostly uses GPU, but includes a few fixed function pieces - e.g. for rasterisation)
  • movie decoders (e.g. for things like H.264)

Cheers,

Brendan

Re: VESA standart PMode programming

Posted: Sun Mar 25, 2018 1:39 pm
by Schol-R-LEA
Sorry to be coming back to this so late in the day, but I recently came across some videos which I thought might be relevant. The TL;DR for these is that if an AMD GPU - regardless of whether it is discrete or integrated - exceeds the amount of video RAM allocated to it, it will spill some of the memory usage to the system RAM, allocating some of the slower system memory for its own use and freeing it when finished.

According to one of the videos, this means that for most things, it is better to set an integrated GPU's dedicated frame buffer to the lowest setting rather than the highest, and let the OS and the driver manage the memory as needed - since the 'dedicated frame buffer' and the 'spill memory' are the same speed, and allocating more spill is trivial, all you get out of a higher FB setting is a lot of wasted memory when not running 3D-accelerated programs.

There were exceptions to this, mostly older games which make hard-coded assumptions about the size of the frame buffer, but their tests indicated that a large fixed FB was useless or even counter-productive - in some cases, the performance dropped when going from a fixed FB of 1 GiB to a 2 GiB layout.

I don't know for certain if this applies to Intel iGPUs, and will need to look into it more to say, though given how limited those are in comparison - they are really only suitable for basic productivity applications, as Intel doesn't seem to consider using iGPUs for other things worthwhile - it is probably not relevant anyway, since if you are pushing an iGPU that hard you probably should be using a discrete video card anyway; the iGPU will often become a bottleneck long before the frame buffer size does.

Unfortunately, I cannot seem to find the video which I thought best explained this, but I will keep looking for it and if I can, post an update with that link as well.

I'm also throwing in a video that covers the difference between single vs dual channel memory, which is definitely relevant to performance even if it isn't a factor in the driver discussion.

EDIT: for some reason, the videos don't seem to be showing up in my browser. I don't know if this is my browser, my settings, or the site itself.

First video

Second video

Third video (single vs dual channel memory)