memory-mapped device
Posted: Thu Jan 24, 2008 9:33 am
by sheena9877
Hi Everyone,
I am learning hardware programming for now and intend to do OS dev in the future.
I have a question that I could not find an answer to elsewhere, even with Google.
My question is:
If my computer's motherboard has 4 GB of memory and my video card has 64 MB of memory, will the total system memory be 4 GB + 64 MB?
Or is it still 4 GB, with 64 MB of the motherboard's memory being useless because it is shadowed by the video card's memory? BTW, I'm not considering PAE mode.
Thanks,
Sheena
Posted: Thu Jan 24, 2008 10:04 am
by AJ
Hi,
IIRC, the computer will have 4GiB of memory only. Some of the 64MiB of video RAM will be mapped inside this region by VBE (so you'll lose the physical memory), but you won't necessarily be able to access all the video RAM unless you have enough data to write a specific driver for your video card (and thus have more control over the memory mappings).
Caution: Some video cards 'borrow' system RAM if they don't have it on board!
Cheers,
Adam
Posted: Thu Jan 24, 2008 11:31 am
by Telgin
Hmm, what about 64-bit addressing?
Now that I think about it, I'm not sure how that's handled in hardware. I'm pretty sure there's a way to remap the video card's memory into a space above 4GB, or something along those lines. Forgive me, I haven't actually gotten that far in OS devving yet.
Posted: Thu Jan 24, 2008 11:31 am
by sheena9877
Hi,
First of all, thank you very much.
Another question has come to mind.
If the system motherboard has 3 GB of memory and the video card has 64 MB of memory, will the system memory become 3 GB + 64 MB? If the answer is yes, can applications use the additional 64 MB for all purposes, like ordinary system RAM?
Thanks,
Sheena
Posted: Thu Jan 24, 2008 4:07 pm
by Ready4Dis
sheena9877 wrote: Hi,
First of all, thank you very much.
Another question has come to mind.
If the system motherboard has 3 GB of memory and the video card has 64 MB of memory, will the system memory become 3 GB + 64 MB? If the answer is yes, can applications use the additional 64 MB for all purposes, like ordinary system RAM?
Thanks,
Sheena
Yes, it is similar to 3 GB + 64 MB, and you *could* use video RAM as normal RAM, but keep in mind that writing to certain locations might produce unwanted results. For example, if you write to a memory-mapped register you might screw something up. Also, writing to video memory is MUCH slower than writing to system RAM, since the data needs to travel over the ISA/PCI/AGP/PCI-E bus to get there and back, which incurs a lot of overhead and latency. If you have 3 GB of RAM, trying to use an extra 64 MB isn't going to get you much of a boost, and the chance of breaking stuff isn't really worth the effort.
Posted: Thu Jan 24, 2008 7:22 pm
by sheena9877
Hi,
Thank you very much.
Why the hell are they selling video cards with so much memory, when we know system RAM is much cheaper and more efficient?
You can answer this if you like.
Thank you again, bye.
Sheena
Posted: Thu Jan 24, 2008 9:50 pm
by Brendan
Hi,
sheena9877 wrote: Why the hell are they selling video cards with so much memory, when we know system RAM is much cheaper and more efficient?
As Ready4Dis mentioned, for the CPU to access the video card's display memory the data needs to travel over the ISA/PCI/AGP/PCI-E bus, which makes it slow for the CPU to access video RAM. However the reverse is also true: for the video card to access the system RAM the data also needs to travel over the ISA/PCI/AGP/PCI-E bus, which makes it slow for the video card to access system RAM.
Now, imagine a 3D game with about 10000 textured polygons being drawn 60 times per second by the video card's 3D accelerator. The best place to put texture data for these polygons is in the video card's RAM, where the 3D accelerator can get it without travelling over the ISA/PCI/AGP/PCI-E bus.
Now add page-flipping and/or triple buffering to this. 3 frame buffers plus 100 MB of texture data plus 10 MB of user-interface data (icons, mouse pointer/s, etc) adds up to a lot of RAM.
Then think about the electronics that convert data in display memory into signals for the monitor: for 1024 * 768 at 32 bits per pixel with a 60 Hz refresh rate, they would read 180 MB of data per second.
If you add it all up it's a huge amount of data to drag over the ISA/PCI/AGP/PCI-E bus. It's also a lot of memory bandwidth, which is why good video cards typically use much more expensive RAM (e.g. "dual ported RAM" that allows 2 reads to occur at the same time).
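The refresh figure above is easy to check with a few lines of C. This is only a sketch of the arithmetic; the function names are made up for illustration, and it assumes a packed pixel format with a whole number of bytes per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by one frame buffer for the given video mode. */
uint64_t frame_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);   /* whole bytes per pixel only */
    return (uint64_t)width * height * (bits_per_pixel / 8);
}

/* Bytes the display hardware must read from display memory every second
 * just to keep the monitor refreshed. */
uint64_t scanout_bytes_per_second(uint32_t width, uint32_t height,
                                  uint32_t bits_per_pixel, uint32_t refresh_hz)
{
    return frame_bytes(width, height, bits_per_pixel) * refresh_hz;
}
```

Calling scanout_bytes_per_second(1024, 768, 32, 60) gives 188,743,680 bytes per second, which is the 180 MB (counting 1 MB as 1024 * 1024 bytes) mentioned above. A single frame buffer in that mode is 3 MB, so triple buffering alone takes 9 MB of display memory.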
Note: For computers with "onboard video" that use system RAM instead of dedicated video RAM, you'd expect the onboard video to access RAM directly (without needing to travel over the ISA/PCI/AGP/PCI-E bus). This isn't as bad as a separate PCI card trying to access system RAM, but it's not as good as dedicated video RAM due to the large amount of bandwidth consumed by video (which makes normal/unrelated RAM accesses slower because they need to wait for "system RAM chip bandwidth").
More notes...
An OS could use unused video display memory as normal RAM, but it's slow for the CPU to access (partly because the RAM is typically treated as "uncacheable"), and you'd need the video driver to tell the OS that the video display memory won't ever be used and doesn't have side-effects first. A better idea would be to use the unused video display memory as swap space, because even though it's slow for the CPU to access it's faster than hard drives.
With 4 GB of system RAM, the chipset will make a "memory hole" below 4 GB (for the BIOS, PCI devices, APICs, etc), and you'd probably only be able to use 3 GB of the RAM with 32-bit addressing. The rest of the RAM may be remapped (by the chipset) to higher addresses where the CPU needs to use "more than 32-bit addressing" to access it (e.g. PAE, PSE36 or long mode).
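To make the "memory hole" arithmetic concrete, here is a small sketch in C. The hole size is an assumption picked to match the 3 GB example (the real size depends on the chipset and the installed devices), the names are hypothetical, and it assumes the chipset remaps everything that does not fit below 4 GB, which not every chipset does.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1024ULL * 1024 * 1024)

struct ram_split {
    uint64_t below_4g;  /* RAM reachable with plain 32-bit physical addresses */
    uint64_t above_4g;  /* RAM that would have to be remapped above 4 GiB */
};

/* Split the installed RAM around a hole of `hole` bytes that ends at 4 GiB. */
struct ram_split split_ram(uint64_t installed, uint64_t hole)
{
    assert(hole <= 4 * GIB);

    uint64_t window = 4 * GIB - hole;   /* space left for RAM below 4 GiB */
    struct ram_split s;

    s.below_4g = installed < window ? installed : window;
    s.above_4g = installed - s.below_4g;
    return s;
}
```

With 4 GB installed and a 1 GB hole, split_ram(4 * GIB, 1 * GIB) reports 3 GB below 4 GB and 1 GB that needs PAE, PSE36 or long mode to reach. With only 3 GB installed, all of it fits below the hole and nothing needs remapping.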
Cheers,
Brendan
Posted: Thu Jan 24, 2008 11:20 pm
by sheena9877
Hi Brendan,
That helps me a lot in understanding the memory mapping of devices.
I'll sleep well now.
Thank you so much.
Sheena
Posted: Fri Jan 25, 2008 7:23 am
by Ready4Dis
Why are you so against 36-bit PAE, just curious? It allows up to 64 GB instead of 4 GB, but each application can still only see a maximum of 4 GB and uses 32-bit addressing, so no changes need to be made at all to your applications or drivers, just to your memory table code and such.
As mentioned, you *could* use the video memory as a swap disk of sorts if you know you have plenty extra. As long as you have a way to allocate memory and copy between video memory and system memory, you could do this pretty easily, and it would be much more efficient than hard-disk access. Once you ran out of video memory, you could shrink your video memory swap space (by swapping it to disk) to make room for graphics-related content. That's a bit beyond the scope of a beginner OS though.
Besides, video RAM does get used a lot in OSs. Just look at this window you're reading now: it's got a ton of icons, buffers, button images, the mouse pointer, etc. all in there, as well as your desktop icons, start menu + bar, etc. It all adds up. Besides, if you are hurting for memory that badly, switch to 36-bit PAE or use 36-bit PSE and buy some more RAM; it's much cheaper than buying a new video card for the extra RAM.
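On the 36-bit point, the 4 GB and 64 GB limits fall straight out of the number of physical address bits. A tiny sketch in C (the function name is just for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes that can be addressed with the given number of
 * physical address bits. */
uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits > 0 && address_bits < 64);
    return 1ULL << address_bits;
}
```

addressable_bytes(32) is 4,294,967,296 (4 GB) and addressable_bytes(36) is 68,719,476,736 (64 GB), so the four extra address bits give 16 times the physical address space.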
Brendan gave a good description of how video memory works (to follow up on what I said about it being slow), so now you know why it's typically just ignored for general purpose use. Also, system RAM is typically used for software graphics, since it's much faster to read/write, and then the final screen is copied in one go through the bus (since a single large copy is much faster than many, many reads/writes). This does 2 things: 1.) it speeds up your raster operations, and 2.) it gives you your double buffer so you don't get tearing. The video memory is very fast if used directly by the GPU, but not too useful for the CPU.
Posted: Fri Jan 25, 2008 3:58 pm
by sheena9877
Hi,
Ready4Dis wrote: Also, system RAM is typically used for software graphics, since it's much faster to read/write, and then the final screen is copied in one go through the bus (since a single large copy is much faster than many, many reads/writes)
Thank you for the further explanation, but I got confused again.
Why do you use system RAM for software graphics? I thought the video card's RAM is better when it comes to graphics.
Besides, I understand that once we get the Linear Frame Buffer address from VESA, we start writing to this address when we are doing graphics.
Maybe I am wrong again.
BTW, for now I will avoid PAE and PSE because I am a newbie.
Thank you,
Sheena.
Posted: Sat Jan 26, 2008 12:05 am
by Brendan
Hi,
sheena9877 wrote: Why do you use system RAM for software graphics? I thought the video card's RAM is better when it comes to graphics.
It's common for our OSs to do a lot of things in software (with the CPU) instead of with hardware (e.g. the GPU) because it's extremely hard for us to support GPUs, 3D accelerators, etc on all video cards. Instead we end up with most cards using VBE/VESA (with no hardware acceleration), and with things like alpha blending, 2D blits, etc being done by the CPU. Heck - I've even used the CPU to draw 3D polygons before.
So, if the CPU is doing a lot of work with the graphics data, then it makes sense to do a lot of work in system RAM (which the CPU can access relatively quickly) and then send the completed frame of data to the video card's display memory (which the CPU can't access as quickly).
In this case (VBE/VESA video with no hardware acceleration) it's also likely that there'll be a lot of video RAM unused (RAM that could be used by the OS for swap space, for example).
Now, if the OS supports hardware accelerated 2D and 3D, etc then things are entirely different - the CPU does very little work on the graphics data and the GPU does a lot of work on it, so it makes sense to do a lot of work in video RAM (which the GPU can access relatively quickly) instead of system RAM (which the GPU can't access as quickly).
sheena9877 wrote: Besides, I understand that once we get the Linear Frame Buffer address from VESA, we start writing to this address when we are doing graphics.
Maybe I am wrong again.
You're forgetting that video cards have extremely powerful processors built into them that are completely unused when the OS relies on VESA/VBE and a frame buffer...
It reminds me of a computer I was meant to fix once - an old 166 MHz Pentium with 64 MB of RAM, that had a modern NVidia video card with 128 MB of video memory. The video card had more processing power and more RAM than the rest of the computer.
Cheers,
Brendan
Posted: Sat Jan 26, 2008 5:19 am
by Ready4Dis
sheena9877 wrote: Hi,
Ready4Dis wrote: Also, system RAM is typically used for software graphics, since it's much faster to read/write, and then the final screen is copied in one go through the bus (since a single large copy is much faster than many, many reads/writes)
Thank you for the further explanation, but I got confused again.
Why do you use system RAM for software graphics? I thought the video card's RAM is better when it comes to graphics.
Besides, I understand that once we get the Linear Frame Buffer address from VESA, we start writing to this address when we are doing graphics.
Maybe I am wrong again.
BTW, for now I will avoid PAE and PSE because I am a newbie.
Thank you,
Sheena.
Yes, as explained a bit already, the buffer in memory that VESA returns is the final location you will write the screen to. But you don't want to draw directly to it, because a.) it's slow, so if you draw a lot of things with overlaps it will be really slow, and b.) if you are drawing in the middle of a refresh you can see 1/2 of your object drawn, and then on the next refresh the other 1/2 will be drawn (aka tearing or flickering).
System RAM is faster for the CPU to access, so if you're not using hardware acceleration (and we're talking about VESA, so you will be using software rasterizing), doing all your operations in system RAM is much faster. Once you're done rendering the image, you would wait for the vertical retrace (the period just after the last pixel has been sent to the monitor), then do a full copy in one go, so that when the display starts reading the memory again the entire screen is present and you get a much smoother looking screen (no tearing/flickering).
Now, if you are supporting hardware acceleration, storing as much as possible in video RAM is best, since the video hardware has faster access to video RAM. It just depends on who is doing the work, the CPU or the GPU: CPU operations are best done in system memory, while GPU operations are best done in video memory. When you are using VESA, you are only using a very small part of the GPU, basically just the DAC (digital-to-analog converter, which converts your pixel data to an analog signal for the monitor) and a small part of the RAM to store the video buffer.
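Here is a rough sketch in C of the "draw in system RAM, then copy in one go" idea. It assumes a 32 bits per pixel mode; struct surface, fill_rect and present are names invented for this example. On real hardware the front surface's base would be the linear frame buffer address and its pitch the bytes-per-scanline value reported by VBE, and you would wait for the vertical retrace before calling present (that wait is hardware specific and left out here).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct surface {
    uint8_t *base;     /* first byte of the pixel data */
    uint32_t width;    /* visible pixels per row */
    uint32_t height;   /* number of rows */
    uint32_t pitch;    /* bytes per row, may be larger than width * 4 */
};

/* Draw a filled rectangle into a 32 bpp surface, clipped to its bounds.
 * This is meant to run on the back buffer in system RAM. */
void fill_rect(struct surface *s, uint32_t x, uint32_t y,
               uint32_t w, uint32_t h, uint32_t color)
{
    assert(s->pitch % 4 == 0);

    for (uint32_t row = y; row < y + h && row < s->height; row++) {
        uint32_t *line = (uint32_t *)(s->base + (size_t)row * s->pitch);

        for (uint32_t col = x; col < x + w && col < s->width; col++)
            line[col] = color;
    }
}

/* Copy the finished back buffer to the front buffer, one whole row per
 * memcpy. The two surfaces are allowed to have different pitches. */
void present(struct surface *front, const struct surface *back)
{
    assert(front->width == back->width && front->height == back->height);

    size_t row_bytes = (size_t)back->width * 4;

    for (uint32_t row = 0; row < back->height; row++)
        memcpy(front->base + (size_t)row * front->pitch,
               back->base + (size_t)row * back->pitch,
               row_bytes);
}
```

All the overlapping and repeated drawing happens through fill_rect on the back buffer, and the slow trip over the bus happens only once per frame, in present.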
Imagine this: to perform a dword write over the PCI bus you need to send much more information than just the dword; you need a command + destination + payload. So, for every dword you write, you are using let's say 3 dwords (not sure of the exact amount off the top of my head). Now if you go writing to the video buffer 1 dword at a time, you are talking about sending 12 bytes per 4 bytes of data. The PCI bus is only 33 MHz and 32 bits wide (about 133 MB/s at the very best), so these transfers are going to be slow, have a lot of overhead, and will be fighting with other devices for bandwidth (while starving other devices in the process).
Now, say we draw the entire screen with NO overlaps @ 640x480x32. That is 307,200 pixels @ 12 bytes per pixel instead of 4, which equates to about 3.5 MB of data for each screen refresh. At 33 MHz and 32-bit, we're looking at about 28 ms per frame (roughly 35 fps) even at the theoretical peak rate, and that isn't taking into account other devices, latency, etc. Now, let's do the same drawing in system memory, which runs at over 200 MHz @ 128-bit, only requires 4 bytes per pixel and has much less latency; you can see that it will be a lot more efficient. Then when we're done, we simply copy the entire buffer in one go, so the overhead is greatly reduced. Now we're transferring a little over 1 MB, which is MUCH better than 3.5 MB.
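Plugging the assumptions from this post into a few lines of C gives a feel for the numbers. This is a back-of-envelope sketch only: the 8 bytes of overhead per write is the guess made above (not a figure from the PCI spec), the bus is taken as a nominal 33 MHz, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes crossing the bus when every pixel is written individually and each
 * write carries overhead_bytes of command/address on top of the payload. */
uint64_t bus_bytes_per_frame(uint32_t width, uint32_t height,
                             uint32_t bytes_per_pixel, uint32_t overhead_bytes)
{
    return (uint64_t)width * height * (bytes_per_pixel + overhead_bytes);
}

/* Theoretical peak bandwidth of a parallel bus, in bytes per second. */
uint64_t bus_peak_bytes_per_second(uint64_t clock_hz, uint32_t width_bits)
{
    assert(width_bits % 8 == 0);
    return clock_hz * (width_bits / 8);
}

/* Best-case transfer time in microseconds (integer division rounds down). */
uint64_t transfer_time_us(uint64_t bytes, uint64_t bytes_per_second)
{
    assert(bytes_per_second > 0);
    return bytes * 1000000ULL / bytes_per_second;
}
```

For 640x480x32, writing pixel by pixel moves 3,686,400 bytes per frame, which takes about 27.9 ms at the 132,000,000 bytes per second peak of a 33 MHz, 32-bit bus. One bulk copy of the finished frame moves 1,228,800 bytes, or about 9.3 ms at the same peak rate. Real transfers would be slower once arbitration and latency are counted.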
Now, lets look at it from a GPU side of things if we are supporting hardware acceleration. It's pretty much the exact opposite:
Say the GPU wants to blit a sprite to the screen. If it has the data in GPU memory, we issue a command over the PCI bus to copy sprite -> buffer, and the GPU does the copy while our CPU continues on to other things. If the sprite was in system memory, our CPU tells the video card to blit the sprite, then the GPU has to go through the PCI bus to get the actual sprite's data, and we already determined that going through the PCI bus is slow. So, when working with graphics, always know which side of the bus you are on, and try to avoid crossing it when possible. Of course there is the initial transfer of the data to video memory, but transferring it once is much better than once per frame!
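The "once versus once per frame" difference is easy to put numbers on. A small sketch in C, where the sprite size, frame count and the 16-byte blit command are all made-up figures for illustration, as are the function names:

```c
#include <assert.h>
#include <stdint.h>

/* Sprite kept in system RAM: the GPU has to fetch the whole sprite across
 * the bus for every frame it is drawn in. */
uint64_t bus_bytes_sprite_in_system_ram(uint64_t sprite_bytes, uint32_t frames)
{
    assert(sprite_bytes > 0);
    return sprite_bytes * frames;
}

/* Sprite uploaded to video RAM once: after that only a small blit command
 * crosses the bus for each frame. */
uint64_t bus_bytes_sprite_in_video_ram(uint64_t sprite_bytes, uint32_t frames,
                                       uint32_t command_bytes)
{
    assert(sprite_bytes > 0);
    return sprite_bytes + (uint64_t)command_bytes * frames;
}
```

For a 64x64 sprite at 32 bits per pixel (16,384 bytes) drawn for 60 frames, keeping it in system RAM costs 983,040 bytes of bus traffic, while uploading it once and sending a 16-byte command per frame costs 17,344 bytes.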
Posted: Sat Jan 26, 2008 3:05 pm
by sheena9877
Hi,
Brendan and Ready4Dis, you deserve a round of applause.
Your explanation outclasses the book Programmer's Guide to the EGA, VGA, and Super VGA Cards by Mr. Ferraro. No matter how many times I read that book, I still could not understand the workings of the CPU, GPU, and VESA when dealing with graphics.
You cleared up my cloudy brain on this subject with a couple of paragraphs.
Thank you sooooooooo much.
Sheena
Posted: Sat Jan 26, 2008 3:13 pm
by Ready4Dis
sheena9877 wrote: Hi,
Brendan and Ready4Dis, you deserve a round of applause.
Your explanation outclasses the book Programmer's Guide to the EGA, VGA, and Super VGA Cards by Mr. Ferraro. No matter how many times I read that book, I still could not understand the workings of the CPU, GPU, and VESA when dealing with graphics.
You cleared up my cloudy brain on this subject with a couple of paragraphs.
Thank you sooooooooo much.
Sheena
Not a problem, thanks for the compliment. I hope it shed some light, and helps you to develop a much more efficient OS.