Graphics API and GUI


Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:
Um.. I say it's fairly easy when the video driver does it (and hard when the game does it); and you say it's hard when the game does it (just ask game developers!)?
I've done a lot of graphics programming, before and after 3D accelerators were available, and I still don't think it's easy to do from either standpoint.
I honestly can't see why you think it might be difficult. If I told you the graphics pipeline (for converting "3D space" into "2D texture") has 3 major stages, and last time the first stage took 1234 us and there were 100 objects but now there are 200 objects, do you think you could estimate how long the first stage might take this time? What if I said the third stage took 444 us last time and you knew it's only affected by number of pixels and isn't affected by number of objects - would you guess that if the number of pixels hasn't changed it might take 444 us again?

Sure, it'd be hard to calculate the exact amount of time it will take, but there's no need for that level of accuracy.
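For a very rough sketch of the sort of estimation I mean (all names here are invented for illustration, not from any real driver):

Code:
#include <stdint.h>

/* Per-stage history from the previous frame */
typedef struct {
    uint64_t last_time_us;    /* how long this stage took last frame */
    uint64_t last_workload;   /* e.g. number of objects, or number of pixels */
} stage_history_t;

/* Estimate: assume the stage scales roughly linearly with its workload */
static uint64_t estimate_stage_time_us(const stage_history_t *h, uint64_t new_workload)
{
    if (h->last_workload == 0) {
        return 1000;    /* no history yet; guess 1 ms */
    }
    return (h->last_time_us * new_workload) / h->last_workload;
}

With the numbers above, the first stage (1234 us for 100 objects) is estimated at 2468 us for 200 objects, and the third stage stays at 444 us because its workload (pixels) didn't change.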
Ready4Dis wrote:
Z-buffer doesn't stop you from rendering objects; but it's the last/final step for occlusion (and was only mentioned as it makes BSP redundant and is extremely common). Earlier steps prevent you from rendering objects.

For a very simple example; you can give each 3D object a "bounding sphere" (e.g. distance from origin to furthermost vertex) and use that to test if the object is outside the camera's left/right/top/bottom/foreground/background clipping planes and cull most objects extremely early in the pipeline. It doesn't take much to realise you can do the same for entire collections of objects (and collections of collections of objects, and ...).

Of course typically there's other culling steps too, like culling/clipping triangles to the camera's clipping planes and back face culling, which happen later in the pipeline (not at the beginning of the pipeline, but still well before you get anywhere near Z-buffer).
It doesn't make BSP redundant; a lot of BSP renderers can turn off Z checks due to their nature, which saves the GPU from having to do read backs (which are slow). Yes, that's the point of octrees, portals, BSPs, etc.: to only render what it needs, but which one you use depends on the type of world/objects you're rendering.
Erm.

BSP ensures things are rendered in a specific order. The silly way to use BSP is to render everything in "back to front" order so that Z checks aren't needed (but everything rendered ends up doing expensive "texel lookup" instead even if/when it's overwritten later). The smart way to use BSP is to render everything in "front to back" order with Z checks; so that "texel lookup" can be skipped for everything that's occluded. Of course this assumes opaque polygons (and for anything transparent you probably want to do them "back to front" with Z checks after all of the opaque stuff is done, which doesn't fit well with either use of BSP).
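A rough sketch of "front to back" traversal (the node layout here is invented for illustration; draw_polygons() stands in for whatever actually submits polygons):

Code:
/* One node of a BSP tree; each node splits space with a plane */
typedef struct bsp_node {
    float plane[4];           /* splitting plane: a*x + b*y + c*z + d = 0 */
    struct bsp_node *front;   /* subtree on the plane's positive side */
    struct bsp_node *back;    /* subtree on the plane's negative side */
    void *polygons;           /* polygons lying on this node's plane */
} bsp_node_t;

void draw_polygons(void *polygons);   /* assumed to exist elsewhere */

static float side_of_plane(const float p[4], const float cam[3])
{
    return p[0]*cam[0] + p[1]*cam[1] + p[2]*cam[2] + p[3];
}

/* Visit the camera's side of each plane first; with Z checks enabled the
   occluded fragments that come later get to skip the expensive texel lookups. */
static void draw_front_to_back(const bsp_node_t *node, const float cam[3])
{
    if (node == NULL) {
        return;
    }
    if (side_of_plane(node->plane, cam) >= 0.0f) {
        draw_front_to_back(node->front, cam);
        draw_polygons(node->polygons);
        draw_front_to_back(node->back, cam);
    } else {
        draw_front_to_back(node->back, cam);
        draw_polygons(node->polygons);
        draw_front_to_back(node->front, cam);
    }
}

Reversing the recursion order gives the silly "back to front" version.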
Ready4Dis wrote:Otherwise everyone would use the same exact thing. My point was, without these structures, how does the video driver know what is visible quick and in a hurry?
Why do you think the video driver needs to know what is visible in a hurry? In the early stages (including estimation) it only needs to know what is "possibly visible" in a hurry, and if the video driver thinks something is "possibly visible" and finds out that it's actually hidden behind something else later in the pipeline, then who cares? Worst case is the frame finishes a tiny fraction of a milli-second sooner, and maybe (but more likely not) something that it might have been able to draw with slightly higher quality is done with slightly worse quality. It's not like the user is going to notice or care.
Ready4Dis wrote:Also, most BSP schemes use the BSP map and accompanying data in order to quickly do collision detection (like checking for the interaction of the player and an object). For it to work, it needs the transformed data and the BSP. If you push everything to the video driver, you still need your map for the game meaning it's even slower still since you need to ask the video driver for information or keep two copies.
Physics and collision detection should have nothing at all to do with graphics in any way at all. The fact that game developers are incompetent morons that have consistently done it wrong does not change this.

Imagine a railroad track. A red train is moving at 100 m/s going from west to east and is 200 meters to the west of you. A blue train is moving at 50 m/s and is 100 meters to the east of you. Calculate when the trains will collide. This is a 1 dimensional problem (in that both trains are on the same linear track). To calculate when they will collide (and then where both trains will be when they collide) requires 2 dimensional calculations - essentially; it's the intersection of lines in 2D (where one of the dimensions is time).

Now imagine a billiard table with a white ball and black ball that happen to be on a collision course. This is a 2D problem (in that both balls roll along the same 2D plane). To calculate when they will collide (and then where the balls will be when they collide) requires 3 dimensional calculations - essentially; it's the intersection of 2 lines in 3D (where one of the dimensions is time).

Finally, imagine a 3D game. This time it's a bullet and a racing car. To calculate when they will collide (and then where they will be when they collide) requires 4 dimensional calculations - essentially; it's the intersection of 2 lines in 4D (where one of the dimensions is time).

Note: I've simplified a lot and it's not "intersection of 2 lines" for anything more than very simple cases. More correctly; typically you're calculating the point in time where the distance between the circumferences of bounding circles (2D) or bounding spheres (3D) reaches zero as your first test, and then doing individual "extruded line" or "extruded polygon" intersection tests on individual lines/polygons after that. Suffice to say it is an "N+1 dimensions" problem, and not tied to a game tick or frame rate; and for performance reasons should not be using the game's graphics data (e.g. 100 polygons for a dog's face) and needs to use much simpler geometry (6 polygons for a dog's face); and for accuracy and performance reasons should support primitive shapes (cones, spheres, cubes, cylinders) directly so that (e.g.) a ball doesn't need to be broken up into polygons at all.
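A rough sketch of that first bounding sphere test (illustrative only; positions and velocities are in whatever units the game uses):

Code:
#include <math.h>
#include <stdbool.h>

/* Find the first time t >= 0 where the surfaces of 2 moving bounding spheres touch.
   Distance squared is a quadratic in t, so this just solves a*t^2 + b*t + c = 0. */
static bool sphere_sphere_collision_time(
    const float p0[3], const float v0[3], float r0,
    const float p1[3], const float v1[3], float r1,
    float *t_hit)
{
    float dp[3] = { p1[0]-p0[0], p1[1]-p0[1], p1[2]-p0[2] };   /* relative position */
    float dv[3] = { v1[0]-v0[0], v1[1]-v0[1], v1[2]-v0[2] };   /* relative velocity */
    float r = r0 + r1;

    float a = dv[0]*dv[0] + dv[1]*dv[1] + dv[2]*dv[2];
    float b = 2.0f*(dp[0]*dv[0] + dp[1]*dv[1] + dp[2]*dv[2]);
    float c = dp[0]*dp[0] + dp[1]*dp[1] + dp[2]*dp[2] - r*r;

    if (c <= 0.0f) { *t_hit = 0.0f; return true; }   /* already overlapping */
    if (a == 0.0f || b >= 0.0f) return false;        /* not moving towards each other */

    float disc = b*b - 4.0f*a*c;
    if (disc < 0.0f) return false;                   /* they miss */

    *t_hit = (-b - sqrtf(disc)) / (2.0f*a);          /* first contact time */
    return true;
}

The trains and the billiard balls are the same calculation with fewer components.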

In addition to this, I personally think that all games (including single player) need to be using a client/server model where the server and client/s are synchronised via time (e.g. starting/ending trajectories), and where physics is done on server and graphics is done on the client/s (and where client can crash and burn, and be restarted, without any loss of game data).
Ready4Dis wrote:
For me; the OS will probably have to wait for network and authentication before it can become part of its cluster, before it can start loading its video driver from the distributed file system. There's also plenty of scope for "Oops, something went wrong during boot" where the OS never finishes booting and the user needs to know what failed (which is the primary reason to display the boot log during boot).
I just meant the need for graphics mode for my boot log, not the need for the boot log. Of course I want to be able to see something if it fails, but it doesn't necessarily have to be graphical (especially if it happens before or during the initial graphics routines).
I don't use text mode, I use the firmware's text output functions (like "EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL" on UEFI; that might or might not be using text mode). During boot the OS reaches a "point of no return" where firmware is discarded and the firmware's text output functions can no longer be used (e.g. "EFI_EXIT_BOOT_SERVICES" is used). Immediately before this point of no return I switch video card/s to graphics mode, so that the entire OS (if there's no native video driver and it's left with the generic frame buffer driver only) can use graphics; but this means that after the point of no return the boot code must also use graphics mode (up until the video driver/s are started). Note that (for me) this point of no return occurs before boot code has decided which micro-kernel (e.g. 32-bit/64-bit, with/without NUMA, etc) it should start.
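Roughly the kind of thing that happens just before that point (a stripped-down sketch using EDK2-style names; I'm assuming the firmware's Graphics Output Protocol here, and mode selection, the memory map dance and error handling are all omitted):

Code:
#include <Uefi.h>
#include <Protocol/GraphicsOutput.h>

/* Just before the "point of no return": make sure a graphics mode is set and
   remember the raw frame buffer, so the generic frame buffer driver has
   something to work with once boot services (and firmware text output) are gone. */
EFI_STATUS PrepareBootVideo(EFI_SYSTEM_TABLE *ST,
                            EFI_PHYSICAL_ADDRESS *FrameBuffer, UINTN *FrameBufferSize)
{
    EFI_GRAPHICS_OUTPUT_PROTOCOL *Gop;
    EFI_STATUS Status;

    Status = ST->BootServices->LocateProtocol(&gEfiGraphicsOutputProtocolGuid,
                                              NULL, (VOID **)&Gop);
    if (EFI_ERROR(Status)) {
        return Status;
    }

    /* Keep the current mode (a real boot loader would pick the "best" mode here) */
    Status = Gop->SetMode(Gop, Gop->Mode->Mode);
    if (EFI_ERROR(Status)) {
        return Status;
    }

    *FrameBuffer     = Gop->Mode->FrameBufferBase;
    *FrameBufferSize = Gop->Mode->FrameBufferSize;

    /* After this the caller does GetMemoryMap() + ExitBootServices(); from that
       point on everything (boot log included) is drawn into the frame buffer. */
    return EFI_SUCCESS;
}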


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Re: Graphics API and GUI

Post by Ready4Dis »

I honestly can't see why you think it might be difficult. If I told you the graphics pipeline (for converting "3D space" into "2D texture") has 3 major stages, and last time the first stage took 1234 us and there were 100 objects but now there are 200 objects, do you think you could estimate how long the first stage might take this time? What if I said the third stage took 444 us last time and you knew it's only affected by number of pixels and isn't affected by number of objects - would you guess that if the number of pixels hasn't changed it might take 444 us again?
I guess if you aren't looking for any sort of accuracy then this might be true; the part I don't think is trivial is determining *what* to display at lower qualities to make up for the time.
Erm.

BSP ensures things are rendered in a specific order. The silly way to use BSP is to render everything in "back to front" order so that Z checks aren't needed (but everything rendered ends up doing expensive "texel lookup" instead even if/when it's overwritten later). The smart way to use BSP is to render everything in "front to back" order with Z checks; so that "texel lookup" can be skipped for everything that's occluded. Of course this assumes opaque polygons (and for anything transparent you probably want to do them "back to front" with Z checks after all of the opaque stuff is done, which doesn't fit well with either use of BSP).
It really depends on the application and video hardware/software. If you are targeting only newer hardware with fast z-buffer read backs, then you want to render front to back with z-buffer checks turned on. If the hardware is older and z-buffer read backs have a serious impact on performance then you want to do back to front. Typically BSPs are front to back with z-checks now (has been that way for a while actually, not sure why I went all nostalgic). Either way, the BSP tree isn't just for rendering, but even the rendering part is important in how the scene gets rendered to minimize the # of texture lookups and shader routines. If you don't have a BSP tree for an indoor scene, how does the video driver sort the incoming information to present it to the GPU in an organized order? Or how would I tell the video driver what is a good order? Don't take this the wrong way; if you can get it to work well, it would simplify game developers' jobs a lot and they could focus more on the game play aspects. I understand the theory, I just can't envision the implementation details without the video driver being a huge bloated mess. DirectX went that way, and so did OpenGL for a while, and now they are going towards minimal APIs so games can have more control and run faster without the drivers getting more and more bloated as they try to incorporate workarounds and such for each and every game.
Why do you think the video driver needs to know what is visible in a hurry? In the early stages (including estimation) it only needs to know what is "possibly visible" in a hurry, and if the video driver thinks something is "possibly visible" and finds out that it's actually hidden behind something else later in the pipeline, then who cares? Worst case is the frame finishes a tiny fraction of a milli-second sooner, and maybe (but more likely not) something that it might have been able to draw with slightly higher quality is done with slightly worse quality. It's not like the user is going to notice or care.
I understand, worst case it runs a bit faster and looks like crap, nobody actually cares how their games look anyways ;).
Physics and collision detection should have nothing at all to do with graphics in any way at all. The fact that game developers are incompetent morons that have consistently done it wrong does not change this.
?? Does this even need a real response? A BSP for an indoor scene works really well for checking objects that are near you for collision. Yes, the collision detection itself is based on the object, but with a BSP you only need to check collision with a VERY limited set, rather than everything (and it was just an example, octrees do a similar function by separating the world into smaller spaces allowing the collision detection to only happen with close objects). I do agree, when doing collision detection it makes sense to do it for a much more simplified version of the object in most cases, but knowing what objects you need to check and which ones aren't even a possibility can make this go much faster. Also, while not using graphics data sounds great in theory, there is still a lot of shared information. Try doing inverse kinematics without having the final transformations for the bone structure of the object you are animating. Got a large open world racing game, how do you do collision detection with the tires and track? Are you storing a simplified version of the track in memory instead? If you don't split this data into smaller areas, do you just randomly check everything to see what each tire is in contact with?
In addition to this, I personally think that all games (including single player) need to be using a client/server model where the server and client/s are synchronised via time (e.g. starting/ending trajectories), and where physics is done on server and graphics is done on the client/s (and where client can crash and burn, and be restarted, without any loss of game data).
While I didn't use a client/server model, my game engine did separate the graphics from the input and physics. All the input/physics were calculated in one thread, and the graphics in another. If the graphics were going really slow, the physics and input were still running at normal speed.
I don't use text mode, I use the firmware's text output functions (like "EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL" on UEFI; that might or might not be using text mode). During boot the OS reaches a "point of no return" where firmware is discarded and the firmware's text output functions can no longer be used (e.g. "EFI_EXIT_BOOT_SERVICES" is used). Immediately before this point of no return I switch video card/s to graphics mode, so that the entire OS (if there's no native video driver and it's left with the generic frame buffer driver only) can use graphics; but this means that after the point of no return the boot code must also use graphics mode (up until the video driver/s are started). Note that (for me) this point of no return occurs before boot code has decided which micro-kernel (e.g. 32-bit/64-bit, with/without NUMA, etc) it should start.
So the point of no return is BEFORE you decide what kernel to run? Doesn't that mean there won't be a native video driver yet, and everything will be software until the kernel and driver get loaded? The video driver is one of the first things my OS loads as part of the boot process, but I don't load the driver before the kernel, nor do I want to write two sets of graphics rendering pipelines for before/after the video driver is loaded.

Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:
I honestly can't see why you think it might be difficult. If I told you the graphics pipeline (for converting "3D space" into "2D texture") has 3 major stages, and last time the first stage took 1234 us and there were 100 objects but now there are 200 objects, do you think you could estimate how long the first stage might take this time? What if I said the third stage took 444 us last time and you knew it's only affected by number of pixels and isn't affected by number of objects - would you guess that if the number of pixels hasn't changed it might take 444 us again?
I guess if you aren't looking for any sort of accuracy then this might be true; the part I don't think is trivial is determining *what* to display at lower qualities to make up for the time.
To be honest; I don't think it'll be trivial either. Not because I think it's hard, but because there's so many options and it's going to take a lot of testing and balancing and tuning to get the best results under a wide variety of conditions. Is it better to render at lower resolution (and upscale), or reduce the render distance, or use existing impostors that are at wrong angles, or only take fewer light sources into account, or use solid colours instead of textures for more small/distant polygons, or switch to alternative "low poly" meshes for some things, or...
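For a rough sketch of the sort of tuning I mean (everything here - the knobs, the thresholds, the order they get degraded in - is invented for illustration, and the balancing is exactly the part that will take lots of testing):

Code:
#include <stdint.h>

/* A few of the quality "knobs" the video driver could turn down */
typedef struct {
    int resolution_divider;    /* 1 = full resolution, 2 = render at half and upscale, ... */
    int render_distance;       /* beyond this, use impostors */
    int max_light_sources;
    int use_low_poly_meshes;   /* 0 or 1 */
} quality_t;

/* Estimate from per-stage history, as in the earlier example; assumed elsewhere */
uint64_t estimate_frame_time_us(const quality_t *q);

/* Keep degrading something until the estimated frame time fits the deadline */
static void fit_to_budget(quality_t *q, uint64_t budget_us)
{
    while (estimate_frame_time_us(q) > budget_us) {
        if (q->max_light_sources > 1) {
            q->max_light_sources--;
        } else if (q->render_distance > 100) {
            q->render_distance -= 100;
        } else if (!q->use_low_poly_meshes) {
            q->use_low_poly_meshes = 1;
        } else if (q->resolution_divider < 4) {
            q->resolution_divider++;
        } else {
            break;    /* nothing left to degrade; the frame will just be late */
        }
    }
}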
Ready4Dis wrote:
BSP ensures things are rendered in a specific order. The silly way to use BSP is to render everything in "back to front" order so that Z checks aren't needed (but everything rendered ends up doing expensive "texel lookup" instead even if/when it's overwritten later). The smart way to use BSP is to render everything in "front to back" order with Z checks; so that "texel lookup" can be skipped for everything that's occluded. Of course this assumes opaque polygons (and for anything transparent you probably want to do them "back to front" with Z checks after all of the opaque stuff is done, which doesn't fit well with either use of BSP).
It really depends on the application and video hardware/software. If you are targeting only newer hardware with fast z-buffer read backs, then you want to render front to back with z-buffer checks turned on. If the hardware is older and z-buffer read backs have a serious impact on performance then you want to do back to front. Typically BSPs are front to back with z-checks now (has been that way for a while actually, not sure why I went all nostalgic).
Sounds like a decision that the video driver (that has intimate knowledge of the video card's hardware) should make rather than a game (which should never have needed to take into account the specific details of the underlying hardware in the first place).
Ready4Dis wrote: Either way, the BSP tree isn't just for rendering, but even the rendering part is important in how the scene gets rendered to minimize the # of texture lookups and shader routines. If you don't have a BSP tree for an indoor scene, how does the video driver sort the incoming information to present it to the GPU in an organized order? Or how would I tell the video driver what is a good order? Don't take this the wrong way; if you can get it to work well, it would simplify game developers' jobs a lot and they could focus more on the game play aspects. I understand the theory, I just can't envision the implementation details without the video driver being a huge bloated mess. DirectX went that way, and so did OpenGL for a while, and now they are going towards minimal APIs so games can have more control and run faster without the drivers getting more and more bloated as they try to incorporate workarounds and such for each and every game.
Graphics will always be complicated and messy. Shifting half of this complexity into games doesn't reduce the complexity involved, it replicates it.

The graphics APIs were always too low level. That's why the communication overheads were too high (a problem they "solved" by making the APIs even lower level). That's also why drivers have to incorporate workarounds for each and every game now. These are problems a higher level API avoids.
Ready4Dis wrote:
Why do you think the video driver needs to know what is visible in a hurry? In the early stages (including estimation) it only needs to know what is "possibly visible" in a hurry, and if the video driver thinks something is "possibly visible" and finds out that it's actually hidden behind something else later in the pipeline, then who cares? Worst case is the frame finishes a tiny fraction of a milli-second sooner, and maybe (but more likely not) something that it might have been able to draw with slightly higher quality is done with slightly worse quality. It's not like the user is going to notice or care.
I understand, worst case it runs a bit faster and looks like crap, nobody actually cares how their games look anyways ;).
Don't be stupid. The performance difference between (e.g.) "fragment/s skipped due to final Z test" and "fragment/s not skipped due to final Z check" isn't going to be so massive that the driver would've been able to improve graphics quality for anything in a way that's actually noticeable by any human for 1/60th of a second.
Ready4Dis wrote:
Physics and collision detection should have nothing at all to do with graphics in any way at all. The fact that game developers are incompetent morons that have consistently done it wrong does not change this.
?? Does this even need a real response? A BSP for an indoor scene works really well for checking objects that are near you for collision. Yes, the collision detection itself is based on the object, but with a BSP you only need to check collision with a VERY limited set, rather than everything (and it was just an example, octrees do a similar function by separating the world into smaller spaces allowing the collision detection to only happen with close objects).
Are you saying that a physics engine should use a BSP and that this has nothing to do with graphics; or are you saying that a physics engine should use an octree and that this has nothing to do with graphics? For physics, using an octree seems like a more sensible approach to me.

Note that a BSP (which is expensive to generate) is inappropriate for anything other than collisions with static objects.
Ready4Dis wrote: I do agree, when doing collision detection it makes sense to do it for a much more simplified version of the object in most cases, but knowing what objects you need to check and which ones aren't even a possibility can make this go much faster. Also, while not using graphics data sounds great in theory, there is still a lot of shared information. Try doing inverse kinematics without having the final transformations for the bone structure of the object you are animating.
Because any transformation is fine for inverse kinematics, even if it's a transformation to screen co-ords and not a transformation to the local coordinate system of the object you're animating?
Ready4Dis wrote:Got a large open world racing game, how do you do collision detection with the tires and track?
You don't. You trace paths across the surface of the track because the tires are always (or at least, nearly always) in contact.
Ready4Dis wrote:Are you storing a simplified version of the track in memory instead?
For physics I'm storing the track as splines, because it's stupid to limit yourself to polygons just because someone thought they'd be nice for something completely irrelevant to physics.
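As a rough sketch of evaluating one of those splines (using Catmull-Rom segments purely for illustration; the real representation might be different):

Code:
typedef struct { float x, y, z; } vec3_t;

/* One component of a Catmull-Rom segment between p1 and p2 (t in [0,1]),
   with p0 and p3 as the neighbouring control points */
static float cr1(float p0, float p1, float p2, float p3, float t)
{
    float t2 = t * t, t3 = t2 * t;
    return 0.5f * ((2.0f * p1)
                 + (-p0 + p2) * t
                 + (2.0f * p0 - 5.0f * p1 + 4.0f * p2 - p3) * t2
                 + (-p0 + 3.0f * p1 - 3.0f * p2 + p3) * t3);
}

/* Point on the track's centre-line; the physics traces along this (plus whatever
   width/banking information) instead of ever touching the renderer's polygons */
static vec3_t track_point(vec3_t p0, vec3_t p1, vec3_t p2, vec3_t p3, float t)
{
    vec3_t r = { cr1(p0.x, p1.x, p2.x, p3.x, t),
                 cr1(p0.y, p1.y, p2.y, p3.y, t),
                 cr1(p0.z, p1.z, p2.z, p3.z, t) };
    return r;
}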
Ready4Dis wrote:
I don't use text mode, I use the firmware's text output functions (like "EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL" on UEFI; that might or might not be using text mode). During boot the OS reaches a "point of no return" where firmware is discarded and the firmware's text output functions can no longer be used (e.g. "EFI_EXIT_BOOT_SERVICES" is used). Immediately before this point of no return I switch video card/s to graphics mode, so that the entire OS (if there's no native video driver and it's left with the generic frame buffer driver only) can use graphics; but this means that after the point of no return the boot code must also use graphics mode (up until the video driver/s are started). Note that (for me) this point of no return occurs before boot code has decided which micro-kernel (e.g. 32-bit/64-bit, with/without NUMA, etc) it should start.
So the point of no return is BEFORE you decide what kernel to run? Doesn't that mean there won't be a native video driver yet, and everything will be software until the kernel and driver get loaded?
Yes.
Ready4Dis wrote: The video driver is one of the first things my OS loads as part of the boot process, but I don't load the driver before the kernel, nor do I want to write two sets of graphics rendering pipelines for before/after the video driver is loaded.
From the "point of no return" up until the video driver is started; I need to:
  • Reclaim any RAM boot loader was using
  • Disable all PCI devices that can be disabled
  • Configure IOMMU (if present) as "most restrictive possible"
  • Establish a dynamic root of trust (if possible)
  • Start AP CPUs (if not started already due to SKINIT in the previous step)
  • Do full CPU feature detection to determine what all CPUs (not just BSP) support
  • Restore IOMMU (if present) back to normal
  • Determine which kernel to use
  • Start "kernel setup module"
  • Do "state hand-off" from boot manager to kernel setup module
  • Reclaim any RAM boot manager was using
  • Reconfigure the virtual address space to suit the micro-kernel
  • Start the micro-kernel; including telling micro-kernel about boot modules (e.g. "boot video" module) so it can convert them into processes and inform them kernel is up/usable
  • Start initialisation process, and let kernel setup module self-terminate
  • Start VFS service
  • Pre-populate VFS cache with file/s from boot image/init RAM disk
  • Start device manager process
  • Start motherboard driver (if present)
  • Do device enumeration phase 1 (planning resources/BARs, etc)
  • Configure MTRRs, etc to suit device enumeration results
  • Do device enumeration phase 2 (enabling PCI devices, and starting drivers if they're in VFS cache - mostly only disk and network)
  • Get local native file systems up (if any)
  • Get network up (DHCP, node discovery, etc)
  • Do remote attestation with other nodes in the cluster
  • Sync time with other nodes in cluster
  • Sync local file system (if any) with remote nodes/distributed FS
  • Start drivers that weren't in VFS cache earlier (including video driver)

Cheers,

Brendan

Re: Graphics API and GUI

Post by Ready4Dis »

To be honest; I don't think it'll be trivial either. Not because I think it's hard, but because there's so many options and it's going to take a lot of testing and balancing and tuning to get the best results under a wide variety of conditions. Is it better to render at lower resolution (and upscale), or reduce the render distance, or use existing impostors that are at wrong angles, or only take fewer light sources into account, or use solid colours instead of textures for more small/distant polygons, or switch to alternative "low poly" meshes for some things, or...
Yeah, I mean, I don't think it's impossible, just implausible. If you ever get anywhere with it (even just from a design standpoint), I would be interested, maybe even write a driver for it and see how it goes, or put a compatible driver into my OS if it's anywhere ready. Yes, there will be a lot of balancing, and the problem I see is that the balancing is typically different per application (or genre?).
Sounds like a decision that the video driver (that has intimate knowledge of the video card's hardware) should make rather than a game (which should never have needed to take into account the specific details of the underlying hardware in the first place).
Who is writing the video driver? Is this going to be like common code for handling/balancing and then there is a really low level GPU driver? Is this all part of one driver that the manufacturer needs to write?
The graphics APIs were always too low level. That's why the communication overheads were too high (a problem they "solved" by making the APIs even lower level). That's also why drivers have to incorporate workarounds for each and every game now. These are problems a higher level API avoids.
The graphics drivers were doing too much of the handling in a generic manner and it just didn't work well for all games; one size does not fit all (in general).
Are you saying that a physics engine should use a BSP and that this has nothing to do with graphics; or are you saying that a physics engine should use an octree and that this has nothing to do with graphics? For physics, using an octree seems like a more sensible approach to me.
I am saying the game might use one of many methods to deal with physics and it may or may not deal with the geometry directly. Different games and situations call for different methods of handling. If you do everything in the video driver it's either going to be really huge or limited without some sort of bypasses for games that need the extra bit of control.
Because any transformation is fine for inverse kinematics, even if it's a transformation to screen co-ords and not a transformation to the local coordinate system of the object you're animating?
I'm not really sure about the question, but my point was that some games do collision detection with objects based on all the transformations (that take place on the GPU). If the video driver is storing all of this information itself, the physics engine *may, not always* need it, or it will have to duplicate the effort.
You don't. You trace paths across the surface of the track because the tires are always (or at least, nearly always) in contact.
For physics I'm storing the track as splines, because it's stupid to limit yourself to polygons just because someone thought they'd be nice for something completely irrelevant to physics.
That sounds reasonable, and is probably where games are going with dynamically generated polygons based on splines, which allows you to "easily" do LOD, and it might be common by the time either of our OSes is anywhere near ready, not to mention physics will probably be done on the GPU using OpenCL or similar anyway.

That's a pretty extensive boot list. Mine isn't too dissimilar, although not being focused on being a distributed OS it is a bit simpler. My boot loader loads my boot manager and initrd. My boot loader generates the initial memory map for the kernel and sets up paging, and recovers all RAM from the boot loader (the memory map marks the boot loader's own memory as free). It also boots the APs and they all jump into the kernel in the initrd. The kernel is determined during installation, so there isn't a decision to be made (this may change later). I then read a config file which tells my kernel which drivers to load and in which order. I am working on also setting up a dependency list so I can use all the processors to load drivers from the start. I still need to work on it a bit more, but it works ok for now. Everything is configurable, I don't have an order that it has to start in (VFS can start last if nothing depends on it if you'd like, but if it has a dependency, then it will be loaded before whatever depends on it). Right now I have my video driver inside of the initfs, but this may move, or I may do something like set a graphics mode before switching to pmode and render everything with software until a proper driver can be loaded, although that is duplicating effort and putting software rendering code into my boot loader, making it take even longer to boot. I am still undecided on this.

Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:
To be honest; I don't think it'll be trivial either. Not because I think it's hard, but because there's so many options and it's going to take a lot of testing and balancing and tuning to get the best results under a wide variety of conditions. Is it better to render at lower resolution (and upscale), or reduce the render distance, or use existing impostors that are at wrong angles, or only take fewer light sources into account, or use solid colours instead of textures for more small/distant polygons, or switch to alternative "low poly" meshes for some things, or...
Yeah, I mean, I don't think it's impossible, just implausible. If you ever get anywhere with it (even just from a design standpoint), I would be interested, maybe even write a driver for it and see how it goes, or put a compatible driver into my OS if it's anywhere ready. Yes, there will be a lot of balancing, and the problem I see is that the balancing is typically different per application (or genre?).
Some games will describe their graphics as (e.g.) "one node in the graph per room", some games will describe their graphics as (e.g.) "one node in the graph per sector of huge open area", etc; and different games will use different meshes, textures, etc. These differences are trivial for the video driver to cope with.

Beyond that, the majority of the remaining differences are caused by the fact that existing graphics APIs are silly and force games to repeatedly reinvent the wheel, combined with the fact that no 2 programmers ever implement the same thing in an identical way.

There are no significant "per application/game/GUI/genre" differences.
Ready4Dis wrote:
Sounds like a decision that the video driver (that has intimate knowledge of the video card's hardware) should make rather than a game (which should never have needed to take into account the specific details of the underlying hardware in the first place).
Who is writing the video driver? Is this going to be like common code for handling/balancing and then there is a really low level GPU driver? Is this all part of one driver that the manufacturer needs to write?
Whoever writes a video driver will write a video driver. The initial "raw framebuffer, software rendering only" generic video driver will be written by me and will be open source, specifically so that anyone writing any other video driver can take whatever they want from it (and customise their copy in any way they like to suit their driver and/or whichever video card their driver is for).
Ready4Dis wrote:
The graphics APIs were always too low level. That's why the communication overheads were too high (a problem they "solved" by making the APIs even lower level). That's also why drivers have to incorporate workarounds for each and every game now. These are problems a higher level API avoids.
The graphics drivers were doing too much of the handling in a generic manner and it just didn't work well for all games; one size does not fit all (in general).
Because they don't have a sane higher level graphics API to use, game developers implement their own and call it a game engine, and typically do use the game engine's higher level API for many very different games because one size does fit all. This is much harder than what I'm planning because these game engines typically do much more than just graphics alone (e.g. physics, sound, etc).
Ready4Dis wrote:
Are you saying that a physics engine should use a BSP and that this has nothing to do with graphics; or are you saying that a physics engine should use an octree and that this has nothing to do with graphics? For physics, using an octree seems like a more sensible approach to me.
I am saying the game might use one of many methods to deal with physics and it may or may not deal with the geometry directly. Different games and situations call for different methods of handling. If you do everything in the video driver it's either going to be really huge or limited without some sort of bypasses for games that need the extra bit of control.
If I have one basket containing 4 oranges and another basket containing 2 oranges, and shift an orange from the first basket to the second basket; will I suddenly have a massive increase in the number of oranges? Yes, the video driver will be larger than it would be if half its work was pushed into every game, but that is not a bad thing.

If a game developer thinks they "need" an extra bit of control, I don't want them anywhere near my OS. It's far more important to ensure broken ideas from existing OSs don't pollute mine.
Ready4Dis wrote:
You don't. You trace paths across the surface of the track because the tires are always (or at least, nearly always) in contact.
For physics I'm storing the track as splines, because it's stupid to limit yourself to polygons just because someone thought they'd be nice for something completely irrelevant to physics.
That sounds reasonable, and is probably where games are going with dynamically generated polygons based on splines, which allows you to "easily" do LOD, and it might be common by the time either of our OSes is anywhere near ready, not to mention physics will probably be done on the GPU using OpenCL or similar anyway.
There's 4 very different things that my video drivers would eventually need to worry about; each with very different APIs/interfaces:
  • Graphics
  • GPGPU
  • Swap space (using spare display memory)
  • Virtualisation/"passthrough mode" (so you can assign the video card to an emulator and let the emulator's guest OS use it).
Ready4Dis wrote: That's a pretty extensive boot list. Mine isn't too dissimilar, although not being focused on being a distributed OS it is a bit simpler. My boot loader loads my boot manager and initrd. My boot loader generates the initial memory map for the kernel and sets up paging, and recovers all RAM from the boot loader (the memory map marks the boot loader's own memory as free). It also boots the APs and they all jump into the kernel in the initrd. The kernel is determined during installation, so there isn't a decision to be made (this may change later). I then read a config file which tells my kernel which drivers to load and in which order. I am working on also setting up a dependency list so I can use all the processors to load drivers from the start. I still need to work on it a bit more, but it works ok for now. Everything is configurable, I don't have an order that it has to start in (VFS can start last if nothing depends on it if you'd like, but if it has a dependency, then it will be loaded before whatever depends on it). Right now I have my video driver inside of the initfs, but this may move, or I may do something like set a graphics mode before switching to pmode and render everything with software until a proper driver can be loaded, although that is duplicating effort and putting software rendering code into my boot loader, making it take even longer to boot. I am still undecided on this.
Part of the reason for some of the things I'm doing is that I want a generic "live CD" type of thing, where the exact same CD (or USB flash or whatever) can be used to boot many very different computers. Ideally; installing the OS will involve booting with a generic CD like this (that does full auto-detection during boot to determine what is best for that computer), then using some utilities to install a copy of whatever that auto-detection determined is best for the computer onto the hard drive, and then not having any reason to reboot after installing the OS because it'd be the same as what you're already running anyway. ;)


Cheers,

Brendan

Re: Graphics API and GUI

Post by Ready4Dis »

Some games will describe their graphics as (e.g.) "one node in the graph per room", some games will describe their graphics as (e.g.) "one node in the graph per sector of huge open area", etc; and different games will use different meshes, textures, etc. These differences are trivial for the video driver to cope with.
This is what I was getting at: how the game lays out its information and how much the driver needs to be aware of.
Because they don't have a sane higher level graphics API to use, game developers implement their own and call it a game engine, and typically do use the game engine's higher level API for many very different games because one size does fit all. This is much harder than what I'm planning because these game engines typically do much more than just graphics alone (e.g. physics, sound, etc).
Yes, but the game engine doesn't always handle all areas of all games the same. Indoor areas are done differently than large outdoor areas. You wouldn't use a portal in an outdoor engine and you wouldn't normally use an octree in a confined indoor area. Of course, the lines are more blurred nowadays, but still. Programmers came up with all of these schemes because they needed a way to minimize the amount of data the GPU had to process and how much data had to be transferred through the PCI/AGP/PCI-e bus, and one single scheme didn't work in all situations. You will have to replicate (at least a portion of) the different methods of efficient rendering of different types of scenes if you want it to be anything useful unless you are only targeting the highest end hardware where it's not as big of a deal if you lose a few fps due to a crappy algorithm. I just think the video driver is going to end up huge or inefficient. I'm not saying anything is a show stopper, and it is plausible that it will work great and make developing games much simpler.

Once you get anywhere with the implementation I would love to play around with it even if it's software only.
Whoever writes a video driver will write a video driver. The initial "raw framebuffer, software rendering only" generic video driver will be written by me and will be open source, specifically so that anyone writing any other video driver can take whatever they want from it (and customise their copy in any way they like to suit their driver and/or whichever video card their driver is for).
I see, so you are going to define the structures and such, and the actual implementer is going to be responsible for tuning it to their hardware (and of course, the actual rendering, etc communications with the hardware).
If I have one basket containing 4 oranges and another basket containing 2 oranges, and shift an orange from the first basket to the second basket; will I suddenly have a massive increase in the number of oranges? Yes, the video driver will be larger than it would be if half its work was pushed into every game, but that is not a bad thing.
That was kind of my original question/point: your video driver is going to basically turn into a one size fits all graphics engine (not a full blown game engine due to the lack of sound and physics of course).
There's 4 very different things that my video drivers would eventually need to worry about; each with very different APIs/interfaces:
  • Graphics
  • GPGPU
  • Swap space (using spare display memory)
  • Virtualisation/"passthrough mode" (so you can assign the video card to an emulator and let the emulator's guest OS use it).
Sounds reasonable. I can definitely see using spare display memory for swap space would be much faster than a hard drive; who determines priority, though? If you use 3 GB out of a 4 GB graphics card, then a new app loads that needs the space in video RAM, does the video driver then swap the info to a hard drive to make space (aka, kick the swap out of video RAM to disk, or whatever else is available)? Seems like a good idea if you have the spare space (as long as the GPU isn't super busy, as you don't want to be swapping in and out through the PCI-e bus if there is a GPGPU program trying to run at max speed sharing said bus), it would be much faster than a hard drive.
Part of the reason for some of the things I'm doing is that I want a generic "live CD" type of thing, where the exact same CD (or USB flash or whatever) can be used to boot many very different computers. Ideally; installing the OS will involve booting with a generic CD like this (that does full auto-detection during boot to determine what is best for that computer), then using some utilities to install a copy of whatever that auto-detection determined is best for the computer onto the hard drive, and then not having any reason to reboot after installing the OS because it'd be the same as what you're already running anyway. ;)
I was talking about typical booting; for a live CD I plan on doing the same - the boot manager will load the correct kernel and drivers that I deem necessary to run for a live/install CD. Once installed, there is no reason to have any other kernel in my initial RAM disk, just the one required. I do agree though, for a live CD or installation CD, I will load the actual kernel that will run once installed and it can load the drivers without issue. I can't think of anything in my design that would require a reboot after installation.

Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:You wouldn't use a portal in an outdoor engine and you wouldn't normally use an octree in a confined indoor area.
I feel like you're missing the main point of the design.

The video driver provides "insert content of node A, B and C into node D" functionality. This functionality can be used by games to do portals, and (dynamically generated or static) impostors, and octrees, and "blocks of blocks of blocks of voxels", and whatever else, and even combinations of these things in the same scene.

For a space game I might have a main node that has:
  • a "foreground HUD" texture, which is another node that's included by this node
  • a "background box" mesh (for distant stars) that uses a set of 6 more textures that are just more nodes included by this node
  • a reference to a "universe" node that is included by this node, where that "universe node" contains:
    • A 3*3*3 grid of 27 "galaxy of space" nodes; where each of them contains:
      • A 3*3*3 grid of 27 "sector of space" nodes; where each of them contains:
        • A 3*3*3 grid of 27 "sub-sector of space" nodes.
Note that I expect you'll recognise that this is almost the same as octrees (except I'm using 3*3*3 grids and not 2*2*2 grids, so it's "twenty-seven-trees" and not "eight-trees").

Any one of those "sub-sector of space" nodes may include:
  • Any number of "space object" nodes (space ship, space station)
  • Any number of "planet" nodes (probably including moons or whatever)
  • Any number of "star" nodes (just a big bright light with a colour?)
A "space object" node might have:
  • An outer surface mesh and textures
  • A set of interconnected "internal room" nodes; which are connected via the equivalent of portals
When a "space object" is visible but far away, the video driver can auto-generate a distant impostor from the outer surface mesh and textures (and cache it, and recycle cached impostors). When the "space object" is visible but not far away the video driver might use the outer surface mesh and its textures "as is" and not use an auto-generated distant impostor. When the camera is inside a "space object" the video driver uses the "internal room" nodes.
Ready4Dis wrote:
There's 4 very different things that my video drivers would eventually need to worry about; each with very different APIs/interfaces:
  • Graphics
  • GPGPU
  • Swap space (using spare display memory)
  • Virtualisation/"passthrough mode" (so you can assign the video card to an emulator and let the emulator's guest OS use it).
Sounds reasonable. I can definitely see using spare display memory for swap space would be much faster than a hard drive; who determines priority, though?
The video driver determines how much display memory to use for what based on whatever heuristics it feels like. Don't forget that "full featured" native video drivers don't magically appear overnight - I expect that I'll have many video drivers that support video mode switching, framebuffer and software rendering (and maybe page flipping and vertical sync); where a large amount of display memory can be used for swap simply because the video driver isn't finished and doesn't support GPU or GPGPU yet.
Ready4Dis wrote: If you use 3 GB out of a 4 GB graphics card, then a new app loads that needs the space in video RAM, does the video driver then swap the info to a hard drive to make space (aka, kick the swap out of video RAM to disk, or whatever else is available)?
Probably; but I can't be sure yet (I haven't properly designed the swap system or the way swap providers interact; and it's the sort of thing that's prone to race conditions).


Cheers,

Brendan

Re: Graphics API and GUI

Post by Ready4Dis »

I feel like you're missing the main point of the design.
This is a very good possibility ;)
The video driver provides "insert content of node A, B and C into node D" functionality. This functionality can be used by games to do portals, and (dynamically generated or static) impostors, and octrees, and "blocks of blocks of blocks of voxels", and whatever else, and even combinations of these things in the same scene.

For a space game I might have a main node that has:
  • a "foreground HUD" texture, which is another node that's included by this node
  • a "background box" mesh (for distant stars) that uses a set of 6 more textures that are just more nodes included by this node
  • a reference to a "universe" node that is included by this node, where that "universe node" contains:
    • A 3*3*3 grid of 27 "galaxy of space" nodes; where each of them contains:
      • A 3*3*3 grid of 27 "sector of space" nodes; where each of them contains:
        • A 3*3*3 grid of 27 "sub-sector of space" nodes.

Note that I expect you'll recognise that this is almost the same as octrees (except I'm using 3*3*3 grids and not 2*2*2 grids, so it's "twenty-seven-trees" and not "eight-trees").

Any one of those "sub-sector of space" nodes may include:
  • Any number of "space object" nodes (space ship, space station)
  • Any number of "planet" nodes (probably including moons or whatever)
  • Any number of "star" nodes (just a big bright light with a colour?)

A "space object" node might have:
  • An outer surface mesh and textures
  • A set of interconnected "internal room" nodes; which are connected via the equivalent of portals
I think I am missing one key factor, so let me try to be a bit more specific:

For an indoor game with portals, the main node holds a few things
* HUD/UI
* A list of portals
* A list of objects inside this room as well as its walls, enemies, etc

How does the video engine know which is the current room you're in and which portals are visible? How does it know that it needs to render the portal to see if it's visible and if it is then render everything behind it? Is it aware of you transitioning through said portal to another room to know the new starting point? Which part of this is in the video driver and which parts does the application need to worry about? From the sound of it, the video driver has something equivalent to portals and something similar to an octree, etc.
When a "space object" is visible but far away, the video driver can auto-generate a distant impostor from the outer surface mesh and textures (and cache it, and recycle cached impostors). When the "space object" is visible but not far away the video driver might use the outer surface mesh and its textures "as is" and not use an auto-generated distant impostor. When the camera is inside a "space object" the video driver uses the "internal room" nodes.
This part I understand: it can figure out what is distant using a bounding box or sphere and decide what quality level to render at. That isn't my source of confusion; it's more about how 'aware' the video driver actually is and how much 'game engine'-like code is going to be in it.

Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:I think I am missing one key factor, so let me try to be a bit more specific:

For an indoor game with portals, the main node holds a few things
* HUD/UI
* A list of portals
* A list of objects inside this room as well as its walls, enemies, etc

How does the video engine know which is the current room you're in and which portals are visible?
When the camera is in the lounge room, the lounge room is the main node (and it has portals to the kitchen, passage, etc). When you move the camera to the kitchen the app tells the video driver that the kitchen is now the main node (and it has portals to the lounge room, laundry, etc).

Of course an application doesn't have to do it like that. You could have a main "house" node with rooms as sub-nodes, and use bounding spheres/cubes instead of portals. In that case the app would only tell video driver where the camera is (and wouldn't need to change the main node when the camera moves from room to room).
Ready4Dis wrote: How does it know that it needs to render the portal to see if it's visible and if it is then render everything behind it? Is it aware of you transitioning through said portal to another room to know the new starting point? Which part of this is in the video driver and which parts does the application need to worry about? From the sound of it, the video driver has something equivalent to portals and something similar to an octree, etc.
Imagine the camera is inside a 6-sided cube representing an indoor room. The video driver does its vertex transforms and stuff, then does rasterisation to create fragments. Then it maps textures to those fragments. Some of the fragments are for the room's floor, and the texture for that is something that looks like carpet (and came from the file "/mygame/assets/carpet.texture"), so the video driver maps the carpet texture to those fragments. Some of the fragments are for the room's north wall, and the texture for that looks like brick (and came from the file "/mygame/assets/brick.texture"), so the video driver maps the brick texture to those fragments. Some of the fragments are for the room's east wall, and the texture for that is something that looks like a doorway to the kitchen (and came from rendering the kitchen's meshes and textures from a certain angle), so the video driver maps the kitchen texture to those fragments (but has to recursively generate that kitchen texture first).

This is "portals"; but the video driver didn't provide explicit support for portals. The video driver only provided "render 3D scene to a 2D texture" functionality (which was used to render the main node; and was also used (recursively) to render the main node's "east wall"/kitchen texture). This exact same "render 3D scene to a 2D texture" functionality was also used in the earlier car racing game example (to generate a "rear view mirror" texture, and for dynamically generated impostors).

Now think about how "render 3D scene to a 2D texture" might work. The scene is a collection of zero or more objects, and each object has its own local co-ordinate system, its own vertexes, its own polygons. You don't want to waste time on objects that are completely behind the camera (or completely outside of any of the viewing volume's clipping planes) so you give each object a bounding box or bounding sphere and test that bounding box/sphere against the viewing volume's clipping planes. Sounds fair to me; but what if an object is just another collection of sub-objects? Nothing really changes much - you check if the object is visible and if it's not you discard the object and all of its sub-objects in one go; and if the object is potentially visible you check if any sub-objects are visible (and do it recursively - if a sub-object is visible, check its "sub-sub-objects" and so on).

This is "octrees" (or "twenty-seven-trees" or whatever you want); but the video driver didn't provide explicit support for octrees. The video driver only provided "bounding sphere/box tests". Maybe you're not using octrees (or anything like it) at all, and you've just got a several spaceships and planets floating about in a huge otherwise empty area. It doesn't matter. It's the same "skip it if it can't be visible" functionality regardless of whether you decide to use it for octrees or "twenty-seven-trees" or something else.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Ready4Dis
Member
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Graphics API and GUI

Post by Ready4Dis »

so the video driver maps the kitchen texture to those fragments (but has to recursively generate that kitchen texture first).
I understand that part; my question was how does the GPU know to first check whether any of that texture is visible BEFORE deciding to generate it recursively? That's the part that makes it a portal - the fact that it checks whether any pixels of the portal are visible on screen before it decides to render what's behind it. The fact that it gets updated dynamically doesn't make it any different; what matters is that it's first used to check whether anything behind it needs to be rendered at all.
The scene is a collection of zero or more objects, and each object has its own local co-ordinate system, its own vertexes, its own polygons.
Yes, this part makes sense, except that you are testing the bounding sphere/box against the view frustum, not necessarily against other objects that could be occluding it? These are the subtle differences between a game that works and a game that works well - the little details of quickly throwing out information based on the type of scene you are handling. If a hierarchy of objects with bounding spheres/boxes was good enough, we wouldn't have octrees/quadtrees or any other crazy methods, which are proven to increase performance. In reality graphics cards are getting fast enough that wasting time figuring out what not to send takes about as long as just sending it, but that isn't the case for slightly older hardware, where it might matter. Again, what you are trying to support would change your decisions. My plan is to only support newer hardware, so in my case it would probably be just fine. You are talking about supporting a 486, but I don't think many people would expect to do much gaming on it. They might, however, have a slightly older graphics card and want to play a game without it looking worse than the same game on another platform just because it's generically made.

I guess I just need to see it actually implemented and working to believe it could compete with a purpose-built game engine. Most 'all in one' game engines are large and support many different types of scene rendering, and I don't see how you'll be able to get around that without lowering quality/speed. Maybe pushing all of this into the video driver will make manufacturers come up with some nice tricks that work best on their graphics cards for each type of object/scene/hierarchy.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Graphics API and GUI

Post by Brendan »

Hi,
Ready4Dis wrote:
so the video driver maps the kitchen texture to those fragments (but has to recursively generate that kitchen texture first).
I understand that part; my question was how does the GPU know to first check whether any of that texture is visible BEFORE deciding to generate it recursively? That's the part that makes it a portal - the fact that it checks whether any pixels of the portal are visible on screen before it decides to render what's behind it. The fact that it gets updated dynamically doesn't make it any different; what matters is that it's first used to check whether anything behind it needs to be rendered at all.
It's not "before", but during.

a) Video driver does vertex transforms. This does not need any texture data at all so it doesn't matter if the kitchen texture needs to be redrawn or not at this stage.

b) Video driver renders the scene down to fragments. This does not need any texture data at all so it doesn't matter if the kitchen texture needs to be redrawn or not at this stage.

c) Video driver does "overlapping fragment occlusion". This does not need any texture data at all so it doesn't matter if the kitchen texture needs to be redrawn or not at this stage. Note: For GPUs/shaders I think they're calling this "early Z test" (and doing it in a half-baked way with z-buffer instead of doing it properly and avoiding the need for z-buffer).

d) Video driver checks the resulting fragments to see if any of them refer to textures that need to be updated. If they do, it renders those texture/s (by recursing back to "step a" to convert a completely different scene's data into a texture). Note: Ideally you'd generate a list of textures that are referenced by fragments during "step c" to avoid looking at all the fragments again.

e) Video driver applies textures to fragments. By this stage, all textures that needed to be updated have already been updated.

This is all fairly easy on a CPU; and the first 2 steps and the last step should be easy for the GPU too. For the middle steps...

The question is; how flexible is the actual GPU? As far as I can tell (assuming "unified shader architecture"); except for the rasteriser (which does what we want anyway); nothing prevents a video driver from doing anything it likes however it likes (e.g. sort of like "GPGPU that generates graphics").
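
Just as a sketch of how the ordering of steps (a) to (e) might look on the CPU side - all of these function names are hypothetical (not a real driver API), and step (d) is where the recursion for portals, mirrors and impostors happens:
Code: Select all
/* Sketch only - hypothetical names. It just shows the ordering of steps
   (a)..(e), with recursive texture updates happening in step (d).           */

typedef struct scene    scene;
typedef struct texture  texture;
typedef struct fraglist fraglist;    /* fragments produced by rasterisation  */

/* Assumed driver internals (stubs for this sketch). */
fraglist *transform_and_rasterise(scene *s);               /* steps (a)+(b)  */
void      occlude_fragments(fraglist *f);                  /* step  (c)      */
int       stale_textures(fraglist *f, texture **out, int max);
scene    *texture_source_scene(texture *t);                /* NULL if static */
void      apply_textures(fraglist *f, texture *dest);      /* step  (e)      */

void render_to_texture(scene *s, texture *dest)
{
    fraglist *frags = transform_and_rasterise(s);  /* (a)+(b): no textures   */
    occlude_fragments(frags);                      /* (c): still no textures */

    /* (d): only textures still referenced by surviving fragments matter;
       ideally this list was already built during (c) instead of re-scanning. */
    texture *stale[64];
    int n = stale_textures(frags, stale, 64);
    for (int i = 0; i < n; i++) {
        scene *src = texture_source_scene(stale[i]);
        if (src != NULL)
            render_to_texture(src, stale[i]);     /* recurse for portals etc. */
    }

    apply_textures(frags, dest);                  /* (e): everything is fresh */
}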
Ready4Dis wrote:
The scene is a collection of zero or more objects, and each object has its own local co-ordinate system, its own vertexes, its own polygons.
Yes, this part makes sense, except that you are testing the bounding sphere/box against the view frustum, not necessarily against other objects that could be occluding it?
Yes; I'm not culling objects that are hidden behind other objects; but neither do octrees. In both cases the "hidden behind other object/s" part is done later (with z-buffer).
Ready4Dis wrote:In reality graphics cards are getting fast enough that wasting time figuring out what not to send takes about as long as just sending it, but that isn't the case for slightly older hardware, where it might matter. Again, what you are trying to support would change your decisions. My plan is to only support newer hardware, so in my case it would probably be just fine. You are talking about supporting a 486, but I don't think many people would expect to do much gaming on it. They might, however, have a slightly older graphics card and want to play a game without it looking worse than the same game on another platform just because it's generically made.
For my OS; I wouldn't run a game on a 486, and I wouldn't run a game on a high-end Haswell system either. I'd run a game on 10 computers across a LAN; and if the video card happens to be an old ATI Rage card in a Pentium III, but most of the rendering is being spread across 3 other computers I doubt I'd care.
Ready4Dis wrote:I guess I just need to see it actually implemented and working to believe it could compete with a purpose-built game engine. Most 'all in one' game engines are large and support many different types of scene rendering, and I don't see how you'll be able to get around that without lowering quality/speed. Maybe pushing all of this into the video driver will make manufacturers come up with some nice tricks that work best on their graphics cards for each type of object/scene/hierarchy.
Given that the chance of my OS getting full GPU support for anything in the next 10 years is zero; attempting to compete for graphics quality against games designed for things like Windows or systems like Playstation 4 is incredibly unrealistic (equivalent to trying to set the land speed record on a bicycle).

My priorities are making it suit a distributed system, making it easy for programmers to use, and making it as future proof as possible (so that the graphics API, and games and applications using it, still work well on whatever unimaginably freaky hardware we're using long after both ATI and NVidia cease to exist).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
User avatar
Rusky
Member
Member
Posts: 792
Joined: Wed Jan 06, 2010 7:07 pm

Re: Graphics API and GUI

Post by Rusky »

That may work for portal culling, especially since you're the one writing the video driver, but it does not work at all for occlusion culling, precomputed or otherwise, which is used far more than portals these days.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Graphics API and GUI

Post by Brendan »

Hi,
Rusky wrote:That may work for portal culling, especially since you're the one writing the video driver, but it does not work at all for occlusion culling, precomputed or otherwise, which is used far more than portals these days.
You're saying an example that was only meant to show portals doesn't do occlusion culling (even though, if a portal is occluded, no fragments would refer to the portal's texture, so everything behind the portal is culled)?

Here's another example: Let's have a sub-node for each object; then split the world into a 100*100 grid of cells and have 10000 "main nodes" (one for each cell), where each main node only refers to the sub-nodes/objects that are visible from within that cell (and doesn't refer to any objects that are occluded from that cell). When the camera moves from one cell to another we can switch to the "main node" for the new cell. We could even have arbitrarily shaped cells, with any number of "main nodes" that refer to sub-nodes that refer to sub-sub-nodes, and so on; and we could pre-compute a hierarchy this way. It'd only work for objects that don't move, but there are ways to fix that (lookup tables saying which cells are visible from which other cells, and inserting references to any objects that move and are in visible cells into the current "main node").
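
As a rough sketch of that "grid of cells with pre-computed visible lists" idea (everything here is hypothetical - the cell size is made up, and in practice the visible lists would be pre-computed offline):
Code: Select all
/* Hypothetical sketch of per-cell "potentially visible" lists.              */

#define GRID_W    100
#define GRID_H    100
#define CELL_SIZE 16.0f          /* world units per cell (made-up value)     */

typedef struct object object;    /* a static object/sub-node in the world    */

typedef struct cell {
    object **visible;            /* pre-computed: objects not occluded when  */
    int      visible_count;      /* viewed from anywhere inside this cell    */
    int     *visible_cells;      /* indices of cells visible from this cell, */
    int      visible_cell_count; /* used to add moving objects at runtime    */
} cell;

static cell world[GRID_W * GRID_H];

/* Pick the "main node" for wherever the camera currently is. */
static cell *current_cell(float cam_x, float cam_z)
{
    int cx = (int)(cam_x / CELL_SIZE);
    int cz = (int)(cam_z / CELL_SIZE);

    if (cx < 0)       cx = 0;
    if (cx >= GRID_W) cx = GRID_W - 1;
    if (cz < 0)       cz = 0;
    if (cz >= GRID_H) cz = GRID_H - 1;

    return &world[cz * GRID_W + cx];
}

/* Per frame: draw current_cell()->visible[], then insert any moving objects
   that happen to be inside one of current_cell()->visible_cells[].          */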

Here's another example: Let's have a video driver that, when creating a texture for the first time, sets an "opaque" flag for the texture if it happens to be opaque. Before rendering a 3D scene the video driver selects a few large/close objects where most of the object's textures have that opaque flag set; and renders those objects' opaque polygons (and nothing else) into an "intermediate z-buffer". Once that's done, it renders the selected objects like normal. Finally; for all other objects it uses the bounding sphere/box and that intermediate z-buffer to determine if the object is occluded by one of the selected large/close objects, and culls the object if it is. This would work without any application/game doing anything at all (and without any change to the graphics API); and could be used in addition to any of the portal/BSP/"multi-main node"/octree techniques.
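
A minimal sketch of that "occluder pre-pass" idea - the names are hypothetical, and the z-test helpers would really be done on the GPU rather than as C functions:
Code: Select all
/* Hypothetical sketch of a driver-side occluder pre-pass.                   */

#define MAX_OBJECTS 1024        /* assumes count <= MAX_OBJECTS for brevity  */

typedef struct object  object;
typedef struct zbuffer zbuffer;

/* Assumed driver internals (stubs for this sketch). */
int  object_is_mostly_opaque(const object *o);   /* textures flagged opaque  */
int  object_is_large_and_close(const object *o, const float cam[3]);
void render_opaque_polys_z_only(const object *o, zbuffer *zpre);
void render_object(const object *o);
int  bounding_volume_fully_behind(const object *o, const zbuffer *zpre);

void render_scene(object **objs, int count, const float cam[3], zbuffer *zpre)
{
    int selected[MAX_OBJECTS] = {0};

    /* Select a few large/close, mostly-opaque objects and render just their
       opaque polygons into the intermediate z-buffer.                        */
    for (int i = 0; i < count; i++) {
        if (object_is_mostly_opaque(objs[i]) &&
            object_is_large_and_close(objs[i], cam)) {
            selected[i] = 1;
            render_opaque_polys_z_only(objs[i], zpre);
        }
    }

    /* Render the selected objects like normal. */
    for (int i = 0; i < count; i++)
        if (selected[i])
            render_object(objs[i]);

    /* Everything else: cull it if its bounding sphere/box is completely
       behind what the occluders put in the intermediate z-buffer.            */
    for (int i = 0; i < count; i++)
        if (!selected[i] && !bounding_volume_fully_behind(objs[i], zpre))
            render_object(objs[i]);
}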


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
User avatar
Rusky
Member
Member
Posts: 792
Joined: Wed Jan 06, 2010 7:07 pm

Re: Graphics API and GUI

Post by Rusky »

Now you're back to skipping the part where the renderer actually decides what's occluded. You can keep adding new features to the video driver all you want, but the closer it gets to a game engine renderer the less achievable and generally useful it will be. The techniques you describe could all work well for some cases, but they will make things worse in others - and those cases are game-specific, so you're going to end up with an insanely overcomplicated panel of knobs and switches.
Ready4Dis
Member
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Graphics API and GUI

Post by Ready4Dis »

I understand, and I appreciate you taking the time to explain things and answer my questions. I hope you do get your OS going as it seems like a ton of fun to play with. Let me know when you're ready to write the video driver; hopefully I'll have at least one written by then :P. Not all the high level stuff you're talking about, but the low level portions hopefully. I think for now I'm going to stick with supporting the Vulkan API, seeing as new games are going to support it and it's the next thing in graphics right now. I may change my video driver to a more friendly version like you're talking about at some point, because it does seem much more friendly for the user at the expense of more work for the driver, but that work only needs to be done once instead of for each game.
Post Reply