Os and implementing good gpu drivers(nvidia)

Ready4Dis
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Os and implementing good gpu drivers(nvidia)

Post by Ready4Dis »

Sounds like you have a plan. Using an existing framework will make it much simpler. My intention, however, is to drop all the dead weight by starting with a lightweight (close to the metal) API. I want to keep my drivers somewhat minimal by dropping legacy support for things like OpenGL.
My intentions are similar regarding apps submitting a framebuffer to the GUI, although I'd rather call it registering their framebuffer with the GUI. I will eventually worry about making some sort of Qt or Gtk compatibility layer, but that's pretty far in the distance.
Non-overlapping windows can make things simpler; you just have to make sure all the apps are smart enough to either resize or change layout when put into a tile that's smaller than full screen, which can make things more difficult for the apps.
AlexHully
Member
Posts: 62
Joined: Mon Jul 20, 2015 3:32 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by AlexHully »

Yes, reusing a framework is helpful at best.
I need my solution in this life. In a few days, actually (huh huh).

So... I spent some hours on this.
I decided to dump FreeBSD because its graphics stack is not on par with Linux's.

So, it is going to be a massively modified Linux kernel (DRM, modules...) with a very lightweight Weston/Wayland.
Most of the code is there already.

It is a single address space system, by the way. No more context switches.
The apps running on that platform have to be submitted to the internal compiler. That makes it possible to ensure that no app calls 'hlt' or anything else powerful (catastrophe).
So... yeah, definitely not for the general public.

But no Nvidia at first.

Since I am doing it for known and restricted hardware, I can massively cut corners.
We will be able to maintain that Linux version and pull in anything we find missing from the Linux source.

As for the desktop: yes, actually all the apps are managed by the framework, so the user cannot mess anything up :)

Well, having the GUI framework lets us develop an ecosystem around it and then develop faster (with an internal language).

So, big plan. Little time. :) A functioning kernel that you can heavily customize is the only way to meet the deadline.

This is going to happen.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Os and implementing good gpu drivers(nvidia)

Post by Brendan »

Hi,
AlexHully wrote:This is going to happen.
Are you sure?

Proprietary Nvidia and ATI drivers (the drivers that actually work properly and support 3D acceleration; unlike most open source drivers, especially for recent/new video cards) are provided as a binary blob that runs as a normal process (and communicates with a "shim" in the kernel); and therefore won't work with your "single address space internal compiler". The other half of the graphics stack is done via shared libraries (containing native code) that are dynamically linked; and therefore probably won't work with your "single address space internal compiler" either.

Mostly; you're going to have to write a compiler and tools and executable loader (and dynamic linker); then rewrite half of the kernel's virtual memory management and half of its scheduler and half of all the drivers (so they cope with changes to the kernel's virtual memory management); then find a GPU that has an open source driver that actually works reliably (which will be obsolete junk by the time you get anything actually usable). Then; if you manage all this; you're going to be spending 200% of your time trying to keep up with changes that Linux developers make throughout the kernel (and porting those changes to your kernel). ;)

Also note that (assuming 64-bit 80x86) 128 TiB of virtual address space sounds like a lot; until you realise you've got 200 processes using 512 GiB of virtual space each (for heap, thread local storage, shared libraries, memory mapped files, ASLR, etc) and someone starts a database engine that wants to memory map a huge 50 TiB file or disk partition.
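
To put rough numbers on that (these are just the hypothetical figures from the paragraph above, nothing measured):

Code: Select all

    #include <stdio.h>

    int main(void)
    {
        unsigned long long per_process = 512ULL << 30;   /* 512 GiB of virtual space each   */
        unsigned long long processes   = 200;
        unsigned long long mapped_file = 50ULL << 40;     /* one 50 TiB memory mapped file   */
        unsigned long long user_space  = 1ULL << 47;      /* 128 TiB: the user half of 48-bit */

        unsigned long long total = processes * per_process + mapped_file;
        printf("needed: %llu TiB, available: %llu TiB\n",
               total >> 40, user_space >> 40);             /* 150 TiB vs. 128 TiB */
        return 0;
    }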


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
AlexHully
Member
Posts: 62
Joined: Mon Jul 20, 2015 3:32 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by AlexHully »

The answer is yes in the context of a general purpose system.

Which is not my case. No updates.

Intel uses Wayland. Broadcom does as well. Both are open source. I don't need anything else.

Single address space: yes! It seems you don't like it. But where is the technical problem when you control the whole environment? None.

So, yes, it will happen :D
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Os and implementing good gpu drivers(nvidia)

Post by Brendan »

Hi,
AlexHully wrote:Single address space: yes! It seems you don't like it. But where is the technical problem when you control the whole environment? None.
Normally (for a general purpose OS) you don't control the whole environment; you only control the OS and don't control third party applications or the user. Your OS may be "different", but you haven't said how or why it's different so I can only assume that "different" just makes it even more important to avoid the unnecessary and pointless pain you're about to inflict on yourself.

Myth #1: Doing radical/extensive modifications to the way a complex project works (e.g. adding support for "single address space" to Linux) is faster than writing something new (that doesn't suck because it was actually designed for your purpose from the start).

Myth #2: Single address space helps performance now. It does for irrelevant/biased micro-benchmarks ("ping pong") which is where this myth began; but for real software the previous process' TLB entries get evicted to make space for the current process' TLB entries and there's no significant difference in TLB miss overheads. What actually happens is that (on modern multi-CPU systems) the overhead of "multi-TLB shootdown" increases dramatically (because you can't assume that one process' TLB entries aren't in another CPU's TLB even when you know that other CPU isn't running the process) and it ends up hurting performance for real software.
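
To make the shootdown point concrete, here's a loose sketch; every name in it is made up for illustration (it's not from any real kernel):

Code: Select all

    #define MAX_CPUS 64

    struct mapping { void *space; unsigned long virt_addr; };

    extern void *per_cpu_current_space[MAX_CPUS];
    extern void send_tlb_invalidate_ipi(int cpu, unsigned long virt_addr);

    /* Conventional multi-address-space kernel: when unmapping a page, only CPUs
       currently running the affected address space need to be interrupted. */
    void shootdown_multi_space(struct mapping *m)
    {
        for (int cpu = 0; cpu < MAX_CPUS; cpu++)
            if (per_cpu_current_space[cpu] == m->space)
                send_tlb_invalidate_ipi(cpu, m->virt_addr);
    }

    /* Single address space: any CPU may have cached any mapping in its TLB, so
       every unmap (or permission change) has to interrupt every CPU. */
    void shootdown_single_space(struct mapping *m)
    {
        for (int cpu = 0; cpu < MAX_CPUS; cpu++)
            send_tlb_invalidate_ipi(cpu, m->virt_addr);
    }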

Myth #3: Single address space will help performance in future. The fact is that Intel added "address space IDs" to TLB entries about 5 years ago (Intel calls it "Process Context Identifiers"); and this feature completely destroys any benefit of single address space. It takes about 10 years to get an OS to the state where it's actually useful (especially if it's different); which means that by the time your OS is usable every (80x86) CPU that anyone still cares about will support "address space IDs".
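
For what it's worth, here's a minimal sketch of how an OS might use those address space IDs on x86-64 (assuming CR4.PCIDE has already been enabled; the helper name is made up):

Code: Select all

    #include <stdint.h>

    #define CR3_NOFLUSH (1ULL << 63)   /* keep TLB entries tagged with the new PCID */

    /* Switch page tables without flushing the TLB entries that belong to this
       PCID; the low 12 bits of CR3 select the Process Context Identifier. */
    static inline void switch_address_space(uint64_t pml4_phys, uint16_t pcid)
    {
        uint64_t cr3 = pml4_phys | (pcid & 0xFFFULL) | CR3_NOFLUSH;
        __asm__ volatile("mov %0, %%cr3" :: "r"(cr3) : "memory");
    }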

Myth #4: A 47-bit virtual address space width is enough for all processes anyone could ever want to run at the same time. 4 GiB per process was too small a decade ago (which is why both Intel and AMD went to "64-bit"/48-bit in the first place) and sub-dividing 128 TiB gives you a horrible compromise between "max. space per process" and "max. number of processes". 128 TiB for one process alone is already becoming a limiting factor for some people (massive database servers); and I've been predicting/expecting Intel to extend/replace long mode paging with something better (and larger) for several years, because 128 TiB is too small for one process in some cases (but also because they could reduce TLB miss costs by using larger page tables/directories and reducing the number of levels of indirection involved).

Myth #5: Managed languages are good (note: I've extrapolated from "internal compiler" and "single address space" here). The fact is that to get acceptable performance the compiler and/or JIT and/or language run-time has to be extremely complicated (and therefore has extremely high risk of bugs that compromise "otherwise 100% correct" programs), and that causes a massive increase in attack surface. Basically you're forced to compromise between performance/complexity and security/simplicity; and it doesn't matter how you make that compromise it's always inferior (e.g. either less secure and slightly slower than "unmanaged", or slightly more secure and a lot slower than "unmanaged"; and never faster and more secure at the same time).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Rusky
Member
Posts: 792
Joined: Wed Jan 06, 2010 7:07 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by Rusky »

Language/compiler-based protection does not require an "extremely complicated" JIT compiler or runtime, for "acceptable performance" or any other reason. A well-designed, memory-safe language or bytecode compiled by a typical run-of-the-mill compiler on installation could be both faster and more secure. But maybe you don't call it "managed" anymore at that point.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Os and implementing good gpu drivers(nvidia)

Post by Brendan »

Hi,
Rusky wrote:Language/compiler-based protection does not require an "extremely complicated" JIT compiler or runtime, for "acceptable performance" or any other reason. A well-designed, memory-safe language or bytecode compiled by a typical run-of-the-mill compiler on installation could be both faster and more secure. But maybe you don't call it "managed" anymore at that point.
Haven't we been through this before? A "managed environment" is an environment where software is relied on to provide protection/isolation at run-time. A "managed language" is something designed for a managed environment (but that's entirely irrelevant because nothing prevents code written in a managed language from being executed in an unmanaged environment, and nothing prevents code written in an unmanaged language from being executed in a managed environment).

If you can (e.g.) inject malicious code into the native executable that was created during installation (e.g. using another OS to tamper with the disk drive) and pwn the entire OS without software trying to stop you at run-time; then it's not a managed environment.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
AlexHully
Member
Posts: 62
Joined: Mon Jul 20, 2015 3:32 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by AlexHully »

The language is managed at compile time, not at runtime. So we get the best of both worlds.

A SASOS is also able to reduce management cost.
You say that a system can consume electricity and that it is not a concern. It is, no matter how fast the work is done.

And context switches are among those things (stack switches, permission management) that cost CPU cycles and add up, no matter how fast they are.

I propose to post a benchmark comparing a SASOS and the normal way in real conditions. If proven wrong, I will have wasted a few days. Otherwise... Actually, the only thing I can do is time things for an hour under the exact same conditions and use a performance monitor to make an overall (and possibly flaky) decision.

I talked about Broadcom because of their SoCs. All those efforts that may not seem justified become paramount when embedded.

But still. It will not take ten years to create. Single address space is just a few days of work (when you specialize your context).

Back to the graphics stack: we are not that far from the Wayland/Weston implementation. Actually quite close. Just avoiding multiple ifs and switch cases.

Again, it might not be the best solution, but it can be made to work.
AlexHully
Member
Posts: 62
Joined: Mon Jul 20, 2015 3:32 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by AlexHully »

What is the difference between a tagged TLB after a context switch in a "normal" system and in a SASOS?

When there is a miss, you end up doing the exact same thing, possibly at the exact same rate. Same thing for a hit.
But you don't change stacks and don't make permission checks.

Brendan, for a general purpose computer, I would not be doing this.
embryo2
Member
Posts: 397
Joined: Wed Jun 03, 2015 5:03 am

Re: Os and implementing good gpu drivers(nvidia)

Post by embryo2 »

Brendan wrote:If you can (e.g.) inject malicious code into the native executable that was created during installation...
And what is the difference between a managed and an unmanaged OS in the case of malicious code injected into some sensitive area? Are you trying to tell us that nothing is absolutely secure? Yes, nothing is. But it's senseless to discuss issues like "an A-bomb can melt any armor"; it's better to discuss why a particular thing is better. Are "unmanaged" OSes protected from code injection? No. So why are you using such an argument against a "managed" OS? I miss your point here.
My previous account (embryo) was accidentally deleted, so I had no choice but to use something new. But maybe it was a good lesson about software reliability :)
Ready4Dis
Member
Posts: 571
Joined: Sat Nov 18, 2006 9:11 am

Re: Os and implementing good gpu drivers(nvidia)

Post by Ready4Dis »

embryo2 wrote:
Brendan wrote:If you can (e.g.) inject malicious code into the native executable that was created during installation...
And what is the difference between a managed and an unmanaged OS in the case of malicious code injected into some sensitive area? Are you trying to tell us that nothing is absolutely secure? Yes, nothing is. But it's senseless to discuss issues like "an A-bomb can melt any armor"; it's better to discuss why a particular thing is better. Are "unmanaged" OSes protected from code injection? No. So why are you using such an argument against a "managed" OS? I miss your point here.
The point was that using a managed compiler doesn't really offer security to the operating system if every single process has access to every other process's (and the kernel's) memory. It just means someone can write a program in an unmanaged language and compromise your kernel, whereas if you separate things out into their own address spaces, they are more secure. If you byte-code interpret everything in its own sandbox, it can be even more secure... you can make an OS so secure that nobody could ever use it :). The point was, if there were a simple way to make it very secure without sacrificing performance, everyone would be doing it by now.
Rusky
Member
Posts: 792
Joined: Wed Jan 06, 2010 7:07 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by Rusky »

Brendan wrote:Haven't we been through this before? A "managed environment" is an environment where software is relied on to provide protection/isolation at run-time. A "managed language" is something designed for a managed environment (but that's entirely irrelevant because nothing prevents code written in a managed language from being executed in an unmanaged environment, and nothing prevents code written in an unmanaged language from being executed in an managed environment).
Then I guess we're not talking about a managed environment or language. Call it something else if it makes you happy. The point is that you could enforce a particular verifiable program representation that is at least as fast (faster, once you start doing optimizations like inlining between applications and the kernel) and just as secure as a hardware-protected system running unverified code.
Brendan wrote:If you can (e.g.) inject malicious code into the native executable that was created during installation (e.g. using another OS to tamper with the disk drive) and pwn the entire OS without software trying to stop you at run-time; then it's not a managed environment.
This is not a relevant concern: you can do exactly the same thing to the kernel or even firmware if you have physical access to the hardware, regardless of whether the kernel uses software or hardware or both for security.
AlexHully
Member
Posts: 62
Joined: Mon Jul 20, 2015 3:32 pm

Re: Os and implementing good gpu drivers(nvidia)

Post by AlexHully »

Ready4Dis wrote: It just means someone can write a program in an unmanaged language and compromise your kernel
And you are right.

This is managed by having... just one possible language on that platform :D which has to pass the compiler's checks. It is not gcc or the other "dumb" compilers out there.
We may release it too, but its use is so contrived...

Every situation is different and depends on the level of control you plan to have.
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Os and implementing good gpu drivers(nvidia)

Post by gerryg400 »

AlexHully wrote:But still. It will not take ten years to create. Single address space is just a few days of work (when you specialize your context).
So it will be finished this week? I look forward to seeing it.
If a trainstation is where trains stop, what is a workstation ?
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Os and implementing good gpu drivers(nvidia)

Post by Brendan »

Hi,
AlexHully wrote:The language is managed at compile time, not at runtime. So we get the best of both worlds.
That is not managed at all.

Let's look at compilers and give them a rating from 1 to 100 representing how good they are at detecting bugs at compile time; where 1 represents "only able to detect syntax errors" and 100 represents "able to detect all possible bugs". Assemblers would get an extremely low rating (maybe 1). C compilers are better because there's some type checking and other features (maybe a rating of 15). A compiler for a language like Rust might get a much higher rating (maybe 50).

What you want is a compiler that gets a rating of 100 on this scale. Except for that; it's no different to Rust or C or assembly.

Ironically, I also want a compiler that's able to detect as many bugs as possible during compile time because it's far better to tell the developer there's a bug and refuse to compile before software is released than it is to blow up in the end user's face at run-time after software is released. This is also part of the reason why I think managed environments (e.g. Java, which can blow up in the end user's face when the JVM detects a variety of problems) are a worthless joke - they don't detect problems until after it's too late.

The difference between you and me is that I've spent time researching what is/isn't possible and I know that getting a rating of 100 on that scale is impossible, and you probably haven't even started thinking about how you're going to achieve it. With this in mind; prepare yourself for a crash course in "anti-crash" technology! :)

Imagine there's a line of code like this: "x = myArray[i];". You have to guarantee that the range of possible values of i is within the bounds of the array (e.g. if it's an array of 100 things, then you have to guarantee that i can only hold a value from 0 to 99). This mostly means the compiler must track the maximum possible ranges of all variables.

Now imagine "i = abs(1000/y); x = myArray;". Division is annoying. If its an array of 100 things; then this code is safe if i ranges from 0 to 99; and if we know that y ranges from -99999999 to +99999999 that does not help to determine if i ranges from 0 to 99. You need to know that the possible values of y exclude the range from -10 to +10 (because values in that range cause i to be greater than 99). Because of division; for a compiler to track the maximum possible ranges of all variables, a compiler must actually track the ranges of possible negative values and the ranges of possible positive values separately.

Now imagine this:

Code: Select all

    if( (myFloat > 100000.0) && (myFloat < 100000000.0) ) {
        m = myFloat + 3.0;
        n = m - myFloat;
        o = n / 3.0;
        p = o - 1;
        i = p * 123.0;
        x = myArray[i];
    }
Because of the "if" you know that:
  • m will range from approximately 100003.0 to approximately 100000003.0
  • n will range from approximately 3.0 to approximately 3.0
  • o will range from approximately 1.0 to approximately 1.0
  • p will range from approximately 0.0 to approximately 0.0
  • i will range from approximately 0.0 to approximately 0.0
However, single precision floating point doesn't have infinite precision and (for large values) "myFloat + 3.0" may equal myFloat; which actually means that n may be zero, o may be zero, p may be -1 and i may be -123. For some values of myFloat, n may be 0.5, o may be approximately 0.166666, p may be approximately -0.833333 and i may be -102.5.

Basically; "approximately 0.0" isn't good enough. For a compiler to realise this code is not safe it must keep track of the precision of all variables (in addition to the range of possible positive values and the range of possible negative values).

Now consider this:

Code: Select all

    m = myDouble + 1.0;
    n = m - myDouble;
    o = n - 1;
    if( (myDouble > 100000.0) && (myDouble < 100000000.0) ) {
        i = o * 123.0;
        x = myArray[i];
    }
This code is safe; because double precision floating point does have enough precision for the compiler to know that when myDouble is in the range from 100000.0 to 100000000.0 the value in o will be zero. However; if myDouble could be anything from -infinity to +infinity at the start of the program, before the "if" you have to assume that the possible values in o include "NaN". How does the compiler know that o will be zero after the "if"?

There are 2 answers; but the first answer (keep track of the relationships between all variables) is extremely impractical. The second answer is that the compiler has to transform the code into something like this:

Code: Select all

    if( (myDouble > 100000.0) && (myDouble < 100000000.0) ) {
        m = myDouble + 1.0;
        n = m - myDouble;
        o = n - 1;
        i = o * 123.0;
        x = myArray[i];
    } else {
        m = myDouble + 1.0;
        n = m - myDouble;
        o = n - 1;
    }
This makes it easy for the compiler to see that the code is safe without having to track the relationships between variables. However, this trick does not work for more complicated cases (e.g. when o is calculated by a completely different function and stored as a global variable 3 days before this code is called).

Now consider something like this:

Code: Select all

    scanf ("%u",&user_input);
    c = abs(user_input) & 0x1F + 1;
    b = 1.1 * c;
    do {
        b = b * 1234.5678;
        a = b % 1.0;
    } while (a != 0);
    i = b/99999.0;
    x = myArray[i];
Can you guarantee that i will be in the range 0 to 99? You know that c will be from 1 to 32, which isn't very many values. In theory you can execute the loop in an interpreter or something to determine what happens for each value of c; and monitor the value of b to determine if it repeats a previously seen value of b (to detect if the code loops forever or not). In that way you could determine if it's safe, and if it is safe you can convert the code into this:

Code: Select all

    scanf ("%u",&user_input);
    c = abs(user_input) & 0x1F;
    if(lookup_table[c].haltsFlag) {
        for(;;) { }
    }
    b = lookup_table[c].b;
    a = fmod(b, 1.0);
    c = c + 1;
    i = b/99999.0;
    x = myArray[i];
The first problem here is storing all previously seen values of b. For example, (assuming b is 64-bit double precision floating point) you might need 1 bit per possible value, or (once you realise b can never be negative and you don't need the sign bit) a total of about 1048576 TiB of virtual address space, and to achieve that (on 64-bit 80x86) you need a minimum of 8192 processes each with 128 TiB of space. Note that this is only space and not RAM (you'd hope that nearly all pages aren't modified) so it's technically possible on sane OSs. Alternatively you could have a list of "previously seen values of b" and search it, but that's not likely to be fast and (for worst case) would consume 64 times as much virtual address space. Faster alternatives (e.g. binary trees) consume even more RAM for worst case.

The second problem is how long it's going to take. Most people don't want to wait for 3 centuries while their code compiles; so you have to set some sort of time limit and give up if that time limit is reached.
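
For what the "execute it in an interpreter" idea might look like, here's a loose sketch; instead of recording every previously seen value of b (the storage problem above), it just uses an arbitrary step budget as the time limit and gives up when it's exhausted:

Code: Select all

    #include <math.h>
    #include <stdio.h>

    #define MAX_STEPS 1000000ULL   /* arbitrary budget standing in for the time limit */

    int main(void)
    {
        /* Offline analysis: simulate the loop for each of the 32 possible values
           of c. If it doesn't finish within the budget, give up; a real checker
           would then refuse to compile (a false negative if the loop actually
           would have terminated eventually). */
        for (int c = 1; c <= 32; c++) {
            double b = 1.1 * c;
            double a;
            unsigned long long steps = 0;
            do {
                b = b * 1234.5678;
                a = fmod(b, 1.0);
            } while (a != 0.0 && ++steps < MAX_STEPS);

            if (a != 0.0)
                printf("c=%d: unknown (gave up after %llu steps)\n", c, steps);
            else
                printf("c=%d: terminates, final b=%g\n", c, b);
        }
        return 0;
    }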

That brings me to unsolvable problems. What does the compiler do if it can't possibly determine if a piece of code is safe or not?

The only thing it can do is "false negatives". If it can't determine if a piece of code is safe or not it has to assume the code is not safe. This means that some 100% safe code will fail to compile. For an extremely simple compiler you get a lot of false negatives and a lot of 100% safe code fails to compile; and for an extremely complex compiler more 100% safe code compiles. Note that I do not consider this a deal breaker - it just means that programmers get annoyed at you because they have to modify 100% safe code just so the compiler is able to realise it is safe. What it does mean is that a very complex compiler is required (just for the safety checking alone, ignoring optimisation, etc) to ensure programmers are able to tolerate it. Once you combine the complexity involved in safety checking with the complexity involved in modern optimisation techniques; we're no longer in the realm of ordinary "very complex" - we're talking about pushing beyond anything that all of the compiler developers who came before us have achieved into a whole new category of "complex".
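
As a contrived illustration of such a false negative (hypothetical code, echoing the "global variable set three days earlier" case above):

Code: Select all

    int myArray[100];
    unsigned global_index;                    /* written once at startup, read later */
    int x;

    extern unsigned compute_something(void);  /* hypothetical */

    void setup(void)
    {
        global_index = compute_something() % 100;   /* really is always 0..99 */
    }

    void much_later(void)
    {
        /* Safe in practice; but a checker that only tracks per-variable ranges,
           and not "global_index was last written by setup()", can't prove the
           index is 0..99 here, so it rejects the code unless the programmer adds
           a redundant clamp or check. */
        x = myArray[global_index];
    }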

Now; what do we know about "very complex"? We know that there's a strong relationship between code complexity and bugs. This is why we want a sophisticated compiler that's able to detect bugs in the first place. You must assume that a compiler that's complicated enough to be tolerated by programmers and also complicated enough to do acceptable optimisation is also full of bugs; and therefore you must assume that the compiler is unable to guarantee that the code it compiled is safe.

Let me repeat that last part for you:
  • You must assume that the compiler is unable to guarantee that the code it compiled is safe.

Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.