
Re:writing secure software

Posted: Wed Nov 24, 2004 11:22 am
by Candy
mystran wrote: The big question then, is whether human beings are capable of performing computations other than those possible for any Turing Equivalent system. If we assume that humans are not, then it is possible to write programs for which it's impossible for humans to prove whether it halts (on some specific input) or not. And this is why security must be a process.
No.

If you can prove any N particular programs to be correct, you still cannot prove ALL programs to be correct. Yet you can prove those N programs to be correct.

Humans have the advantage of not being required to determine themselves to be correct (or ending), so they only have to deal with programs not their own. Now, that's a doable task. At least, given infinite time (and money).

Re:writing secure software

Posted: Thu Nov 25, 2004 8:55 pm
by mystran
Candy wrote: If you can prove any N programs to be correct, you can not prove ALL programs to be correct. Yet, you can prove those N programs to be correct.
Excuse me, but what are you trying to say? Sure, you can prove some programs correct, at least up to the correctness of that proof. I didn't claim otherwise.
Humans have the advantage of not being required to determine themselves to be correct (or ending), so they only have to deal with programs not their own. Now, that's a doable task. At least, given infinite time (and money).
But it's not doable in finite time. For a proof to be proven correct, we require:
"for a proof to be known to be correct, a proof of that proof known to be correct is known"
Let proof of x be p(x). Now, if x is known to be correct, that implies that there exists p(x), which is also known to be correct.
So we have a set of all (known to be correct) proofs:
P = { x | p(x) is a member of P }
Now, since p(x) is a member of P, it is itself "an x for which p(x) exists". Now let p(p[sub]n[/sub](x)) = p[sub]n+1[/sub](x), and you get: for every x in P there exists p[sub]n[/sub](x) for every natural number n, i.e. the chain of proofs is infinite for any x. Proven by induction:
1. p[sub]0[/sub](x) is member of P if x is member of P, by definition.
2. assuming p[sub]k[/sub](x) is member of P, by definition p(p[sub]k[/sub](x)) is member of P, and p(p[sub]k[/sub](x)) = p[sub]k+1[/sub](x), hence p[sub]k+1[/sub](x) is member of P.

This means that if there exists a single x that is a member of P, then there exist infinitely many members of P.

Now, if we assume that discovering a proof always takes at least e seconds (where e is the smallest possible real number greater than zero; I'd have used epsilon but couldn't conjure it from my keyboard), then for P to be non-empty, at least e*inf seconds must have passed. That is, full certainty cannot be reached unless we can start proving things in zero time, or have already spent an infinity of time constructing proofs.
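Since the board lacks a math mode, here is my restatement of the same regress in LaTeX notation (same definitions as above, nothing new):

```latex
% P is the set of proofs known to be correct; a proof is in P only if
% its own proof is in P:
P = \{\, x \mid p(x) \in P \,\}
% The chain of proofs, as defined above:
p_0(x) = p(x), \qquad p_{n+1}(x) = p\bigl(p_n(x)\bigr)
% By the induction in steps 1 and 2, x \in P implies p_n(x) \in P for
% every natural n, so membership in P needs an infinite chain, and with
% a minimum discovery time \varepsilon > 0 per proof the total time is
% at least \lim_{n \to \infty} n\varepsilon = \infty.
```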

Naturally there is a possibility that my proof is not sound. See above. In fact, there is at least one notable loophole:
if x = p(x) then for all natural n, p[sub]n[/sub](x) = x:
1. p(p(x)) = p(x) = x
2. if p[sub]k[/sub](x) = x, then p[sub]k+1[/sub](x) = p(p[sub]k[/sub](x)) = p(x) = x

That doesn't help much, though, since AFAIK the only known circular proof goes something like "there is some thinking going on", on the basis that the thought "there is some thinking going on" is being thought... or something like that. But it proves my point that a "sound-looking" proof is not enough to assume correctness.

PS. Can we get a math-mode in this board? :)

Re:writing secure software

Posted: Fri Nov 26, 2004 1:15 am
by Candy
I was hinting more at an inductive proof for the programs. For a trivial base-case program (say, an empty one) you can describe all possible inputs and outputs. For each possible input you add, you can divide all possible variations into classes of inputs and describe the classes of outputs that should come from them. If you base the proof for program x[sub]n+1[/sub] on the proof for program x[sub]n[/sub], then given a base case x[sub]1[/sub] you can inductively prove the entire program correct for all possible inputs and outputs.
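A minimal sketch (my own illustration, not from the thread) of what such an inductive proof looks like when attached to code: the proof for n+1 elements rests on the proof for n elements, exactly as with x[sub]n[/sub] and x[sub]n+1[/sub] above.

```cpp
#include <vector>

// Claim: sum(v) returns the sum of all elements of v.
// Proof by induction on n = v.size():
//   base case (x1): n == 0, the loop never runs, total == 0 -- correct.
//   step (xn -> xn+1): assume that after the first n iterations, total
//   equals the sum of the first n elements; iteration n+1 adds exactly
//   v[n], so the claim holds for n+1 elements too.
long sum(const std::vector<long>& v) {
    long total = 0;
    for (long x : v) total += x;  // the loop invariant carries the induction
    return total;
}
```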

My point remains, however: this is about three pages of calculating work per 10-line function, and given that software sizes are regularly 20,000 lines and far more, that is more work than writing a bible all by yourself.


edit:
Now, since p(x) is a member of P, it is also "an x for which p(x)" exists.
This is exactly where I think this differs. Nothing forces p(x) to imply p(p(x)). The proof that program X works for all inputs doesn't prove that I myself work. You are not even trying to prove p(p(x)), therefore p(x) can exist without an infinite amount of work.



Practical example: you make a program that takes two numbers and tries to print them, using its own hand-written strtol. It can thus receive these input classes:
- Valid input (two numbers)
- Only one number
- One number and trash
- Trash followed by a number
- Pure trash
- Nothing
- An EOF
- Valid input with a faulty separator

You can check whether there is separate handling code for all these cases, and if there is, the program works for all inputs from these classes. Since I've just stated (not formally) that these are all possible inputs, the program works for all possible inputs -> the program is proven to work securely.
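A sketch of that parser (my code, not Candy's), written so that each input class from the list above lands in a distinct rejecting branch; checking that every class hits some branch is exactly the "separate handling code" check described:

```cpp
#include <cstdlib>
#include <string>

// Accepts exactly "<number> <number>" and nothing else.
bool parse_two_numbers(const std::string& s, long& a, long& b) {
    const char* p = s.c_str();
    char* end = nullptr;
    a = std::strtol(p, &end, 10);
    if (end == p)     return false;  // nothing, EOF, pure trash, trash first
    if (*end != ' ')  return false;  // faulty or missing separator
    p = end + 1;
    b = std::strtol(p, &end, 10);
    if (end == p)     return false;  // only one number, or trash in its place
    if (*end != '\0') return false;  // trailing trash after the second number
    return true;                     // valid input: two numbers
}
```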

Same with network code. If you only test the first N bits of a packet (say, IP or TCP code), you can prove for all possible inputs (by looking at the subdivisions in the code and the practically possible inputs in that many bits) that the code handles them correctly -> you can prove that the program handles it correctly at all times -> it is securely proven to work.


edit2: I see the difference now... We are proving x to be correct for all inputs; we only have to show p(x) to be correct for this x, not for all x. p(x) is therefore not itself one of the proofs that needs proving.

Re:writing secure software

Posted: Fri Nov 26, 2004 2:11 am
by distantvoices
Why not prove by falsification? It seems to make more sense in practical computing.

I test a program. It works. I try to find out what needs to happen in order to render it unworking. Proof by falsification.

BTW, the most secure programs are those that take input and handle it in a fault-tolerant way. No crapping out, no buffer overflows - either it works or you get an error notification, but you never encounter nice things like seg faults.

Re:writing secure software

Posted: Fri Nov 26, 2004 2:30 am
by Candy
beyond infinity wrote: Why not prove by falsification? It seems to make more sense in practical computing.

I test a program. It works. I try to find out what needs to happen in order to render it unworking. Proof by falsification.
That's what happens to Windows right now. Each time somebody finds something that makes it crap out, they fix it (after a while).

Point is, you cannot prove the software to be secure if you only try to find its failure points unstructurally. You can't test all inputs, since that would take millennia even for a cluster testing an IPv4 implementation. But if you can cluster all inputs into categories and test all categories (with one or more cases each), you can prove the software correct: by proving that it correctly handles all categories of possible inputs, it correctly handles all inputs.

You need to do this per function, though; just a superficial check is usually not enough. The entire program will have substantial memory (state), that memory is usually where things go wrong, and you can't test all those cases in any fairly overviewable way. You can prove each function to be correct based on the functions it uses, and thereby all functions to be correct assuming the OS is correct. For OS devs here (yep, good forum): you can prove everything to be correct as long as the hardware behaves, and you can even specify it to work under all hardware failures except two or three things as basic as the processor, memory, and buses. That is as far as you can make it work. For the rest, try failsafes (two or three computers doing the same thing and resetting each other when one fails) or just plain use military-class computers.
BTW, the most secure programs are those that take input and handle it in a fault-tolerant way. No crapping out, no buffer overflows - either it works or you get an error notification, but you never encounter nice things like seg faults.
True... be back in 2.5 hours, got driving class right now, will answer this later :)

[2.5 hours later]
That's true... yet most fault-tolerant programs still need to be checked with all possible faulty inputs, or at least a significant number of them. Most programs you are proposing have no memory, however, and are therefore quite simple to test given only the interface. I can test a list structure (C++-ish) by trying a bunch of values, dumping them in and getting them back out. But what if I have two iterators pointing at object X, delete the content through one of them, and then increment the other? That's something you're not going to find using normal testing.
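The hazard Candy describes, sketched on std::list (my illustration): two iterators alias the same node, erasing through one leaves the other dangling, and black-box testing of the interface almost never catches the resulting undefined behaviour.

```cpp
#include <list>

int safe_erase_demo() {
    std::list<int> l = {1, 2, 3};
    std::list<int>::iterator it1 = ++l.begin();  // points at 2
    std::list<int>::iterator it2 = it1;          // alias to the same node

    // UNSAFE, do not do this:  l.erase(it1); ++it2;
    // erase destroys the node, so ++it2 would walk through freed memory.

    // Safe idiom: erase returns an iterator to the next element, so every
    // alias must be re-seated from the return value, never reused.
    it2 = l.erase(it1);  // it1 and the old it2 are dead; it2 points at 3
    return *it2;
}
```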

As for functions without memory, they are all testable by trying all possible inputs and checking the outputs. See the above posts.

Re:writing secure software

Posted: Fri Nov 26, 2004 2:47 am
by distantvoices
Of course, you're right here. Testing unstructured - out of the wild, for any values - is not an option.

I usually test an *input testing* function against *known* extreme values (attention: I know there are some pitfalls in the *known*). The functions which are doing the math - and other background work - shall not be busied with invalid values.

Taking and testing input is a matter of the *interface*, not of the *manipulation* nor of the *output*.

As for TCP/IP stack implementation: this is a point where the most fault tolerance is asked for. You check for *needed* values and checksums, and if they are not given/correct, you reject, send a NAK or so. As each protocol layer has a function which *takes* a packet and does the initial processing (testing for header correctness and checksum) before handing it over to the bigger internal functions, there needs to be error checking in each layer. Well, I'm gonna elaborate on this more when I've managed to implement a TCP/IP protocol suite myself.

Re:writing secure software

Posted: Fri Nov 26, 2004 4:47 am
by Candy
beyond infinity wrote: I usually test an *input testing* function against *known* extreme values (attention: I know there are some pitfalls in the *known*). The functions which are doing the math - and other background work - shall not be busied with invalid values.

Taking and testing input is a matter of the *interface*, not of the *manipulation* nor of the *output*.
Objects with memory are not testable through the interface alone. You can only test them given an interface plus a description of the memory state. And since most of their functions modify that state, you'd need to work out how to test every function on every state in which it is valid, plus the states in which it isn't. That's not simple (in fact, I've never seen anybody do it right).
As for TCP/IP stack implementation: this is a point where the most fault tolerance is asked for. You check for *needed* values and checksums, and if they are not given/correct, you reject, send a NAK or so. As each protocol layer has a function which *takes* a packet and does the initial processing (testing for header correctness and checksum) before handing it over to the bigger internal functions, there needs to be error checking in each layer. Well, I'm gonna elaborate on this more when I've managed to implement a TCP/IP protocol suite myself.
You don't segfault on bad values, but crapping out isn't just segfaulting or some such. AFAIK it stands for changing course from a servicing attitude to a rejecting attitude. Whether you segfault or send a NAK because of that, I don't care.

That said, you can test all these layers separately, and I fully intend to do so. Still, you have a problem testing the TCP section, since it has memory (I feel like a repeater now..). You have a lot of things that can go wrong, including timeouts and delayed packets.

Re:writing secure software

Posted: Fri Nov 26, 2004 5:36 pm
by mystran
Candy wrote:
Now, since p(x) is a member of P, it is also "an x for which p(x)" exists.
This is exactly where I think this differs. Nothing forces p(x) to imply p(p(x)). The proof that program X works for all inputs doesn't prove that I myself work. You are not even trying to prove p(p(x)), therefore p(x) can exist without an infinite amount of work.
But isn't it so that if we simply assume we can write correct proofs, then we could also assume we can write correct programs and not bother proving them? After all, if we don't make errors in proofs, why would we make errors in writing programs?