Brendan wrote:For this case; the compiler would see "x = x + 1;" and evaluate the range of the result of the expression on the right hand side (if x ranges from 0 to 999 then x+1 must have a range from 1 to 1000). Then (for assignment) the compiler checks that the left hand side is able to store that range of values, and generates a compile time error because x can only store a value from 0 to 999.
The programmer would have to fix the error. This might mean doing "x = (x + 1) % (x.max + 1)" if they want wrapping, or doing "x = min(x+1, x.max);" if they want saturation, or adding a check immediately before it, or adding a check somewhere else entirely, or increasing the range of values that x can hold, or changing it to "x2 = x + 1;", or ....
If the programmer does happen to do something like "if(x >= x.max) { return FAILED; }" then this is no different to any branch in any language. It's not a run-time check inserted by the compiler itself.
OK, this gives me some idea of your intentions. From this, I gather your intent is that the compiler flag an error (or at least a warning) on invalid possibles. This seems reasonable to me, and I had much the same thing in mind (though only for builds with a version status of 'release-candidate' and above). It also tells me that you are differentiating from the compiler inserting runtime checks automatically, and the compiler requiring the programmer to include manually inserted runtime checks.
This last part is of interest to me, as I am not sure I see them as being as clear-cut as you seem to be asserting. For example, it would seem (to me) that there is no reason this
couldn't be automated, at least partially. For example, let us consider the possibility of allowing the client-programmer to define range types (either as named types or as one-off subtypes of the integer type) in which a specific behavior is defaulted:
Code: Select all
(def Foo-Range
(constrainted-type Integer
(cond (((< _ -250) (set! _ -250) _)
((>= _ 750) (raise-overflow-exception! _))
(else _)))))
(This is just off the top of my head, but sort of what I have in mind; I would probably have a specific ranged-type c'tor that would cover this common case:
Code: Select all
(def Foo-Range
(ranged-type Integer
:underflow #saturate
:overflow (raise-overflow-exception! _)))
or something like it.) Similar things could be done with ranged types in (for example) Ada, though IIRC in that case the default (and only) behavior is to raise a standard exception. Furthermore, using an explicit range could allow the compiler (or library macros, in my case) to automatically optimize the variable's memory footprint (by using a 16-bit value instead of 64-bit one, for example). Of course, range checking is just one example, but the point is, there are ways in which this can be automated which would still give the client-programmer fine control over the handling of edge cases.
Also, the premise of pre-analyzing all possible paths has the potential of running into a combinatorial explosion. At some point, the compiler would have to simply give up and reject the code, agreed? That may make certain kinds of code impossible to write with such restrictions in place. This is not a very likely scenario, so I don't know if it is worth giving too much weight to it, but it would have to addressed at least in the documentation and error reporting.
There's another case which would be problematic (well, in most languages, anyway - there are languages where it is possible to programmatically generate types at runtime, but the overhead is quite steep), which is where the range itself is set at runtime. There really would be no clear-cut way to check that ahead of time in a statically-typed language, so the compiler would in your scenario require every access to the value to be explicitly tested. Since this is precisely the scenario we are looking to avoid, it is hard to see how not testing automatically would be of benefit. Again, an unlikely scenario, but something to give consideration to.
I gather, you would consider explicit checks preferable in any of these scenarios, is this correct?
Schol-R-LEA wrote:Note that I have not given the context of this use case - neither whether it is part of a loop condition or simply some arbitrary part of the program logic, nor whether the increment or the range are explicitly defined in the program or not. Context may be relevant to your solution, I will grant, but I want to first consider the general case before moving on to specific cases.
Brendan wrote:The context only really effects what the compiler thinks the previous range of values in x could be.
I would say it can affect a good deal more than that. For example, if the value is the index of a definite loop, and the compiler detects that the body of the code had no effect, then the loop itself (including the increment) can be optimized away.
OK, that's not really a fair example, but consider (as an example) a counting loop where the index is not explicitly used for any other purpose; depending on the processor, it may be possible to reverse the increment to a decrement in order to use a simpler exit condition, or replace the loop with with non-conditional repetition (e.g., REPZ MOV RSI, RDI). Whether it would be possible to find such potential optimizations (and know when they would make a difference) might not be feasible, but that's not the point: the point is that context can affect how you compile a particular part of a program.
Brendan wrote:Basically it comes down to a design choice. A compiler may:
- guarantee there are no false positives (e.g. overflows) at compile time; which makes it impossible to avoid false negatives (e.g. "nuisance" errors) at compile time, or
- guarantee there are no false negatives (e.g. "nuisance" errors) at compile time; which makes it impossible to avoid false positives (e.g. overflows) at compile time
The first option is what I'm planning. It's harder to write the compiler and makes things a little more annoying for programmers when they write code.
Excellent, this gives us all a lot better understanding of your intentions and motivations, I think.