Hi,
I think I will do what Schol-R-LEA suggested, since it seems the most sensible to me at the moment. The type is indeed a property of the value.
However, semantic analysis is something I will need to do, since the compiler has to ensure, for example, that you are not dividing by zero, or that the value of the expression on the right side of the "=" fits in the variable on the left side of the "=". I know this can and will be complex when I eventually write the compiler, but the benefits are overwhelming.
MichaelFarthing wrote: Ideally, of course, the compiler will eventually be written in the language itself, which is a superb test. However, what do you intend to use to get up the compiler initially?
Since I plan to use the language to write the OS eventually, I will need to make a cross-compiler that runs on Linux. When the OS is somewhat mature, I will write the native compiler in this language.
Since the initial question has been answered, I think I could speak a bit about the language as a whole. To start, I am going to call it G, since there is no systems programming language with that name, as far as I can tell. There are, however, other languages called G, but they are mostly domain-specific and not well known.
My general intention is to make programmer errors harder to make. I am aware this may annoy programmers while they get used to it, but I am also aware it will reduce debugging time, since errors will be rarer. Consider a divide-by-zero error. If the compiler can't prove the divisor is non-zero, it will report an error at compile time. Now consider an out-of-bounds error, where a variable is used as an index into a 12-element array. If the variable is 12 or greater, or is negative, the access is an error which, unlike the divide-by-zero error, may not even be evident at runtime. The compiler should have to prove the variable is in range before it compiles the code.
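To make the two checks concrete, here is a rough sketch in Rust, used only as a stand-in since G does not exist yet. It is not how G would do it: Rust pushes the proof onto the type of the divisor and onto a checked lookup, while G is meant to have the compiler prove the range itself. The names divide and element are made up for this example.

```rust
use std::num::NonZeroU32;

/// Division where the divisor's type cannot hold zero,
/// so a divide-by-zero is impossible inside this function.
fn divide(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

/// Lookup in a 12-element array. An out-of-range index yields `None`
/// instead of silently reading past the end of the array.
fn element(table: &[u8; 12], index: usize) -> Option<u8> {
    table.get(index).copied()
}
```

The caller has to build the divisor with NonZeroU32::new(n), which returns None for zero, so the zero case must be handled before divide can even be called.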
There should also be as much well-defined behaviour as possible. Out in the wild there are many programmers relying on undefined behaviour, and this can break their programs on different compilers, or even on different versions of the same compiler. Even experienced programmers spend a lot of time writing code carefully so as not to invoke undefined behaviour. A common source of undefined behaviour is uninitialised variables, and this is something I would rather forbid at the root (except for accessing values through pointers, where the compiler can't do anything at compile time). Another option would be to implicitly initialise to zero.
I am thinking of having allowed ranges for variables. A variable representing a weekday would have the range [0, 6] or [1, 7], whichever you prefer. Trying to assign the value 8 to it would result in an error, since 8 is out of range.
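A ranged weekday could behave roughly like the following Rust sketch. Weekday and its methods are hypothetical names, and the check here happens at runtime in the constructor, whereas in G the idea is for the compiler to reject an out-of-range assignment at compile time where it can.

```rust
/// A value guaranteed to lie in the inclusive range [1, 7].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Weekday(u8);

impl Weekday {
    const MIN: u8 = 1;
    const MAX: u8 = 7;

    /// Returns `None` when `value` is outside [MIN, MAX].
    fn new(value: u8) -> Option<Weekday> {
        if (Self::MIN..=Self::MAX).contains(&value) {
            Some(Weekday(value))
        } else {
            None
        }
    }

    /// Returns the underlying number, which is always in range.
    fn get(self) -> u8 {
        self.0
    }
}
```

Since the inner field is private outside its module, the only way to get a Weekday is through new, so every Weekday that exists is known to be in range.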
It should somehow be possible to have bounded arrays starting at some hardcoded address. An example of this is the VGA text buffer, which always starts at 0x000B8000 and is of bounded size. Perhaps the syntax could be to omit the curly brackets of the array initialiser and give an address in place of the list of elements.
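For comparison, this is roughly how a bounded array at a fixed address can be expressed in Rust today. It is only a sketch: FixedBuffer, VGA_TEXT_CELLS and vga_text_buffer are names invented here, and the 80 x 25 cell count is my assumption about the usual text mode, not something fixed by the address itself.

```rust
/// A window of N 16-bit cells starting at a fixed base address.
struct FixedBuffer<const N: usize> {
    base: *mut u16,
}

impl<const N: usize> FixedBuffer<N> {
    /// Safety: `base` must point to N valid, writable u16 cells.
    unsafe fn new(base: *mut u16) -> Self {
        FixedBuffer { base }
    }

    /// Writes `cell` at `index`; returns false if `index` is out of range.
    fn write(&mut self, index: usize, cell: u16) -> bool {
        if index >= N {
            return false;
        }
        // Volatile, so the write to device memory is not optimised away.
        unsafe { self.base.add(index).write_volatile(cell) };
        true
    }
}

const VGA_TEXT_BASE: usize = 0x000B_8000;
const VGA_TEXT_CELLS: usize = 80 * 25; // assumed 80 x 25 text mode

/// Only meaningful on bare-metal x86 where this region is mapped.
unsafe fn vga_text_buffer() -> FixedBuffer<VGA_TEXT_CELLS> {
    unsafe { FixedBuffer::new(VGA_TEXT_BASE as *mut u16) }
}
```

The length is part of the type, so the bound travels with the buffer. What G would add on top is the compiler proving the index is in range instead of checking it at runtime.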
Having multiple names that eventually represent the same type is something I also want to get rid of. In the G language there will be only two cases where something like this is needed. The first is usize, an unsigned type with the size of the native machine word, which will be the same as u16, u32 or u64 depending on the target architecture. The second is isize, the signed counterpart, which will be the same as i16, i32 or i64 depending on the target architecture.
Booleans should not be built on top of integers like in C. Consider the "a + (b == c)" expression. Is there any real use for it?
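The one use I can think of is counting matches, and that still works when the conversion has to be written out. A small Rust sketch (count_matches is a made-up name; Rust is only a stand-in for G here):

```rust
/// Counts how many entries equal `needle`, converting each bool explicitly.
fn count_matches(values: &[u32], needle: u32) -> u32 {
    let mut total = 0;
    for &v in values {
        // `total + (v == needle)` is rejected by the compiler,
        // because bool is not an integer type.
        total += u32::from(v == needle);
    }
    total
}
```

So nothing is lost by keeping booleans separate; the rare case that wants a number just has to say so.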
I plan on having no standard language API as we know it. In C, there are many interfaces that may influence many aspects of OS design. I aim instead for the G language to be mostly independent from the OS-specific standard interfaces.
And, to top it off, I plan to have functions that can easily return multiple values. Some people may argue this is syntactic sugar, but I disagree.
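As an illustration of what I mean by multiple return values, here is how it looks in Rust with a tuple. The div_rem function is hypothetical, and the G syntax is not decided, so take it only as a sketch of the idea.

```rust
/// Returns the quotient and the remainder together,
/// or `None` when the divisor is zero.
fn div_rem(dividend: u32, divisor: u32) -> Option<(u32, u32)> {
    if divisor == 0 {
        return None;
    }
    Some((dividend / divisor, dividend % divisor))
}
```

The caller gets both results from one call, without output pointers or a one-off struct declared just for this function.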
This is, in a nutshell, the proposed design. Feel free to discuss it. I would like to get some feedback and/or more ideas.
Regards,
glauxosdever