Re: Opinions On A New Programming Language

Posted: Mon Feb 12, 2018 2:23 am
by Wajideus
alexfru wrote: Where does your language stand w.r.t. this problem? Is it for machines or for people?
This question is kind of tricky to answer, because it's for both. On the machine side of things, the overall goal is to give you much more expressive power with regard to how your data is laid out in memory and the relationships between pieces of that data. Contrary to popular opinion, I care very much about this, because data locality has a huge impact on performance.

On the human side of things, I've gone far out of my way to provide abstractions that exist solely to reduce complexity. For example:
  • Baking parameters into functions to reduce the number of parameters that a function takes.
  • Being able to "use" substructs to flatten nested data structures.
  • Import bundling to flatten nested namespaces.
  • Allowing members of structures and function arguments to express meaningful relationships between each other via DAGs.
  • Separating interfaces from structures, so that you can very easily add / remove / change components of a system without requiring a cascading recompilation.
  • Being able to "use" the () and [] operators of functions and arrays, which allows for extremely simple implementations of generators / coroutines / continuations / sparse arrays / dictionaries / etc.

In a lot of ways, the language is both far lower-level than C and far higher-level than C++; and syntactically, it's not even a quarter as complex as C++ to comprehend. Much like C, what you see is what you get, and there are no surprises.


EDIT:
To be honest, if I had to put my finger on it, I'd say that a majority of my design decisions are guided by looking at how real-world problems are solved at a low level, and then figuring out how to simplify the implementation of those specific solutions. For example, in my previous post, I was looking at the problem of "how can I pipe functions together in such a way that the size of the input and output data varies?". This is actually a fairly complicated problem to tackle, but it matters when you're dealing with data that's being streamed or generated; for example, in the implementation of a graphics pipeline.

And as I mentioned, although it's possible to do with the current syntax of the language, I'm not entirely happy about it specifically because it's complex to implement; so I'm working on making that more intelligible on the human side.

Posted: Wed Feb 14, 2018 1:44 pm
by Wajideus
Two new ideas I had. The first is using proto in a struct to implement a tagged union.

Code:

struct variant {
    use proto type;
    use union value {
        int i;
        float f;
    }
}

variant obj = 0.5;
if (obj.type != float) {
    puts("it should be!");
}
obj = 4;
if (obj.type != int) {
    puts("it should be!");
}
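
For comparison, here is a rough sketch of what the same tagged union looks like when written by hand in C11. This is only an illustration of the boilerplate the `proto` idea would remove; the names `Variant`, `VariantType`, and the helper functions are assumptions made for this sketch and are not part of the language.

```c
#include <assert.h>

/* The tag that `use proto type` would maintain automatically. */
typedef enum { VARIANT_INT, VARIANT_FLOAT } VariantType;

typedef struct {
    VariantType type;
    union {            /* C11 anonymous union: members are reached as v.i / v.f */
        int   i;
        float f;
    };
} Variant;

/* In C, every assignment has to set the tag and the payload together. */
Variant variant_from_int(int i) {
    Variant v;
    v.type = VARIANT_INT;
    v.i = i;
    return v;
}

Variant variant_from_float(float f) {
    Variant v;
    v.type = VARIANT_FLOAT;
    v.f = f;
    return v;
}

/* Reading the payload is only valid when the tag matches. */
int variant_as_int(Variant v) {
    assert(v.type == VARIANT_INT);
    return v.i;
}

float variant_as_float(Variant v) {
    assert(v.type == VARIANT_FLOAT);
    return v.f;
}
```

The point of the comparison is that nothing in C keeps the tag and the payload in sync; the helper functions do it by convention only.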
The second I don't have a syntax for yet, but the idea is more or less to have lazy constructors. What I mean by this is that constructors would be like a tree of diffs that's lazily evaluated. This would allow you to very easily implement things like infinite-precision algebraic types, ropes (string builders), and syntax trees with high performance, because you're not copying and moving huge swathes of memory or doing potentially expensive operations that could be avoided.
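
To make the "tree that's lazily evaluated" idea concrete, here is a minimal rope sketch in C. It is not the proposed lazy-constructor feature (which has no syntax yet); it only shows the underlying technique, where concatenation records an operation in O(1) and characters are copied once, when the result is actually needed. All names (`Rope`, `rope_leaf`, `rope_concat`, `rope_flatten`) are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A node is either a leaf (borrowed text) or a concatenation of two ropes. */
typedef struct Rope {
    const char        *text;   /* leaf only: borrowed, never copied here */
    const struct Rope *left;   /* concatenation only */
    const struct Rope *right;  /* concatenation only */
    size_t             len;    /* total length of this subtree */
} Rope;

Rope rope_leaf(const char *text) {
    Rope r = { text, NULL, NULL, strlen(text) };
    return r;
}

/* O(1): remembers what to join instead of copying any characters. */
Rope rope_concat(const Rope *left, const Rope *right) {
    Rope r = { NULL, left, right, left->len + right->len };
    return r;
}

/* Writes the subtree's characters to out; returns one past the last byte. */
static char *rope_write(const Rope *r, char *out) {
    if (r->text != NULL) {
        memcpy(out, r->text, r->len);
        return out + r->len;
    }
    out = rope_write(r->left, out);
    return rope_write(r->right, out);
}

/* Evaluates the whole tree once. The caller frees the result.
   Returns NULL if the allocation fails. */
char *rope_flatten(const Rope *r) {
    char *buf = malloc(r->len + 1);
    if (buf == NULL) return NULL;
    char *end = rope_write(r, buf);
    assert(end == buf + r->len);
    *end = '\0';
    return buf;
}
```

Building a string from many pieces this way allocates and copies once in `rope_flatten`, instead of once per concatenation.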


EDIT:

Another idea, using interfaces to attach property accessors:

Code:

interf lp {
    int len;
}

impl lp(str s) {
    len = strlen(s);
}

proc run() {
    lp str s = "hello world!";
    if (s.len != 12)
        puts("it should be!");
}

Posted: Thu Feb 15, 2018 1:49 am
by Wajideus
A small update to the implementation syntax:

Code:

interf Updateable {
    proc update();
}

impl Updateable(Actor *actor) {
    proc update() {
        UpdateActor(actor);
    }
}


proc UpdateActor(Actor *actor) {
}

proc run() {
    Updateable Actor actor;
    actor.update();
}
Not sure if I mentioned it before, but this language abstraction actually solves some huge problems in object-oriented programming.

To begin with, in OOP, types implement interfaces. This is an extremely bad idea, because you have to plan ahead of time all the ways in which a type will be used and implement them. You almost always don't know enough at the start of a project to be able to do that, and a lot of times, you end up forgetting something so important that it requires a complete rewrite of the system to fix it.

But it gets worse. When you use inheritance, you start structuring your data based on behavior, and when one point in the hierarchy changes, those changes cascade down the entire hierarchy. And when you end up creating 2 types that get used in the same way, you have to reparent the classes or factor the methods out into a separate interface and refactor a bunch of classes to implement that interface. It's such a pain in the @$$ to do that languages started adding features like mixins and traits to band-aid the problem.

Now here's where my implementation differs (no pun intended).
  • An implementation is an instance of an interface that's associated with some type.
  • Interfaces are attributes which can be applied to any type.
In the above example, when I say `Updateable Actor actor;`, `Updateable` is the name of the interface and `Actor` is the name of the type. The compiler sees this and immediately starts looking for an implementation of `Updateable(Actor)`. If it can't find one, it's an error. Otherwise, `actor` is declared as a pointer to the implementation of `Updateable(Actor)` followed by an `Actor`. In fact, any cast of `Structure -> Interface Structure` will add a pointer to an implementation and any cast of `Interface Structure -> Structure` will remove that pointer.
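
The layout described above (a pointer to the implementation, followed by the plain struct) can be sketched in C roughly as follows. This is a guess at one possible lowering, not the compiler's actual output; the `Actor` fields and the names `UpdateableActor`, `updateable_actor_impl`, `attach_updateable`, and `detach_updateable` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int updates; } Actor;   /* hypothetical fields */

/* `interf Updateable`: a table of function pointers. */
typedef struct {
    void (*update)(void *self);
} Updateable;

/* The free function the implementation forwards to. */
void UpdateActor(Actor *actor) { actor->updates++; }

/* `impl Updateable(Actor *actor)`: one table instance for this type. */
static void actor_update(void *self) {
    assert(self != NULL);
    UpdateActor((Actor *)self);
}
const Updateable updateable_actor_impl = { actor_update };

/* `Updateable Actor actor;`: implementation pointer followed by an Actor. */
typedef struct {
    const Updateable *impl;
    Actor             value;
} UpdateableActor;

/* Cast `Structure -> Interface Structure`: adds the pointer. */
UpdateableActor attach_updateable(Actor a) {
    UpdateableActor o = { &updateable_actor_impl, a };
    return o;
}

/* Cast `Interface Structure -> Structure`: removes the pointer. */
Actor detach_updateable(UpdateableActor o) { return o.value; }
```

Under this reading, `actor.update()` becomes `actor.impl->update(&actor.value)`, and a plain `Actor` stored in an array carries no table pointer at all.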

The side effects of this are:
  • You can add new interfaces and implementations without breaking or changing any existing code.
  • You can create multiple implementations of an interface for a single type, based on the current scope.
  • If you're not using any methods, then there's no internal pointer to a vtable being stored in the type, which can save you several thousand bytes of cache space if you're iterating over arrays.
  • You can combine interfaces with generics. For example, `render(Renderable proto T, T thing);`, which will get inlined as `render(Renderable T thing)` like C++ templates. This functionality is similar to the "where" type-constraints in C#.
  • You get to explicitly control whether an implementation passes `this` by value or by reference.
Pretty much all of the problems with OOP just disappear. Even the pre-planning process goes away, because you can just write and change implementations as you need to instead of thinking about how it's going to affect the design and future of an entire system. And when you do need to change an interface, you'll find that the impact it has on your code isn't that substantial. Why? Because most functions only use one or two methods which are actually associated with an instance, not a typeclass. As a result, interfaces tend to be really small and highly specialized.


EDIT:

Another small thing I forgot to mention: interfaces are not meant to be used everywhere, all the time. They're a tool that exists specifically to deal with cases where the type of something isn't known until runtime (specifically, virtual methods). In pretty much all other situations, you should be using regular functions.

Posted: Fri Feb 16, 2018 12:20 am
by Wajideus
A new feature, access modifiers for function arguments:

Code:

proc doThing1() {
    Thing thing;
    doThing2(&thing);
}

proc doThing2(Thing *thing) {
    doThing3(thing);  // ERROR! private -> public
}

proc doThing3(pub Thing *thing) {
}
Local variables and function arguments are private by default, and a private symbol:
  • Cannot be cast to a public symbol.
  • Cannot be assigned by reference to any public symbol, struct member, or symbol in a higher scope.
  • Cannot be passed as an argument by reference to any function that expects a public symbol.
Return values are public by default, and a public symbol:
  • Can be cast to a private symbol.
  • Can be assigned by reference to any other symbol or struct member.
  • Can be passed as an argument by reference to any function.
This mechanism makes it impossible to accidentally create dangling pointers.


EDIT:

Something else I've decided to support, non-nullable references:

Code:

proc UpdateActor(Actor &actor) {
}
The way this works is simple. Any `T&` to `T*` cast is implicit and works exactly as you would expect. However, an implicit cast from `T*` to `T&` will result in a non-null-pointer assertion being generated. Doing an explicit cast from `T*` to `T&` will bypass this assertion, so you can do stuff like having an object at address `0`.

When it comes to struct members, all non-nullable references must be initialized before the struct can be passed to any function.


Edit #2:

Just to put things into perspective:

Code:

proc OpenFile(str name) File* {
    return null;
}

proc run() {
    File &f = OpenFile("test.txt");    // ERROR!!!
}
The runtime could make this really obvious by printing something to stderr like, "error: ln 6, col 15: expected `File`, but got `null`".
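
As a rough illustration of the difference between the implicit and explicit `T*` to `T&` casts, here is a C sketch of what the generated check might look like. Everything here is an assumption: the macro names `IMPLICIT_REF` and `EXPLICIT_REF`, the helper `check_nonnull`, and the choice to count failures in `nonnull_failures` (a real runtime would presumably abort after printing). The column number is left out of the message because standard C has no portable way to obtain it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct File File;   /* opaque type, only used through pointers */

/* Counts failed checks so the sketch can be observed without aborting. */
int nonnull_failures = 0;

/* The assertion an implicit T* -> T& cast would generate. */
void *check_nonnull(void *p, const char *type, int line) {
    assert(type != NULL);
    if (p == NULL) {
        fprintf(stderr, "error: ln %d: expected `%s`, but got `null`\n",
                line, type);
        nonnull_failures++;
    }
    return p;
}

/* Implicit cast: checked. */
#define IMPLICIT_REF(T, ptr) ((T *)check_nonnull((ptr), #T, __LINE__))

/* Explicit cast: the check is bypassed on purpose. */
#define EXPLICIT_REF(T, ptr) ((T *)(ptr))

/* Same stub as in the post: always fails. */
File *OpenFile(const char *name) {
    (void)name;
    return NULL;
}
```

With this, `IMPLICIT_REF(File, OpenFile("test.txt"))` reports the null, while `EXPLICIT_REF(File, ...)` passes it through silently.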


Edit #3:

Because array lengths can be either constants or variables (standard C++ has no variable-length arrays, and C only added them in C99), arrays are always bounds checked. You can bypass bounds checking by casting the array to a pointer.

Code:

proc run() {
    int len = 5;
    int val[len], *valp = val;
    val[5] = 0;    // ERROR! out of bounds
    valp[5] = 0;   // undefined behavior
}
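
For readers more used to C, the distinction between the checked array and the unchecked pointer can be approximated like this. It is only a sketch of the idea that the length travels with the array; the `IntArray` type and the `int_array_set` / `int_array_get` helpers are hypothetical, and here an out-of-bounds access is reported through the return value rather than as an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* An array whose length is known at runtime and carried alongside the data. */
typedef struct {
    int    *data;
    size_t  len;
} IntArray;

/* Checked store: refuses to write outside [0, len). */
bool int_array_set(IntArray a, size_t index, int value) {
    if (index >= a.len) return false;
    a.data[index] = value;
    return true;
}

/* Checked load: writes the element to *out only when index is in bounds. */
bool int_array_get(IntArray a, size_t index, int *out) {
    assert(out != NULL);
    if (index >= a.len) return false;
    *out = a.data[index];
    return true;
}
```

Going through `a.data` directly corresponds to the `valp` case in the post: the length is no longer consulted, so nothing stops an out-of-bounds write.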

Edit #4:

An idea for how to do error handling just hit me:

Code:

proc OpenFile(str name) File* {
    pub thread_local var enum Error {
        None,
        NotSupported
    }
    Error = NotSupported;
    return null;
}

proc run() {
    File *file = OpenFile("test.txt");
    if (!file) {
        when (OpenFile.Error) {
            is NotSupported:
                puts("error: 'OpenFile' is not supported.\n");
        }
    }
}
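
The closest existing equivalent is probably a per-thread error variable, much like `errno`. Here is a C11 sketch of the same example under that assumption. The names `OpenFileError`, `OpenFile_Error`, and `OpenFileErrorMessage` are invented; the main difference from the proposal is that C has no way to scope the variable to the function, so it has to live at file scope.

```c
#include <assert.h>
#include <stddef.h>

typedef struct File File;   /* opaque type, only used through pointers */

typedef enum {
    OPEN_FILE_NONE,
    OPEN_FILE_NOT_SUPPORTED
} OpenFileError;

/* One error slot per thread, standing in for `OpenFile.Error`. */
_Thread_local OpenFileError OpenFile_Error = OPEN_FILE_NONE;

/* Same stub as in the post: records the reason and fails. */
File *OpenFile(const char *name) {
    assert(name != NULL);
    OpenFile_Error = OPEN_FILE_NOT_SUPPORTED;
    return NULL;
}

/* Stands in for the `when` / `is` dispatch on the error value. */
const char *OpenFileErrorMessage(OpenFileError e) {
    switch (e) {
    case OPEN_FILE_NONE:          return "no error";
    case OPEN_FILE_NOT_SUPPORTED: return "'OpenFile' is not supported";
    }
    return "unknown error";
}
```

A caller would check the returned pointer first and only then read `OpenFile_Error`, as in the `run()` example.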

Edit #5:

I decided that guard statements would be a good idea, because you can use them to prevent future null-pointer checks:

Code:

proc CloseFile(File &f);

proc run() {
    File &f;
    catch (f = OpenFile("test.txt")) {
        when (OpenFile.Error) {
            is NotSupported:
                puts("error: 'OpenFile' is not supported.\n");
        }
    }
    defer CloseFile(f);  // no null-pointer check :)
}

Posted: Thu Feb 22, 2018 1:16 am
by Wajideus
Something I've been pondering today with regard to the type system: in C/C++, types like "int", "short", and "long" aren't very well defined. A lot of code is written with the expectation that "int" is 32-bit. That is to say that while the languages permit you to write portable code, they make it exceptionally easy not to do so.

One of the ideas I've come up with for tackling this problem involves extending the bitfield syntax a little and taking advantage of module search paths. In it, bitfield sizes can optionally have "fastest" or "shortest" modifiers, or can have a size of "longest".

Code:

const int32_t       = signed : 32;
const int_fast32_t  = signed : fastest 32;
const int_least32_t = signed : shortest 32;
const intmax_t      = signed : longest;
The standard library could have a sort of `stddef` module that's always imported by the compiler, and defines the common types like `int`, perhaps with a default of `signed : fastest 16`. If an application requires more bits than that, it can just define its own local `stddef` module to override it. This makes it exceptionally easy to port an existing application to a device that has a smaller word size.


EDIT #1:

An alternative idea would be to have a built-in `integer` type which is unsized, and thus non-instantiable unless you explicitly specify the size or one of the `fastest`, `longest`, and `shortest` modifiers. The `int` and `s#` / `u#` types could then be weakly defined uninitialized constants.

Code:

const int = fastest integer;

const int32_t        = integer : 32;
const int_fast32_t   = fastest integer : 32;
const int_least32_t  = shortest integer : 32;
const intmax_t       = longest integer;

const uint32_t        = unsigned integer : 32;
const uint_fast32_t   = fastest unsigned integer : 32;
const uint_least32_t  = shortest unsigned integer : 32;
const uintmax_t       = longest unsigned integer;
I actually like this idea a bit more than the previous one.
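
The type names in both variants mirror C's `<stdint.h>`, so the guarantees that `fastest`, `shortest`, and `longest` are presumably meant to express can be checked against the existing C types. The sketch below assumes a C11 compiler that provides the optional `int32_t`; the `BITS_OF` macro and `print_widths` function are helpers made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Storage width of a type in bits. */
#define BITS_OF(T) (sizeof(T) * CHAR_BIT)

/* Exact width: what `integer : 32` asks for. */
static_assert(BITS_OF(int32_t) == 32, "int32_t is exactly 32 bits");

/* At least 32 bits, picked for speed: `fastest integer : 32`. */
static_assert(BITS_OF(int_fast32_t) >= 32, "int_fast32_t has at least 32 bits");

/* At least 32 bits, picked for size: `shortest integer : 32`. */
static_assert(BITS_OF(int_least32_t) >= 32, "int_least32_t has at least 32 bits");

/* Widest supported integer: `longest integer`. */
static_assert(BITS_OF(intmax_t) >= 64, "intmax_t has at least 64 bits");

/* Prints what this particular platform chose for each type. */
void print_widths(void) {
    printf("int32_t:       %zu bits\n", BITS_OF(int32_t));
    printf("int_fast32_t:  %zu bits\n", BITS_OF(int_fast32_t));
    printf("int_least32_t: %zu bits\n", BITS_OF(int_least32_t));
    printf("intmax_t:      %zu bits\n", BITS_OF(intmax_t));
}
```

The actual widths of the `fast` and `least` types vary by platform, which is exactly the freedom the modifiers are meant to give the compiler.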

Posted: Mon Feb 26, 2018 2:35 pm
by Wajideus
So here's a new feature that will probably be in the language for sure, because I realized I want to do this thing all the time in C:

Code:

struct TypeInfo {
    str name;
    int size;
}

use TypeInfo typeInfo[] = {
    bool_type: { "bool", sizeof(bool) },
    char_type: { "char", sizeof(char) },
    int_type: { "int", sizeof(int) },
    proto_type: { "proto", sizeof(proto) }
}
When you define an array, the values in the array can optionally have labels like an enum, so `typeInfo.int_type` is equivalent to `typeInfo[2]`. In the above example, I've also used the `use` keyword to raise the scope of the members of `typeInfo` into the global scope. This is the exact same concept as an anonymous enum.

I'm actually kind of curious if I can replace enums altogether, because aside from autoincrement, this is a strictly more powerful abstraction than enums.
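
For reference, this is roughly how the same table has to be written in C99 today, with the labels declared once as an enum and then repeated as designated initializers. It is a sketch for comparison only: the `proto` entry is left out because C has no such type, and `type_count` and `type_info` are additions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    size_t      size;
} TypeInfo;

/* The labels are declared separately as an anonymous enum... */
enum { bool_type, char_type, int_type, type_count };

/* ...and then repeated as array indices. */
const TypeInfo typeInfo[type_count] = {
    [bool_type] = { "bool", sizeof(bool) },
    [char_type] = { "char", sizeof(char) },
    [int_type]  = { "int",  sizeof(int)  },
};

/* Label-based lookup, the counterpart of `typeInfo.int_type`. */
const TypeInfo *type_info(int label) {
    assert(label >= 0 && label < type_count);
    return &typeInfo[label];
}
```

The duplication between the enum and the initializer list is what labeled array elements would remove.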