Hobbes wrote: I like the idea of a native string datatype, but wonder how it may be implemented without run-time support. The compiler could inline (the equivalents of) strcpy and strcmp but that would be very space-inefficient.
You provide malloc and friends (for mutable strings). The standard library provides the rest of the support for the string type.
What features should a systems programming language have?
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
- Contact:
Re: What features should a systems programming language have
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
- Contact:
Re: What features should a systems programming language have
Hobbes wrote: I like the idea of a native string datatype, but wonder how it may be implemented without run-time support.
The problem with many of the structural features is that they need heap support - i.e. a malloc/free pair. You can't sanely pass strings over the stack, and the same goes for continuations and passing datatypes by value.
From what I've seen with my FreeBasic porting efforts, where the runtime makes up a significant portion of the language but occasionally has unwanted dependencies, you will probably want to cut the language in three parts:
1: The largest subset of the language that needs no runtime except for things like a functional stack and known processor state. This should be enough to implement...
2: The largest subset of the language that can be compiled provided a functional implementation of malloc/free and nothing else (read: a functional heap). You need this to support all non-linear data flows in the language semantics, as well as dynamic array types and native string types.
3: The entire language, including all features that depend on a hosted environment.
C is so simplistic that there's no part 2. C++ does have one, but there is no properly defined mechanism to limit the language to the subset you can use; instead you have to manually disable individual language features. For a proper systems development language these separations should be defined and enforceable by the compiler. FreeBasic is troublesome here since it makes a bit of a mess by including references to hosted features (diagnostics and such) in runtime functions where you wouldn't expect anything but a heap allocation. Plus, you don't want to be stuck having to implement dummies for pretty much everything (without being tipped off by the compiler in advance) to get the libc working.
Quote: There was an interesting discussion on forum.dlang.org and no definitive conclusion if inline assembler in D were a good idea.
Interfacing with assembly is not inline assembly. Besides, since it is platform-dependent, it's very hard to specify it as part of the language standard.
Quote: The operating systems developed at ETH Zürich have been written in languages with GC since the 80's and were actually in daily use until this millennium. So I would not outright exclude a GC.
Just because it's possible doesn't make it a good idea. Having a GC takes away your control over memory and puts it in a black box. If you wrote the thing, you know how the black box works and you can use it to force allocation semantics when you need them, but it'll make it hard to maintain for everyone that doesn't know how the system works. Also, if the garbage collector gets changed, memory allocation semantics change as well with all the effects thereof.
Quote: Could you please elaborate what you mean with continuations,
Full-thread continuations are harder and in some contexts the wrong solution, but the core idea is that instead of just being able to pass a function pointer, you pass a code reference with arbitrary amounts of auxiliary data. The typical C construct for doing anything close to this is a tuple of ( type(*)(void*), void* ) which allows you to pass a function and data separately, and toss the data as the argument of the function. It works, but it's not typesafe, the called function has no idea how to manage the memory on the data side, and you have to write a struct, as well as the marshalling and demarshalling, explicitly when the compiler could easily automate it for you. Doing this for entire threads and creating a continuation is impossible in C without doing platform-specific tricks. C does allow you to pull off its little brother, the closure, in its verbose and unsafe form, although there are fixes for that (example). Both closures and continuations are extremely powerful mechanisms if made accessible.
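The ( type(*)(void*), void* ) pair described above can be sketched in C roughly as follows. This is only an illustration: the names `closure`, `adder_env`, `make_adder`, `call_closure` and `destroy_adder` are hypothetical, and the function pointer takes an extra `int` argument to make the example do something visible. Note how the environment struct, the cast back from `void *` and the cleanup are all written by hand, with nothing checked by the compiler.

```c
#include <stdlib.h>

/* Hand-rolled closure: a code reference plus an untyped environment. */
struct closure {
    int  (*fn)(void *env, int arg);  /* code reference */
    void  *env;                      /* auxiliary data, type erased */
};

/* Environment for the "adder" closure; the compiler knows nothing about it. */
struct adder_env {
    int increment;
};

static int adder_fn(void *env, int arg)
{
    /* Manual demarshalling: nothing checks that env really is an adder_env. */
    struct adder_env *e = env;
    return arg + e->increment;
}

/* Returns 0 on success, -1 if the environment could not be allocated. */
int make_adder(struct closure *out, int increment)
{
    struct adder_env *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->increment = increment;
    out->fn = adder_fn;
    out->env = e;
    return 0;
}

int call_closure(const struct closure *c, int arg)
{
    return c->fn(c->env, arg);
}

/* The caller has to know that env was malloc'ed; the type does not say so. */
void destroy_adder(struct closure *c)
{
    free(c->env);
    c->env = NULL;
}
```

A caller would build the closure with `make_adder(&c, 5)`, invoke it with `call_closure(&c, 10)` and release it with `destroy_adder(&c)`. A language with native closures would generate the environment struct and its lifetime handling itself.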
Re: What features should a systems programming language have
I'd actually prefer the language to be somewhat separated into two parts, namely a low level part and a high level part.
For the low level part I'd like;
- 1. Link-ability with assembly language
- * no inline assembly
- * type system units used for distinction between physical and virtual addresses.
- * no reordering
* no padding
* no funny business
- 1. Logical datastructures (No focus on physical layout)
- * reordering
* padding
* splitting data structures
* other optimization
- 2. Automatic reference counting for memory management
- * No garbage collection!
- * Generics
- * Destructors
- 6. Support for other paradigms than OOP!
- * Generic and functional
- * Lambdas, Tuples, etc.
- * Memory allocation
* Lite stack unrolling
- * Exceptions, etc.
- 1. A strong type system, with a reasonably low amount of type annotations
- * No mixing of numerical and boolean types
* Minimal implicit casting
* Heavy static analysis
- * No weird in-between structures, like bitfields
// Skeen
// Developing a yet unnamed microkernel in C++14.
Re: What features should a systems programming language have
Owen wrote: Also: In the general case I'd exclude languages with a large GC emphasis from the general systems programming language domain, because they have a high impedance with implementing, say, a kernel for a non-GC'd userland efficiently
There's no reason why a kernel implemented in a language with automatic memory management needs to impose such a scheme on the userland applications. So why specifically is a kernel implemented in a language with GC inefficient for a userland implemented without GC?
Last edited by bwat on Wed Jan 29, 2014 10:12 am, edited 1 time in total.
Every universe of discourse has its logical structure --- S. K. Langer.
Re: What features should a systems programming language have
Combuster wrote: 2: ... because GC is not done in system languages)
Programmers who have worked at TI, Xerox, Symbolics, and Tektronix would say otherwise.
Combuster wrote: creating a continuation is impossible in C without doing platform-specific tricks. C does allow you to pull off its little brother the closure
Continuations and closures are two very different things. A continuation is a computational context, e.g., a snapshot containing the evaluation, environment and control stacks, whereas a closure is a piece of code/expression/formula that has no unbound variables.
Some languages represent continuations with closures but that is just how they chose to reify them. You could easily represent a continuation with a value of a new data-type which must be passed to a specific continue function along with any values that are to be returned to the computation that is continued.
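As a rough illustration of a continuation represented by its own data type and resumed through a dedicated continue function, here is a sketch in standard C built on `setjmp`/`longjmp`. It is a deliberately limited, assumed example: the names `continuation`, `continue_with`, `find` and `index_of` are hypothetical, and this kind of continuation can only be used to escape upwards while the function that captured it is still active. It is not the re-entrant, full-context snapshot described above.

```c
#include <setjmp.h>

/* Hypothetical continuation type: a saved context plus a slot for the value
   handed back to the computation being continued. The slot is volatile
   because it is written between setjmp and longjmp. */
struct continuation {
    jmp_buf      context;
    volatile int value;
};

/* The dedicated "continue" function: resumes k, delivering value to it. */
static void continue_with(struct continuation *k, int value)
{
    k->value = value;
    longjmp(k->context, 1);
}

/* Searches items for target and escapes through k as soon as it is found. */
static void find(struct continuation *k, const int *items, int n, int target)
{
    for (int i = 0; i < n; i++)
        if (items[i] == target)
            continue_with(k, i);   /* does not return */
}

/* Returns the index of target in items, or -1 if it is absent. */
int index_of(const int *items, int n, int target)
{
    struct continuation k;

    if (setjmp(k.context) != 0)
        return k.value;            /* resumed via continue_with */
    find(&k, items, n, target);
    return -1;                     /* find returned normally */
}
```

The point of the sketch is only the shape of the interface: the continuation is an ordinary value, and the only way to use it is to pass it, together with a result, to `continue_with`.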
Re: What features should a systems programming language have
skeen wrote:
2. Automatic reference counting for memory management
* No garbage collection!
But reference counting is garbage collection (automatic memory management).
Re: What features should a systems programming language have
Quote: Run-time error handling needs to be explicit too. For example, if an application asks to spawn a thread and your code tries to allocate memory for a "thread control block" but can't because you've run out of memory, then you want to return an error back to the application. You don't want the entire OS to crash because some idiot thought "new" was a good idea.
What mechanism to handle errors do you prefer?
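One possible shape for the explicit error handling in the thread-spawning scenario is a plain status return, sketched below in C. Everything here is assumed for illustration: `spawn_thread`, `thread_control_block`, the `SPAWN_*` codes, `failing_alloc` and `idle_entry` are hypothetical names, and the allocator is passed in as a parameter only so that the out-of-memory path can be exercised.

```c
#include <stdlib.h>

/* Hypothetical status codes returned to the caller. */
enum spawn_status {
    SPAWN_OK          =  0,
    SPAWN_ERR_INVALID = -1,
    SPAWN_ERR_NOMEM   = -2
};

struct thread_control_block {
    int    id;
    void (*entry)(void *);
    void  *arg;
};

/* The allocator is injected so the failure path can be tested. */
typedef void *(*alloc_fn)(size_t size);

/* Stores the new block in *out and returns SPAWN_OK, or returns an error
   code and leaves *out untouched. */
int spawn_thread(alloc_fn alloc, void (*entry)(void *), void *arg,
                 struct thread_control_block **out)
{
    static int next_id = 1;
    struct thread_control_block *tcb;

    if (alloc == NULL || entry == NULL || out == NULL)
        return SPAWN_ERR_INVALID;
    tcb = alloc(sizeof *tcb);
    if (tcb == NULL)
        return SPAWN_ERR_NOMEM;   /* reported to the caller, nothing crashes */
    tcb->id = next_id++;
    tcb->entry = entry;
    tcb->arg = arg;
    *out = tcb;
    return SPAWN_OK;
}

/* Stand-in allocator that always fails, simulating memory exhaustion. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Placeholder thread entry point. */
void idle_entry(void *arg)
{
    (void)arg;
}
```

With this shape the allocation failure is just another return value that the caller has to look at, in contrast to an allocation operator that fails implicitly.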
Re: What features should a systems programming language have
bwat wrote:
skeen wrote:
2. Automatic reference counting for memory management
* No garbage collection!
But reference counting is garbage collection (automatic memory management).
Thanks, I was just going to write this.
Actually, I'm thinking that a language needs a good way to keep track of the lifetime of resources in general, of which memory is only one. So, even if you have a GC and make heavy use of it, you still want to have the machinery for keeping track of everything else.
So the language should not depend on the GC, but some features might, for example: returning closures from a function, if those are allocated on some kind of global storage (e.g. heap).
Last edited by HoTT on Wed Jan 29, 2014 10:51 am, edited 1 time in total.
-
- Member
- Posts: 595
- Joined: Mon Jul 05, 2010 4:15 pm
Re: What features should a systems programming language have
When programming an OS you often get into some kind of garbage collection using reference counting if your kernel is supposed to work in an SMP system. However, a generic garbage collector that you'd find in a user mode library is often not fit for kernels. I have reference-counted garbage collection in my kernel, but it is really a specialized piece of code. Different objects are treated differently, and there are different thresholds for when to run the garbage collector, which is something you usually don't find with a generic user mode GC. Because of this I don't really find any use for garbage collected languages for kernel programming. GC is great in user mode, but in a kernel the GC often needs to be specialized.
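A minimal sketch of what reference counting on kernel objects can look like on an SMP system, using C11 atomics. This is not the poster's code: `kobject`, `kobject_init`, `kobject_get`, `kobject_put` and `kobject_destroy_free` are hypothetical names, and the per-object `destroy` hook is just one assumed way of letting different object types be treated differently.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct kobject {
    atomic_uint refs;                        /* shared between CPUs */
    void (*destroy)(struct kobject *self);   /* per-type cleanup policy */
};

/* A new object starts with one reference, owned by its creator. */
void kobject_init(struct kobject *obj, void (*destroy)(struct kobject *))
{
    atomic_init(&obj->refs, 1);
    obj->destroy = destroy;
}

/* Takes an additional reference; the caller must already hold one. */
void kobject_get(struct kobject *obj)
{
    atomic_fetch_add_explicit(&obj->refs, 1, memory_order_relaxed);
}

/* Drops one reference. Returns 1 if this call released the last reference
   and ran the destroy hook, 0 otherwise. */
int kobject_put(struct kobject *obj)
{
    if (atomic_fetch_sub_explicit(&obj->refs, 1, memory_order_acq_rel) == 1) {
        obj->destroy(obj);
        return 1;
    }
    return 0;
}

/* Example destroy hook for objects that came from malloc. */
void kobject_destroy_free(struct kobject *self)
{
    free(self);
}
```

A specialized kernel collector as described above could, for instance, have the destroy hook queue the object for deferred reclamation instead of freeing it immediately; that part is left out here.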
- Owen
Re: What features should a systems programming language have
bwat wrote:
Owen wrote: Also: In the general case I'd exclude languages with a large GC emphasis from the general systems programming language domain, because they have a high impedance with implementing, say, a kernel for a non-GC'd userland efficiently
There's no reason why a kernel implemented in a language with automatic memory management needs to impose such a scheme on the userland applications. So why specifically is a kernel implemented in a language with GC inefficient for a userland implemented without GC?
Garbage collection performance is traded off with memory usage. Current state of the art garbage collectors match the performance of manual memory management at around 3x the memory usage (in the "active" working set), so a GC'd kernel is going to use 3x as much memory or be slower. (And even at 3x memory usage, there are consistency issues - the amortized performance might be the same, but that's little consolation if a GC cycle causes your whole system to pause for a while.) Concurrent collectors make bigger memory/performance trade-offs.
A GC'd kernel is either going to be less performant or use more memory (and likely that memory is going to be non-swappable). If you're developing a managed OS, you can devise solutions to amortize this cost (because it's all on one heap, you need less indirection for a start, so you can reduce general memory consumption to compensate).
Re: What features should a systems programming language have
Owen wrote: A GC'd kernel is either going to be less performant or use more memory (and likely that memory is going to be non-swappable). If you're developing a managed OS, you can devise solutions to amortize this cost (because it's all on one heap, you need less indirection for a start, so you can reduce general memory consumption to compensate)
I'd like to say that having a garbage collector does not imply it is used for every memory allocation. On the other hand, if you have a GC that uses an RC scheme, language support can eliminate many accesses to the reference counts.
Re: What features should a systems programming language have
Owen wrote: Garbage collection performance is traded off with memory usage.
I agree.
http://www.cs.princeton.edu/~appel/papers/45.ps
Owen wrote: Current state of the art garbage collectors match performance with manual memory management around 3x the memory usage (in the "active" working set), so a GC'd kernel is going to use 3x as much memory or be slower
I implemented Cheney's algorithm in my Scheme compiler that is faster than straight malloc without free (extreme opposite of managed memory). That would be 2x in your terms.
Owen wrote: (And even at 3x memory usage, there are consistency issues - the amortized performance might be the same,
Can you define consistency here?
Owen wrote: but that's little consolation if a GC cycle causes your whole system to pause for a while). A GC'd kernel is either going to be less performant or use more memory (and likely that memory is going to be non-swappable).
Is "performant" a word? And if so, does it not rely on some specification you've not revealed (w.r.t. acceptable pause times, collection times, acceptable heap sizes, available memory etc.).
Re: What features should a systems programming language have
Combuster wrote:
bwat wrote:
skeen wrote:
2. Automatic reference counting for memory management
* No garbage collection!
But reference counting is garbage collection (automatic memory management).
wikipedia wrote:
In computer science, garbage collection (GC) is a form of automatic memory management.
I call a troll.
According to this book reference counting is a form of garbage collection. However, we should just agree on a common terminology and continue the discussion.
- Owen
Re: What features should a systems programming language have
bwat wrote:
Owen wrote: Garbage collection performance is traded off with memory usage.
I agree.
http://www.cs.princeton.edu/~appel/papers/45.ps
Owen wrote: Current state of the art garbage collectors match performance with manual memory management around 3x the memory usage (in the "active" working set), so a GC'd kernel is going to use 3x as much memory or be slower
I implemented Cheney's algorithm in my Scheme compiler that is faster than straight malloc without free (extreme opposite of managed memory). That would be 2x in your terms.
The allocation is certainly faster (add ptr, size; cmp ptr, size_of_heap; if_greater call compact).
However, how fast is garbage collection? It requires scanning a sizable portion of the heap. The amortized cost of (allocate + collect) is probably more than the equivalent (malloc + free), especially as in non-GC languages references to stack objects are often passed around, whereas in GC languages those objects must be on the heap.
bwat wrote:
Owen wrote: (And even at 3x memory usage, there are consistency issues - the amortized performance might be the same,
Can you define consistency here?
All memory allocators need to do some bookkeeping. State of the art mallocs largely have ~constant bookkeeping overhead per call, with occasional small spikes. GCs are spikier; allocation is normally really cheap, but occasionally the collector decides it needs to collect and the cost spikes somewhat.
bwat wrote:
Owen wrote: but that's little consolation if a GC cycle causes your whole system to pause for a while). A GC'd kernel is either going to be less performant or use more memory (and likely that memory is going to be non-swappable).
Is "performant" a word? And if so, does it not rely on some specification you've not revealed (w.r.t. acceptable pause times, collection times, acceptable heap sizes, available memory etc.).
That depends upon your system, but: if collection ever takes >1ms, that's probably too much for apps that depend on precise timing (quite possibly significantly too much; think more like ~200µs). What this means is your allocator must be preemptible, which means that, for example, your scheduler can't use it*. This places constraints on the use of a number of features of garbage collected languages.
* It does at least have the advantage over non-GCd kernels that even if it can't malloc, it can free by beauty of that being just dropping the object on the floor
Re: What features should a systems programming language have
Owen wrote: However, how fast is garbage collection? It requires scanning a sizable portion of the heap. The amortized cost of (allocate + collect) is probably more than the equivalent (malloc + free), especially as in non-GC languages stack object references are often passed around which in GC languages must be on the heap
The bigger the heap the less often the scan. My GC alloc routine including collection was faster than Linux malloc. My alloc routine in my Scheme runtime is:
Code:
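/* GC_free, GC_tospace, GC_space_size, GC_flip and GC_dump are defined
   elsewhere in the runtime. */
#include <stdio.h>
#include <stdlib.h>

void * GC_alloc (unsigned int size)
{
    void * addr;
#ifdef GARBAGE_COLLECTION_TEST
    /* Baseline for the timing comparison: plain malloc, never freed. */
    addr = malloc(size);
#else
    /* Bump allocation: collect only when tospace cannot fit the request. */
    if(GC_free + size >= GC_tospace + GC_space_size)
    {
        GC_flip();
        /* Still no room after collecting: give up. */
        if(GC_free + size >= GC_tospace + GC_space_size)
        {
            fprintf(stderr, "Out of memory! - tried to allocate %u bytes\n", size);
            GC_dump();
            exit(EXIT_FAILURE);
        }
    }
    addr = (void *)GC_free;
    GC_free += size;
#endif
    return addr;
}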
Times for auto-compilation of the cold compiler with GC (GARBAGE_COLLECTION_TEST not defined)
real 0m2.810s
user 0m2.736s
sys 0m0.044s
Times for auto-compilation of the cold compiler with malloc (#define GARBAGE_COLLECTION_TEST)
real 0m5.313s
user 0m4.828s
sys 0m0.444s
With GC, the heap was flipped (call of GC_flip which is the mark, scan and copy routine) 15 times. Heap size is 10000000 bytes which is roughly 9.5 megs (1 meg is 1024*1024 bytes for me).
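For readers unfamiliar with what a flip does, here is a heavily simplified sketch of a Cheney-style two-space copying collector in C. It is not the runtime discussed here: it assumes fixed-size list cells, a single root and a tiny heap, and the names `cell`, `forward_cell`, `gc_flip`, `gc_push` and `SPACE_CELLS` are all hypothetical. It only shows the mechanism: swap the semispaces, copy the root, then scan the copied cells in order and copy whatever they point to.

```c
#include <stddef.h>

#define SPACE_CELLS 8                /* cells per semispace (assumed) */

struct cell {
    int          value;
    struct cell *next;               /* the only pointer field */
    struct cell *forward;            /* set once the cell has been copied */
};

static struct cell space_a[SPACE_CELLS], space_b[SPACE_CELLS];
static struct cell *tospace = space_a, *fromspace = space_b;
static size_t free_index;            /* next unused cell in tospace */
struct cell *root;                   /* single root: head of a list */

/* Copies c into tospace unless already copied; returns its new address. */
static struct cell *forward_cell(struct cell *c)
{
    if (c == NULL)
        return NULL;
    if (c->forward == NULL) {
        struct cell *copy = &tospace[free_index++];
        *copy = *c;
        copy->forward = NULL;
        c->forward = copy;           /* leave a forwarding pointer behind */
    }
    return c->forward;
}

/* Swaps the semispaces and copies everything reachable from root. */
void gc_flip(void)
{
    struct cell *old = tospace;
    size_t scan = 0;

    tospace = fromspace;
    fromspace = old;
    free_index = 0;

    root = forward_cell(root);
    while (scan < free_index) {      /* Cheney scan over the copied cells */
        tospace[scan].next = forward_cell(tospace[scan].next);
        scan++;
    }
}

/* Prepends value to the list at root using bump allocation, collecting
   once if tospace is full. Returns 0, or -1 when out of memory. */
int gc_push(int value)
{
    struct cell *c;

    if (free_index == SPACE_CELLS) {
        gc_flip();
        if (free_index == SPACE_CELLS)
            return -1;
    }
    c = &tospace[free_index++];
    c->value = value;
    c->next = root;                  /* root is already updated by gc_flip */
    c->forward = NULL;
    root = c;
    return 0;
}
```

Cells that become unreachable (for example after `root = root->next`) are never visited by the scan, so their cost is zero; only live data is copied. That is why a larger heap relative to the live set means fewer flips for the same amount of allocation.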
Owen wrote: That depends upon your system, but: if collection ever takes >1ms, that's probably too much for precise timing (quite possibly significantly too much, looking more towards ~200µs) dependent apps. What this means is your allocator must be preemptible, which means that, for example, your scheduler can't use it*. This places constraints on use of a number of features of garbage collected languages
Yep, predictable or precise timing would need some work and could easily not be worth it, I agree. I've seen real-time Lisp processes not generate garbage to avoid collection, and systems where each interrupt service routine had its own GC'd heap.