[language design] My working notes on Thelema

Programming, for all ages and all languages.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: [language design] My working notes on Thelema

Post by Schol-R-LEA »

OK, so now for the digression about hypertext, and specifically about Project Xanadu and xanalogical storage.

Project Xanadu, for those unfamiliar with it, was one of the first explorations - if not the first - of the concepts of hypertext and hypermedia, both terms coined by the originator of the project, Ted Nelson. The project informally began in 1960, taking an ever more solid shape throughout the 1960s and 1970s despite the skepticism, indifference, and even obstructionism of others, and after a half dozen or more iterations, is still ongoing today - you can see the latest project here. Ted claims that it is finally a working system, after being declared the longest-running vaporware in software history by his critics and serving as the butt of many industry jokes; but while there is now a light at the end of the tunnel, it has so far fallen far short of its intentions, due to forces often out of his control.

It was a very different idea from the modern World Wide Web, even though one of Ted's books, Literary Machines, was a primary influence on Tim Berners-Lee and his design (though, contrary to what Nelson has said, not the prime inspiration - Berners-Lee had been working on both SGML document formats and other forms of hypertext for at least three years before the book was published, and originally intended HTML and HTTP only as a means for scientists to share research papers with something approximating citations).

Actually, much of the confusion about it is that it is a combination of many ideas, most of which would seem unrelated to most people. This is where both the strength, and the weakness, of Nelson's vision lies, in that people rarely see the connections he does - and connections are at the heart of his ideas.

And yes, Nelson (like myself) has ADHD. Quite severely, in fact. He sees it as a strength rather than a problem, but in terms of getting support for the project, and seeing it through to the end, it has been crippling - which is unfortunate, because it is also what led him to the project in the first place. Perhaps more than anything, Xanadu was Ted's attempt to find a prosthesis with which to grapple with the breadth of his interests and ideas, a breadth borne of his 'butterfly mind'.

Anyway, all of this is prologue. The point is that while Project Xanadu encompasses a wide number of ideas, many of which have since spread out into the computer field in separate pieces and in distorted forms, one piece that hasn't caught on is the idea that Files Are Evil.

OK, a bald statement like that, so typical of Ted, is going to take some explaining. I know, I know, I promise I will get to the point eventually, but digressions are a big part of all of this, and this probably won't be the last one here.

What Ted means when he talks about 'the tyranny of the file' is that the conventional, hierarchical model of files as separate entities, which need to be kept track of both by the file system and the user, is a poor fit for how the human mind actually works with information, and in particular, that it obscures the relationships between ideas. This applies to both conventional file handling, and to file-oriented hypertext/hypermedia systems like the World Wide Web.

It is here that Ted loses most people, because to most people he is mixing up different levels of things - and Ted would even agree, but his view of what those levels are is quite different from the one most people are familiar with. Basically, where most people see separate documents, which might refer to each other through citations or hyperlinks but are fundamentally separate, he sees swarms of ideas which can be organized in endless ways and viewed through many lenses, of which 'documents' are just one possible view, and not an especially fundamental one at that.

Now, this will seem somewhat familiar to those of you who have some experience with relational databases, and in fact Ted took a look at RDBMS ideas in the late 1990s, concluding that they were on the right track, but still blinkered by their assumptions about what data 'really is'.

To his eyes, there is no 'really is'. He views information as a continuum - what he calls a docuverse - and his primary frustration is that everyone else is (by his estimation) trying to impose their own ideas of what the pieces of that continuum are, rather than letting them float free for anyone to view as they choose. He sees Xanadu as an attempt to approximate that free-floating continuum - he is trying to reduce the amount of inherent structure in order to make variant structures easier to find.

Getting these ideas across is really, really difficult, especially since (again, like myself) he often leaves the best parts in his own mind, making it look like he's jumping all over the place and skipping steps.

He does that, too, but most of that impression comes from things he has so well set in his own mind that he forgets other people haven't heard them yet. This is a trap that is far too easy for a visionary to fall into, and while he is aware of the problem and does strive to avoid it, it is one which is hard to notice until it is brought to one's attention - and sadly, few have had the patience to do so.

Moving on to the next post, which discusses the back-end and front-end part of Xanadu, which I need to gloss a bit before explaining how this all ties into my language ideas.
Last edited by Schol-R-LEA on Tue Nov 28, 2017 2:05 pm, edited 8 times in total.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: [language design] My working notes on Thelema

Post by Schol-R-LEA »

OK, let's go over how Xanadu is intended to work, and how I intend to apply the ideas, if not always the methods, thereof.

As I already said, the point of Xanadu is to replace files. Much of what Nelson & Co. are doing can be done with existing file systems, and in fact several iterations of the project ran on top of existing file systems or RDBMSes, but doing it with existing files involves a lot of ad-hoc work. Xanadu, at least in its file-system-based forms, is essentially a library for doing just that, though part of how it does so is to create what amounts to a separate database system on top of the existing one - which is why the real eventual goal was to implement it in a more stand-alone form, working on the media directly rather than through the file system.

In the 1988 and 1993 designs of Xanadu (and presumably in OpenXanadu, though despite the name not all of the code is exposed yet, AFAICT), there is supposed to be a Back End and a Front End, with the Front End/Back End (FEBE) protocol between them. The BE manages the storage and retrieval of data fragments, both locally and remotely, keeps track of what is where and which things are stored or mirrored locally, and maintains coherence across distributed storage and caching. It is a hairy piece of work, and while its operations are secondary to the goals of Xanadu, it is at the heart of the implementation of those goals.

My understanding is that the FE handles the decisions about which fragments to request at a given time. Note that the FE is a front end to the applications, not some equivalent of a browser - it is primarily an API for the BE, though it does do some management of the data views as well.

Presentation to the user is up to the applications themselves, and to the display manager, which is supposed to permit various ways of displaying connections between applications to the user. At this point, you can probably see part of what most people get confused by in all of this: there is no single 'browser' anywhere in the design.
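
To make the division of labor a bit more concrete, here is a minimal sketch in Python of how a BE and an FE might be modelled. The names (BackEnd.store(), FrontEnd.view(), and so on) are mine, purely for illustration - this is not the actual FEBE interface, and the real BE does far more (encryption, mirroring, coherence) than this toy does:

Code:

# Minimal sketch of the Back End / Front End split; all names are hypothetical.
import uuid

class BackEnd:
    """Stores immutable data fragments and hands out permanent addresses."""
    def __init__(self):
        self._store = {}              # permanent address -> stored bytes

    def store(self, payload: bytes) -> str:
        address = str(uuid.uuid4())   # permanent, location-independent address
        self._store[address] = payload
        return address

    def fetch(self, address: str, start: int = 0, end=None) -> bytes:
        # Serve only the requested span, not the whole datum.
        return self._store[address][start:end]

class FrontEnd:
    """The API layer applications talk to; decides which fragments to request."""
    def __init__(self, backend: BackEnd):
        self._be = backend

    def save(self, payload: bytes) -> str:
        return self._be.store(payload)

    def view(self, spans):
        # A 'view' here is just a sequence of (address, start, end) slices.
        return b"".join(self._be.fetch(a, s, e) for a, s, e in spans)

fe = FrontEnd(BackEnd())
addr = fe.save(b"Hello, docuverse!")
print(fe.view([(addr, 0, 5)]))        # b'Hello'

The thing to notice is that the applications only ever see permanent addresses and spans; where the bytes physically live is entirely the BE's business.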

Basically, when a new datum is created - whether it is a 'text document', a 'spreadsheet document', a 'saved game record', an 'image', or what have you - the FE passes it to the BE, which encrypts it in some way and then writes it to some storage medium - possibly as part of a journal that contains data from several other applications and users.

Along with the data, the FE passes the BE information about the datum and its source. If it is part of a larger 'document' - which will usually be the case - this includes information about the document, including a link to the address of any related data and a description of how they are related. For example, a 'word processing' application might pass a link to the datum which was, at the time of editing, the immediate predecessor of the datum being stored, noting that the new datum was (again, when created) the next item in the larger document.

The BE catalogs each recently written datum according to the format the datum is in, its size, the user who created it, the local date and time of creation, the application it originated in, and the encryption type - all things that a conventional file system may or may not record - but also the publication status (and later, the publication history), the current (and later, previous) owners/maintainers of the datum, the current location it is stored in, and whether to mirror it elsewhere (which is the default for most things).
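
Roughly speaking, a catalog entry might look like the record below. This is only my guess at a plausible shape - the field names, and the 'xu://' address scheme, are mine, not Xanadu's actual schema:

Code:

# Hypothetical catalog entry for one datum; field names are guesses, not Xanadu's schema.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CatalogEntry:
    address: str                      # permanent, location-independent link ('xu://' is made up)
    data_format: str                  # e.g. "text/plain"
    size: int
    creator: str
    created_at: datetime              # local date and time of creation
    origin_application: str
    encryption: str                   # can differ per datum, or per copy
    publication_status: str = "unpublished"
    owners: list = field(default_factory=list)
    storage_locations: list = field(default_factory=list)   # ephemeral, BE-internal
    mirror: bool = True               # mirroring elsewhere is the default

entry = CatalogEntry(
    address="xu://1234",
    data_format="text/plain",
    size=17,
    creator="some-user",
    created_at=datetime.now(),
    origin_application="editor",
    encryption="aes-256-gcm",
)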

Up to this point, it looks normal. Here is where it changes course a bit.

The BE generates a permanent address link for the datum, one which is independent of its current location in storage. This is a key point, because the storage location itself is only ephemeral as far as the system is concerned - while the datum is meant to be treated as immutable, and the parent copy should never be overwritten, the actual physical image of the datum in the storage medium isn't the datum itself. This is also why, for networked systems, automatic mirroring is the default (and why its being encrypted - and the fact that the encryption methods can vary from datum to datum, or even from copy to copy of the same datum - is important).

A large part of this is to abstract away, from the perspective of the FE, the applications, and the user, the process of storing, transmitting, mirroring, and caching the data. As far as everything outside of the BE is concerned, the datum is (or should be) immutable and eternal, approximating a Platonic essence of the idea it encodes. The reality is obviously more complicated, but the system is meant to bend over backwards to maintain that illusion, across the entire 'docuverse' straddling the network.

(So far, it hasn't quite managed this, and perhaps never will, but in terms of its goals, it goes further than any other system that I know of.)

Now, you may have noted that I haven't talked about links, hyper or otherwise, yet. This is where things go even further out of the norm, because the Xanadu idea of a 'hyperlink' has nothing much at all to do with the hyperlinks of things like the WWW.

In Xanadu, there are several types of links, most of which are not directly related to how the datum is presented to the user. The particular kinds of links under consideration right now might be called 'resolution links', which describe the physical location(s) of the data, and 'association links', which store how two or more data relate to each other (these aren't the terms used by the Project, but explaining their terms would take hundreds of pages, and I only know a fraction of the terminology myself). The former are ephemeral, relating to the specific physical storage, and are stored as the equivalent of a FAT or an i-node structure, while the latter are permanent, and have their own resolution links when they are themselves stored.

Some of the types of association links are:
  • 'span links', which refer to a slice or section of a datum, allowing just the relevant sections to be referenced in documents or transferred across a network, without having to serve the whole document - the 'whole document' is itself just a series of different kinds of association links.
  • 'permutation-of-order links', which are used to manipulate the structure of the document, creating a view - or collection of views - which can itself be stored and manipulated. This relates to the immutability of data - rather than changing the data when updating, the FE permutes the order of the links that make up the 'document' or view, and passes that permutation to the BE to record. This, among other things, serves both as persistent undo/redo and as version control.
  • 'structuring links', which describe the layout of the view independent of the data and the ordering thereof. This acts as out-of-band markup, among other things - the markup is not part of the datum itself.
  • 'citation links', which represent a place where a user wanted to record a connection between two ideas. A citation link associates bidirectionally, and has its own separate publication and visibility, partially dependent on that of the data - an application, and hence a user, can view a citation link iff they have permission to view both the citation and all of the data it refers to. There may also be 'meta-citations' which aggregate several related citations, but I don't know if that was something actually planned or just something discussed - since citations are themselves data, and all data are first-class citizens, such a meta-citation would just be a specific case of a view.
It is important to recall that the 'views' in question are views presented to the applications and the display manager. They can then organize the actual user display based on those views into the data as needed. The same data - or even the same views - may be shown as part of a 'text document' by one application, as a set of spreadsheet cells by another, or composed with some image in yet another. This is why markup is out-of-band, and why structuring links applied to a given set of data are stored for later use by the applications.
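
If it helps, here is one hypothetical way the link kinds above (plus the ephemeral resolution links) could be encoded. The class names and fields are my own shorthand for this post, not the Project's actual structures or terminology:

Code:

# Hypothetical encodings of the link kinds described above; not the Project's real structures.
from dataclasses import dataclass

@dataclass(frozen=True)
class SpanLink:                 # a slice of a datum
    datum: str                  # permanent address of the datum
    start: int
    end: int

@dataclass(frozen=True)
class PermutationLink:          # a reordering of the links that make up a view
    view: str                   # permanent address of the view being permuted
    order: tuple                # addresses of the member links, in their new order

@dataclass(frozen=True)
class StructuringLink:          # out-of-band markup applied to a view
    view: str
    markup: str                 # e.g. "heading", "emphasis" - not part of the datum

@dataclass(frozen=True)
class CitationLink:             # bidirectional association between two data
    left: str
    right: str
    visible_to: frozenset       # visibility is separate from that of the data

@dataclass
class ResolutionLink:           # ephemeral: where the bytes happen to live right now
    datum: str
    locations: list

# A 'document' view is itself just an ordered collection of such links.
document = [SpanLink("xu://datum/7", 0, 120), SpanLink("xu://datum/9", 40, 64)]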

There are still other links: for recording the history of the datum's ownership and publication status; for connecting a data format to one or more means of interpreting the format or transposing it into another format; for indirection (to allow for updating of views - since most links available to the FE are immutable, these allow for the equivalent of a VCS repo's 'HEAD' branch, letting applications fetch whatever the latest version of a document is, and separating 'currently published' from 'previously published'); for tracking where copies of a given datum can be found, for the purposes of caching and Torrent-like network distribution; and so forth - but most of those are only for use internally by the BE.

When a new datum is created as part of some document, a new association link is created to connect it to that document, and this link is then passed back to the FE for use by the application. The FE then creates a permutation link for the document, incorporating the datum into the document's link traces, which is in turn passed back to the BE for storage.

Moving on...
Last edited by Schol-R-LEA on Tue Nov 28, 2017 2:26 pm, edited 2 times in total.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: [language design] My working notes on Thelema

Post by Schol-R-LEA »

OK, so I've covered all this stuff about how xanalogical storage is meant to work, so I can pop that off the stack and talk about the plans for my languages, or more particularly, for my compiler and toolchain.

Basically, my plan is to have an editor that performs the lexical analysis on the fly, and saves the programs not as text, but as a link trace of meta-tokens. The lexical analyzer would still be available as a separate tool, which the editor would be calling as a library, so the compiler could potentially use source code from other editors, but that's going in a different direction than I have in mind.

What is a meta-token, you ask? Well, in part it is a token in the lexical analysis sense - a lexeme which the syntax analyzer can operate on. The 'meta' part comes from the fact that the datum it references does not need to be a specific text string - it can, in principle at least, be any kind of data at all, provided that the syntax analyzer can interpret it in a meaningful way.

Also, a meta-token may be associated with more than one value, allowing for alternate representations of the syntactic structure - provided that the syntax analyzer agrees that the different representations have the same meaning. So a meta-token is really a way of associating one or more representations with a syntactic meaning that was fixed by the syntax analyzer when the meta-token was created.
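
As a rough sketch (in Python, purely for illustration - in practice this would be stored as xanalogical links, and none of these field names are settled):

Code:

# Hypothetical meta-token: a syntactic role fixed when the token is created,
# plus any number of interchangeable representations.
from dataclasses import dataclass, field

@dataclass
class MetaToken:
    token_id: str                   # permanent link to this token
    role: str                       # e.g. "identifier", "integer-literal"
    meaning: object                 # whatever the syntax analyzer resolved it to
    representations: dict = field(default_factory=dict)
    # representations maps a presentation style to a rendering, e.g.
    # {"ascii": "i", "verbose": "loop_counter", "display": some image link}

counter = MetaToken(
    token_id="xu://tok/42",
    role="identifier",
    meaning=("binding", "loop-counter"),
    representations={"ascii": "i", "verbose": "loop_counter"},
)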

I expect that you can see why I am talking of a 'syntax analyzer' rather than 'parser'.

Now, this does complicate the editing a bit - there has to be a way to differentiate between 'change the name/representation of this particular variable globally' and 'change from this variable to a new one or a different one, just for this particular part of the program', among other things. But it also opens up a lot of possibilities that would be a lot less feasible with the conventional 'plain text' model of source code.

For example, if the editor treats some parts of the program structure as 'markup' rather than 'syntax' - indentation, newlines, delimiters for things like string literals or the beginning and ending of lexical blocks - then the same code could be edited in multiple 'programming languages' without needing an explicit translator: the program is stored as a syntax tree of meta-tokens anyway, so the representation of the program is separate from the 'Platonic essence' of the program the code describes. The source code itself is just one particular presentation of the program.

Mind you, it would still be in 'the same' language in the sense that the actual syntax would be the same, just shown in different ways, so it wouldn't quite be all things to all programmers, but it would make things a lot more flexible. And if two analyzers for different language syntaxes had some or all of their possible meta-tokens in common, it would drastically change certain code-level interop issues.
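
A toy example of what I mean by keeping presentation separate from the tree: the same little tree of meta-tokens rendered under two surface styles, one with block delimiters and one with indentation only. The tree shape and style names here are invented just for the demonstration:

Code:

# Render the same (role, representations, children) tree under two presentation styles.
def render(node, style, depth=0):
    role, reps, children = node
    text = reps.get(style, reps["ascii"])          # fall back to the plain rendering
    if not children:
        return "    " * depth + text
    body = "\n".join(render(c, style, depth + 1) for c in children)
    if style == "braces":
        return "    " * depth + text + " {\n" + body + "\n" + "    " * depth + "}"
    return "    " * depth + text + ":\n" + body    # indentation-only style

tree = ("define", {"ascii": "define", "braces": "def"}, [
    ("call", {"ascii": "print"}, []),
])
print(render(tree, "braces"))   # block-delimited rendering
print(render(tree, "ascii"))    # indentation-only rendering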

It also solves some of the dichotomy between conventional programming languages and 'visual' ones, though the problem of the Deutsch Limit would still exist for any given visual presentation.

On a side note, and as a preview of where I am going with this, the lexer could also add citation links to make it an annotated AST, adding to the ability to pass information about the program to the semantic analyzer and code generator. This can allow for additional analysis of things like, say, whether certain optimizations could be applied in the generated executable.

Oh, and because the final executable is also stored xanalogically, there is no reason it has to produce a single executable image. It can create multiple whole executable images for different architectures; branched executables for variants of the same architecture and system hardware, which the loader could select from; even 'templates' whose gaps the loader fills in at load time - the loader would only need to fetch those parts it needed, possibly along with additional information it could use to further tweak the executable by means of runtime code synthesis (surprise! you knew I was going to bring that up somewhere in all of this, didn't you?). Also, the executables would be cached on systems other than the origin, and only permanently stored if the user or an application chooses to mirror them, so updating and backtracking isn't especially difficult (which is also a reason why everything transferred between systems is encrypted). And any node currently mirroring or caching something can be used by the other nodes as the equivalent of a Torrent site for published programs, if the administrators choose to allow it and according to the limits they set (but only for users who have rights to use them - I am sure there would be a way around that, and it raises some hairy issues about licensing, regulation, and compliance, but that would have to be dealt with after the experimental stage of all this).
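
For a sense of what the loader might see, here is a hypothetical manifest for a xanalogically stored program. The 'xu://' addresses, field names, and the idea of named 'holes' in a template are all invented for the example, not part of any finished design:

Code:

# Hypothetical manifest: the loader picks whichever branch matches the running
# system and fetches only those parts; a template's holes get filled at load time.
manifest = {
    "program": "xu://prog/example",
    "images": {
        "x86-64":  {"kind": "whole",    "link": "xu://img/0001"},
        "riscv64": {"kind": "whole",    "link": "xu://img/0002"},
        "arm64":   {"kind": "template", "link": "xu://img/0003",
                    "holes": ["cache-line-size", "simd-width"]},
    },
}

def select_image(manifest, arch):
    entry = manifest["images"].get(arch)
    if entry is None:
        raise LookupError("no image or template for " + arch)
    return entry

print(select_image(manifest, "arm64")["kind"])   # template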

Just one more post, I promise. I am finally ready to explain how all of this ties into types and dispatch.
Last edited by Schol-R-LEA on Tue Nov 28, 2017 2:44 pm, edited 17 times in total.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: [language design] My working notes on Thelema

Post by Schol-R-LEA »

OK, some of you can probably see where this is going, but some of you are probably completely lost (some may be both, in varying ways). Don't worry, this is the payoff for all of that.

Aside from the whole 'abstract syntax tree of meta-tokens' form for programs, the xanalogical approach opens up another possible avenue for programming language design. Remember how I said that using permutation links, rather than changing the stored values, allows for nearly unlimited undo/redo, and (here's the key part) acts as a form of version control? And remember what I said about indirection links allowing for updatable views? Here's the important part: you can have multiple indirection links to different parts of the development history.

This gives you things like branching, forking, and staging, practically for free, once you have xanalogical storage.

OK, so getting to there is anything but free, but bear with me here.

If the compiler is working from an indirection link to the stored AST, and the AST itself is mostly just a tree of links to the meta-tokens, then the compiler can keep a separate record of which warnings, constraints, and optimizations to apply when compiling the program, and link that to the indirection handle.
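
Something like this, say: a pair of hypothetical tables, one mapping branch names (indirection handles) to AST links, and one mapping the same handles to the compiler's own settings. All of the names here are invented:

Code:

# Hypothetical: an indirection handle names 'the current AST of this branch',
# and the compiler keeps its own settings record keyed by the same handle.
branches = {
    "HEAD":         "xu://ast/rev-41",
    "experimental": "xu://ast/rev-40",
}

compiler_settings = {
    "HEAD": {
        "warnings":      ["unused-binding", "shadowed-name"],
        "constraints":   ["types-must-be-declared"],
        "optimizations": ["inline-small-procedures"],
    },
    "experimental": {
        "warnings":      [],
        "constraints":   [],
        "optimizations": [],
    },
}

def compile_branch(name):
    ast_link = branches[name]            # follow the indirection link to the AST
    settings = compiler_settings[name]   # settings live beside it, not inside the AST
    return ast_link, settings

print(compile_branch("HEAD"))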

Back to the compiler annotations. Did you notice that these can - once again - be anything that the compiler might have a use for? They can serve to link to code documentation, design documents, UML diagrams, whatever. And if the compiler has hooks that allow it to, say, apply a constraint based on the documentation - a reminder to update the documentation, say, or a constraint requiring a class declaration to match the structure defined in UML - then it could use that to change the errors, executable output, or other results.

Or, just perhaps, it could be used to apply type constraints on code which doesn't explicitly declare types.

Now, I would still want to be able to add explicit typing to the program source code, especially for things like procedural dispatch (where you need it in order to have the program call the right procedure), but if it can be represented as a form of compiler constraint, well, there's no reason the code editor can't hoist the type declarations out and save them as annotations, right?

That would let you, say, write most of the code without worrying about typing when you are first working out things, then progressively add more stringent constraints as you stage from (for example) 'development-experimental', to 'development', to 'unit testing', to 'integration', and so forth up to 'release'.

And the editor and compiler together could be configured to enforce that you can only edit the program code in either 'development' or 'development-experimental', while still permitting you to add type predicates later on. Oh, they couldn't stop you from creating a different permutation in some other application, but they could simply refuse to work with that alternate permutation.
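
In other words, something along these lines - a hypothetical staging policy, with the stage names taken from the paragraphs above and the constraint names invented:

Code:

# Hypothetical staging policy: which constraints are enforced, and whether the
# code itself may still be edited, at each stage.
stages = [
    ("development-experimental", {"edit": True,  "constraints": []}),
    ("development",              {"edit": True,  "constraints": ["arity-checked"]}),
    ("unit testing",             {"edit": False, "constraints": ["arity-checked",
                                                                 "types-inferred"]}),
    ("integration",              {"edit": False, "constraints": ["arity-checked",
                                                                 "types-declared"]}),
    ("release",                  {"edit": False, "constraints": ["arity-checked",
                                                                 "types-declared",
                                                                 "no-warnings"]}),
]

def policy_for(stage_name):
    for name, policy in stages:
        if name == stage_name:
            return policy
    raise KeyError(stage_name)

assert policy_for("release")["edit"] is False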

So, now you know what I have in mind. Will it work? I have no idea; probably not, if I am really honest about it. But I should learn a lot about what does and doesn't work along the way, right?
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Wajideus
Member
Posts: 47
Joined: Thu Nov 16, 2017 3:01 pm
Libera.chat IRC: wajideus

Re: [language design] My working notes on Thelema

Post by Wajideus »

Schol-R-LEA wrote:
    Some of the types of association links are:
    'span links', which refer to a slice or section of a datum, allowing just the relevant sections to be referenced in documents or transferred across a network, without having to serve the whole document - the 'whole document' is itself just a series of different kinds of association links.
    'permutation-of-order links', which are used to manipulate the structure of the document, creating a view - or collection of views - which can itself be stored and manipulated. This relates to the immutability of data - rather than changing the data when updating, the FE permutes the order of the links that make up the 'document' or view, and passes that permutation to the BE to record. This, among other things, serves both as persistent undo/redo and as version control.
    'structuring links', which describe the layout of the view independent of the data and the ordering thereof. This acts as out-of-band markup, among other things - the markup is not part of the datum itself.
    'citation links', which represent a place where a user wanted to record a connection between two ideas. A citation link associates bidirectionally, and has its own separate publication and visibility, partially dependent on that of the data - an application, and hence a user, can view a citation link iff they have permission to view both the citation and all of the data it refers to. There may also be 'meta-citations' which aggregate several related citations, but I don't know if that was something actually planned or just something discussed - since citations are themselves data, and all data are first-class citizens, such a meta-citation would just be a specific case of a view.

The first 3 things here sound like filter / map / reduce / sort operations.

I do agree that information is more of a continuum than anything. You could probably even build a traditional filesystem on top of this sort of thing using citation links.

Schol-R-LEA wrote:
    Basically, my plan is to have an editor that performs the lexical analysis on the fly, and saves the programs not as text, but as a link trace of meta-tokens. The lexical analyzer would still be available as a separate tool, which the editor would be calling as a library, so the compiler could potentially use source code from other editors, but that's going in a different direction than I have in mind.

    What is a meta-token, you ask? Well, in part it is a token in the lexical analysis sense - a lexeme which the syntax analyzer can operate on.

This actually reminds me of when I programmed in TI-BASIC. The editor didn't operate on characters like a normal text editor; things like "Asm(", "Input", "Disp", etc. were tokens selectable from a menu. My guess is that they did this so that they could skip lexical analysis.

Schol-R-LEA wrote:
    Remember how I said that the use of permutation links rather than changing the stored values allowed for nearly unlimited undo/redo, and (here's the key part) acted as a form of version control? And remember what I said about indirection links allowing for updatable views? Here's the important part: you can have multiple indirection links to different parts of the development history.

"Permutation Links" sounds like something I did in a configuration format I designed a while back for a game engine:

Code:

Class Instance-Variant {
    Class Field: Value;
}
Where you can use an asterisk for the classname to reference an instance (rather than define a new one), classnames are optional, and the hyphen / variant part is optional. A practical use would be something like:

Code:

*MyWindow {
    Text: "My Window";
    Size: 320px, 240px;
}

*MyWindow-Linux {
    Text: "My Window For Linux";
}

*MyWindow-Linux-GNU {
    Text: "My Window For Pedantic People";
}
I basically designed it because I wanted to have a flexible configuration and object definition format for the resource compiler that supported fallback chains.
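
Roughly speaking, the fallback chain resolves like this (a Python sketch of the lookup order, using the names from the example above; it's just an illustration, not the resource compiler's actual code):

Code:

# Look up the most specific variant first, then strip "-suffix" parts until
# something matches.
instances = {
    "MyWindow":           {"Text": "My Window", "Size": ("320px", "240px")},
    "MyWindow-Linux":     {"Text": "My Window For Linux"},
    "MyWindow-Linux-GNU": {"Text": "My Window For Pedantic People"},
}

def lookup(name, field):
    while True:
        value = instances.get(name, {}).get(field)
        if value is not None:
            return value
        if "-" not in name:
            raise KeyError(field)
        name, _, _ = name.rpartition("-")   # fall back to the less specific variant

print(lookup("MyWindow-Linux-GNU", "Text"))   # My Window For Pedantic People
print(lookup("MyWindow-Linux-GNU", "Size"))   # falls back to MyWindow
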
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: [language design] My working notes on Thelema

Post by Schol-R-LEA »

So, matters of state... no, not that sort, ones relating to program state, and manipulating the state of the data.

At the lowest level, both data and code are just, well... signals. Electrical impulses. Everything we do in programming is about changing those impulses, using some of the impulses to direct the changes in some of the others. The state of those impulses - in both the CPU and the memory, as well as in all the peripherals - at a given CPU clock cycle is inherently global, but we need to be able to separate them conceptually, if for no other reason than that it would be impossible for the human brain to keep track of it all.

To this end, we create abstractions, descriptions of what we want the electrical state to represent. These do not exist in the machine as such, but in how we interpret the state transformations. It is for the sake of these abstractions - representations of data such as numbers, letters, and so forth, and representations of the transformations applied to them - that we created the hardware in the first place.

Setting aside (for now) the abstractions of system state into things such as processes, divisions such as protected/kernel space, user spaces, executable versus non-executable regions, and so forth, we can consider how we manage state in different programming models.

'Machine code' abstracts the code and data as numerical values; it is still an abstraction, but one that very directly represents the state (at least within the given process, if such higher-order abstractions are in use). The difference of one from the other is held entirely in the mind of the programmer and can, with sufficient cleverness, be overlapped in various ways. However, this is an excessively demanding and error-prone method, just from the mental burden alone, and was dropped rapidly by all but the most die-hard 'real programmers'.

Assembly language is mostly just a more comprehensible representation of the same code structure as machine code, but it does introduce some new abstractions, most notably the mnemonics themselves, which are more memorable than individual numeric values (as well as permitting the assembler to do the hard work of computing the instruction fields, something often overlooked by those who don't know how frustrating working those out in hex or octal can be), and labels, which abstract the locations of different sections of code and data as something to be computed by the software.

While additional abstractions, in the form of equate directives, macros, and so forth (some of which, such as stacks and subroutines, would later be assisted by the design of the hardware) are also common features of assemblers, the fundamental abstraction is really the label: it not only removes the need to compute addresses, it allows the separation of code and data as part of the program, rather than just the programmer's mental model.

From there we go to higher level languages in great variety, and with many different models of what the program 'really is' conceptually (as opposed to what it 'really is' in the system state). Most of these paradigms are really all just different approaches to managing different aspects of how the program state changes over time, especially the state of the data.

Iterative programming is just the same sort of global (with regard to the given process) view common in assembly language; while languages such as the original FORTRAN and COBOL may have had some ways of separating parts of the program into subroutines, this was viewed as a way to reduce repetition and the memory footprint.

It was only with procedural programming, starting with LISP 1.0 and Algol-60, that subroutines started to be seen as a way of structuring programs for the sake of abstraction, rather than just as a pragmatic solution to limited system memory. Even in these languages, the idea took time to evolve, as did the idea of designing flow of control so as to reduce programmer confusion (which is really what was at the heart of the 'structured programming'/'goto-less programming' movement, something that was heavily debated in the 1960s and remains an occasional topic for cranks to rail over even today).

As procedural programming gained ground, the idea of explicitly structuring data (which had previously existed only in COBOL, of all things) by grouping items into 'structures' or 'records' began to appear as well. Some languages arose which focused on a specific organizing principle such as linked lists (Lisp), strings (SNOBOL), or arrays (APL), while others such as PL/I tried to provide a vast number of pre-defined data types, but the mainstream languages mostly followed the lead of Niklaus Wirth with the idea of defining a series of adjacent variable names for the different 'fields' of a record. Despite this, the use of data structures lagged for years, and it remained common into the 1980s to see implicit 'data structures' composed of independent variables which were associated solely by their intended use, or possibly by some naming convention (and you still see this sometimes today - I'm looking at you, ~, and your so-called C compiler).

To this point, the focus had mostly been on partitioning state temporally, in terms of the sequence of operations. This began to change as three new approaches arose focusing on limiting the scope of state changes, so as to reduce the potential for 'side effects' - changes in state aside from those explicit to a given operation. All of these began in the late 1960s and early 1970s, but for various reasons, are often the cause of considerable confusion even now.

The first of these approaches was abstraction through isolation, which was done by tying operations on data to the data types being operated on. This was variously named 'abstract data types', 'actors', or most commonly today, 'object-oriented programming'; while each of those terms did reflect a somewhat different view of what was being done, they all involved restricting access to the data to a limited set of operations. The goal was to treat data structures as unitary objects, and only allow the operations defined as part of the type (or class) to alter the data directly. This proved useful for problems involving complex, but largely static, data types, where the structure of the data was well known ahead of time.

While object-oriented programming arose from a particular approach that is largely forgotten today (that of treating the data objects as physical ones), by adding 'inheritance' to abstract types, it furthered this abstraction by allowing categories of data structures to be defined sharing some properties and operations, which allowed sets of common interfaces to evolve. This made it very effective for dealing with groups of related data structures which had similar, but varying, data fields or operations, putting the focus on the conceptual organization rather than the layout of individual types. It was useful for certain classes of problems - particularly simulations, process abstractions (in the Actor model) and, most significantly of all, graphical systems such as GUIs and games.

The second approach was to abstract by means of immutability, in what is now called applicative programming. This focuses on treating the data as immutable: anything that would alter the state of the program instead creates a new data structure with the desired state; conceptually, at least, the existing data structure exists in perpetuity, effectively stateless - only the process itself has state. The later functional programming took this further still, by creating the illusion that even the process itself was stateless; the results of a computation would be treated as something that had always been there, with the application of the function only revealing it, rather than computing it.
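
As a trivial illustration (in Python, just because it is compact), 'updating' in this style produces a new structure and leaves the original untouched:

Code:

# Applicative style: build a new structure with the desired state instead of
# mutating the old one.
original = {"name": "thelema", "version": 1}
updated = {**original, "version": 2}      # new structure with the new state

assert original["version"] == 1           # the old value still exists, unchanged
assert updated["version"] == 2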

This proved to be very useful in simplifying the conceptual workings of programs, especially (ironically enough) those which need to build fluid, evolving data structures whose components could not be determined statically prior to the process running. This advantage was counterbalanced by the problems that arose when trying to work with dynamic data input, such as in just about any kind of user interaction, and while solutions were developed, they proved difficult to explain to programmers not already immersed in functional methods.

Finally, there is the method of focusing on defining a set of goals rather than a set of operations. This declarative approach took three main forms: defining logical assertions, defining relationships between collections of data structures, and defining a series of constraints to be fulfilled. The most successful of these was the Relational approach, though its association with 'databases', and specifically with the tabular format often used to represent them, has led to it being widely misunderstood even by the majority of its practitioners. This confusion stems in part from the relative isolation from the rest of programming in which RDBMSes evolved, but mostly from the faults and limitations of a particularly wretched version of it being enshrined in the SQL standard - a standard so ill-conceived that calling it 'half-assed' would be a mortal insult to buttocks the world over.

I'm not sure where I am going with this at this point.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.