
Structure editors

Posted: Fri Jul 31, 2015 12:50 pm
by AndrewAPrice
This is probably nothing new to many of you, but at the moment I'm fascinated by the idea of a structured editor:



Let's say you have a grammar specification (like this one for Java). Upon opening a file, the editor can represent it as a syntax tree in memory. As you type, you edit the nodes of the tree directly rather than text, although the tree is pretty-printed on the screen so that it looks like a text editor.
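A minimal sketch of that idea in C, under some loud assumptions: the `Node` type, the two node kinds, and the `folded` flag are all hypothetical and not taken from any real grammar or editor. It only shows that indentation and folding can fall out of walking the tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node kinds for a toy language; not from any real grammar. */
typedef enum { NODE_BLOCK, NODE_STATEMENT } NodeKind;

typedef struct Node {
    NodeKind kind;
    const char *text;          /* header of a block, or the statement itself */
    int folded;                /* non-zero: the editor hides this block's children */
    struct Node *firstChild;
    struct Node *nextSibling;
} Node;

/* Append one indented line to the NUL-terminated string in out (capacity cap). */
static void emit(char *out, size_t cap, int depth, const char *text, const char *suffix)
{
    size_t used = strlen(out);
    assert(used < cap);
    snprintf(out + used, cap - used, "%*s%s%s\n", depth * 4, "", text, suffix);
}

/* Pretty-print n and its following siblings. Indentation comes from the tree
   depth, and a folded block simply skips its children. */
void pretty_print(const Node *n, int depth, char *out, size_t cap)
{
    for (; n != NULL; n = n->nextSibling) {
        if (n->kind == NODE_STATEMENT) {
            emit(out, cap, depth, n->text, ";");
        } else if (n->folded) {
            emit(out, cap, depth, n->text, " { ... }");
        } else {
            emit(out, cap, depth, n->text, " {");
            pretty_print(n->firstChild, depth + 1, out, cap);
            emit(out, cap, depth, "}", "");
        }
    }
}
```

The caller passes a buffer that already holds an empty string. Toggling `folded` on a block and printing again is all that "code folding" needs in this sketch.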

Pros:
- Syntax highlighting comes for free - you know what node you're printing on the screen.
- Code completion comes for close to free - you know what node you're editing - if it's an identifier, suggest declared identifiers.
- Auto-indent/etc comes for free - the code is always pretty printed in the IDE.
- Code folding comes for free - just don't print the children of that node.
- Syntax errors are impossible - the syntax tree has to always be valid.
- Refactoring comes almost for free - you simply walk the tree to find all references to an identifier.
- You could work with languages that are syntactically ambiguous. (I don't know why you would do this though.)
- Potentially really fast coding - since there's only so many possibilities, the editor can suggest what goes next with only a few keystrokes.
- Cool things - like drawing boxes around structures, etc. Like Lamdu does:
Image

Cons:
- Copying and pasting might be interesting. You can only paste code in places where it would syntactically fit.
- You're forcing code to be uniformly formatted (though the formatting could be configured in the IDE.)

Other possibilities:
- 'Strict mode' - only allow the programmer to use identifiers that have been declared somewhere. (This might get annoying, like not letting you delete the declaration of a variable that might be used somewhere.)
- You could store the syntax tree in binary form instead of text. Smaller files and potentially faster compiles, but they wouldn't be human readable in another text editor.

Since I'm designing a language for my operating system, a structure editor (and possibly storing the code on disk in a binary format) might be interesting.
Thoughts?

Re: Structure editors

Posted: Sat Aug 01, 2015 9:18 pm
by alexfru
I don't like some of the cons. I remember it was a pain in the @$$ to modify the static DSP/BIOS configuration in TI's Code Composer Studio. Like, you couldn't swap two IRQ numbers on two ISRs easily because it just wouldn't let you use the same number twice. So you had to first free one value by using a third one in its place, then use the freed value on the other ISR, then come back and replace the third value, which you never needed in the first place, with the second, which just got freed. And if you use up all IRQs, this won't work and you'll probably need to remove one ISR first. I also remember the pain of using the formula (equation?) editor in earlier versions of Microsoft Word. You had to know precisely what you want to enter and you had to enter it very carefully and in a specific order because it was more like a once-writer (or, perhaps, a never-writer) than an editor. It seems to be better now, but I think nothing beats how it was handled in Star Office, where you could enter it as text and immediately see what it would look like rendered. You really don't want to be too strict w.r.t. what the user can enter or edit, how and where. When the user can't get something done with your editor, they'll look for other editors or tools. And if you choose to use a binary format or something poorly readable or editable at the low level, chances are you won't gain much audience if any because there must be an escape way. It looks like this is what the standard UNIX/Linux apps provide. While none is perfect and not always suitable for what you want to do, you can often combine several of them to get your job done. What's your escape plan?

Re: Structure editors

Posted: Sat Aug 01, 2015 10:12 pm
by Rusky
The most interesting structured editors I've seen have gone beyond just editing an AST to enable much higher-level operations. For example, the Subtext research project has something called "schematic tables," which directly show the data flow graph using columns for conditionals.

This allows the editor to show the logic in ways that would be really convoluted and repetitive with text, making code folding a lot more powerful:Image

It lets you see an entire function call at once, instead of stepping through it in a debugger:Image

And my favorite part, it makes complicated logic much easier to read and edit, dynamically tracking gaps and overlaps in which logical cases are handled, adding new slots for you to fill in on the fly, and letting you drag chunks of code horizontally between different sets of conditions for when it should run:Image
Image

This all comes from the schematic tables paper and video.

More examples are from Bret Victor's work: Inventing on Principle, Learnable Programming, and Media for the Unthinkable. He does a similar live non-stepping debugger to Subtext's, live coding (as seen in Apple's Swift Playground, which he inspired), and rewindable instant-playback debugging (as seen in Elm's debugger, which he also inspired). The videos go into a lot more depth and have some really mind-bending ideas that you really need to watch to understand, but which would make editors much easier to use.

Re: Structure editors

Posted: Wed Aug 05, 2015 3:40 pm
by AndrewAPrice
alexfru wrote:I don't like some of the cons. [...] Like, you couldn't swap two IRQ numbers on two ISRs easily because it just wouldn't let you use the same number twice. So you had to first free one value by using a third one in its place, then use the freed value on the other ISR, then come back and replace the third value, which you never needed in the first place, with the second, which just got freed. And if you use up all IRQs, this won't work and you'll probably need to remove one ISR first.
Yeah - that's the 'strict mode' I mentioned, which I also dislike. Duplicate identifiers don't mean an invalid AST structure, and should be allowed. The IDE could still be intelligent and underline them or something, but it shouldn't prevent you from doing something basic like referring to methods before you create them, etc.
alexfru wrote:I also remember the pain of using the formula (equation?) editor in earlier versions of Microsoft Word. You had to know precisely what you want to enter and you had to enter it very carefully and in a specific order because it was more like a once-writer (or, perhaps, a never-writer) than an editor. It seems to be better now, but I think nothing beats how it was handled in Star Office, where you could enter it as text and immediately see what it would look like rendered. You really don't want to be too strict w.r.t. what the user can enter or edit, how and where.
Sounds horrible. I'd hate a 'write-once' editor too. But I don't see how that's relevant to whether we're using a textual or structural editor?
alexfru wrote:When the user can't get something done with your editor, they'll look for other editors or tools. And if you choose to use a binary format or something poorly readable or editable at the low level, chances are you won't gain much audience if any because there must be an escape way. It looks like this is what the standard UNIX/Linux apps provide. While none is perfect and not always suitable for what you want to do, you can often combine several of them to get your job done. What's your escape plan?
The file could be stored in text just by 'pretty printing' it back to text so other text editors can read it. Unless your language was fancy like Rusky was showing - which is what I am kind of getting at (floating comments, etc.) in which case you'd still probably want a textual way to represent it so you can share code online, use the merge features in source code repositories, use pastebins, etc.

Re: Structure editors

Posted: Sun Aug 09, 2015 2:56 pm
by AndrewAPrice
I'm going to experiment with making a structural editor, and a unique programming language that takes advantage of its features.

There is so much flexibility in regards to what becomes possible when source code is no longer limited to text:

- Comments that can be placed to the side or arbitrarily float over the code with arrows pointing to what the comment is referring to.

Image

- Show expressions in mathematical form rather than linear text.

Image

- Rapid input. I still want writing code to be a primarily keyboard activity (no messing with click-and-drag GUIs), but if your cursor is at the module level, then you should be able to type 'f' and it'll automatically add a function, 's' and it's automatically static, etc. Highlighting and copy and paste should work as you expect with a textual editor.

- Intellisense is free as the editor has the symbol table in memory and knows when you're entering an identifier.

- Collaborative editing/revision control - if multiple users are editing the same document, you can send their actions across the network as high level commands.

- The textual representation printed on the screen can be in a form that's hard to parse but easy for humans to read.

- Visual debugging/edit-and-continue. The editor can interpret the AST, with the possibility to edit the AST/symbol table/objects in memory at runtime.

- Fast compile speeds. The compiler can skip the parsing stage, and produce intermediate code straight from the AST.

- Compile time errors are immediately visible during input. Allow incorrect programs during input (such as two symbols sharing the same name, or referring to an undeclared symbol), but the editor knows where the errors are before the programmer attempts to compile the project.

- The language can incorporate graphical elements (arrows, symbols, colours, fonts) to represent different elements that could not be represented easily in a textual language.

Re: Structure editors

Posted: Sun Aug 09, 2015 7:13 pm
by Brendan
Hi,

Just some random comments...

I've been thinking similar things for a while now. Mostly; there are benefits from structured editors (even with "plain text" as the source code file format), and there are benefits to using a binary source code file format (even with a normal/non-structured editor); but by combining both (a structured editor with a binary source code file format) it's more than just the sum of the parts - you get the flexibility to add many more benefits.

Some ideas:
  • The editor can pre-optimise some things and store both the "original" and "pre optimised" versions in the source file. A simple example would be constant folding and propagation - e.g. you might store "x = y * MYCONSTANT/2 + 89", but (assuming "MYCONSTANT = 44") also store "x = y * 22 + 89".
  • The editor can pre-check some things (not limited to syntax). For example, for "x = y" the editor can check that the variables have compatible types.
  • Both the pre-optimising and pre-checking can be done in the background while the user is editing the source code (rather than wasting CPU time while the user is editing and then having to do everything at compile time).
  • The compiler can insert compiled versions of things (e.g. functions) back into the source code (with a list of things that the compiled version depended on), where the editor invalidates any compiled versions if/when anything it depended on is modified. This completely avoids the need for tools like "make" while giving similar "only recompile the parts that changed" benefits (in a superior/finer granularity way).
  • You can build unit tests directly into the source code. For example, you might have the function "foo" plus several unit tests that test that function; and when you're finished editing the function "foo" the IDE might compile and run the unit tests in the background and let you know if/when a unit test fails (without the user doing anything at all, or possibly where the user only presses a "run tests" button).
  • The built in tests can be used in more powerful ways. For example, maybe the user right clicks on a function and selects "show variables over time" and the IDE compiles a default unit test but injects instrumentation into the function so it can gather statistics, and then generates graph/s showing the values in all variables as the function executes.
  • In a similar way; if you allow the user to nominate a test as "representative of expected usage"; then the IDE might be able to compile the test and profile it then add performance hints back into the source code; allowing for seamless "piecemeal profiler guided optimisation" but also allowing the IDE to show the user any hot spots in the code.
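The constant folding example from the first idea could be sketched roughly like this. Everything here is an assumption for illustration: the `Expr` type is hypothetical, "MYCONSTANT" is assumed to have already been replaced by a constant node holding 44 (the propagation step), and the expression is assumed to be grouped as "y * (MYCONSTANT/2) + 89". The sketch folds in place for brevity; to keep the "original" version as well, an editor would presumably fold a copy. Integer overflow is ignored.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL, EXPR_DIV } ExprKind;

typedef struct Expr {
    ExprKind kind;
    long value;              /* used when kind == EXPR_CONST */
    const char *name;        /* used when kind == EXPR_VAR */
    struct Expr *lhs, *rhs;  /* used by the binary operators */
} Expr;

/* Fold constant subtrees in place. Returns 1 if e is now a constant. */
int fold_constants(Expr *e)
{
    assert(e != NULL);
    if (e->kind == EXPR_CONST) return 1;
    if (e->kind == EXPR_VAR) return 0;

    /* Fold both operands first, even if one of them turns out not to be constant. */
    int l = fold_constants(e->lhs);
    int r = fold_constants(e->rhs);
    if (!l || !r) return 0;

    long a = e->lhs->value, b = e->rhs->value;
    if (e->kind == EXPR_DIV && b == 0) return 0;  /* leave it for a diagnostic */

    switch (e->kind) {
    case EXPR_ADD: e->value = a + b; break;
    case EXPR_MUL: e->value = a * b; break;
    case EXPR_DIV: e->value = a / b; break;
    default: return 0;
    }
    e->kind = EXPR_CONST;
    e->lhs = e->rhs = NULL;
    return 1;
}
```

With the tree for "y * (44/2) + 89", only the division node collapses (to 22); the multiplication and addition stay as they are because "y" is not constant.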
Cheers,

Brendan

Re: Structure editors

Posted: Mon Aug 10, 2015 11:01 am
by AndrewAPrice
Brendan wrote:I've been thinking similar things for a while now. Mostly; there are benefits from structured editors (even with "plain text" as the source code file format), and there are benefits to using a binary source code file format (even with a normal/non-structured editor); but by combining both (a structured editor with a binary source code file format) it's more than just the sum of the parts - you get the flexibility to add many more benefits.
I'm tossing between a binary or text format for storage.

A binary format is obviously smaller and you can compress the data. Or we could store the source code as text, which doesn't have to look anything like what the user sees in the editor, only because it would be nice to keep it compatible with existing revision control software - merging two people's changes to a single file, for example.

Some kind of text serialization would be nice for other reasons - pasting code onto this forum or paste bin.

As far as text serialization of a non-textual language we could either store it in the form of data structures (XML, JSON, etc) or alternatively because I want my editor to be keyboard based, we could serialize it in the form of the keystrokes you'd press to reconstruct it. Pasting merely replays the keystrokes.
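A rough sketch of the keystroke-replay idea, with made-up details: the `Module` type and its fixed-size array are hypothetical, and the key meanings ('f' adds a function at module level, 's' makes the newest function static) are just one reading of the rapid-input idea from earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCTIONS 16

typedef struct {
    int isStatic[MAX_FUNCTIONS];  /* one flag per function node in the module */
    size_t count;
} Module;

/* Apply one key at module level. Returns 0 for keys this sketch cannot apply. */
int apply_key(Module *m, char key)
{
    assert(m != NULL);
    switch (key) {
    case 'f':                      /* add a function node */
        if (m->count == MAX_FUNCTIONS) return 0;
        m->isStatic[m->count++] = 0;
        return 1;
    case 's':                      /* mark the newest function static */
        if (m->count == 0) return 0;
        m->isStatic[m->count - 1] = 1;
        return 1;
    default:
        return 0;
    }
}

/* "Pasting" a serialised snippet replays its keys. Returns how many keys were
   applied; replay stops at the first key that cannot be applied. */
size_t replay(Module *m, const char *keys)
{
    size_t applied = 0;
    while (keys[applied] != '\0' && apply_key(m, keys[applied])) applied++;
    return applied;
}
```

One consequence of this design is that the serialised form is only meaningful relative to the editor's key bindings, so changing a binding would break old snippets unless the format is versioned.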

Does this mean I'll also need a web-based code viewer, that you can paste code into from forums and pastebins to see what it actually looks like in the editor?

The precheck is a great idea. I don't want to prevent type errors during editing (it would make it difficult to write code if your code always had to be in a compilable state), but the editor could certainly underline type errors during type-time.

I think the pre-optimisation stuff is interesting. My main motivation isn't faster build times, but if this stuff is made simple to implement, I'll take it.

Yes, I want to get rid of tools like make. You should be able to just open a project in the IDE and click 'build', with an equivalent that can be automated from the command line. You should be able to add dependencies easily in the IDE (either another project in source form, or the built binary library), and when you build a project it makes sure that the dependencies are built and up to date first.

I love the idea of live coding. If we are able to execute the same AST that the editor manipulates, why can't we modify it while it's running in debug/interpret mode? Imagine a game loop and being able to adjust the physics constants as it's running to see what feels right. (Interpreting the AST is in the editor only, the language will be compiled when you actually build it.)
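As a toy illustration (all names here are hypothetical, and a real interpreter would need far more than arithmetic), the point is that the interpreter re-reads the tree on every evaluation, so an edit to a constant node shows up on the next frame:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { N_CONST, N_VAR, N_ADD, N_MUL } Kind;

typedef struct Node {
    Kind kind;
    double value;            /* N_CONST: the literal; N_VAR: the current value */
    struct Node *lhs, *rhs;  /* operands of N_ADD and N_MUL */
} Node;

/* Walk the tree and compute its value. Nothing is cached, so any edit made
   to a node is picked up by the next call. */
double eval(const Node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case N_CONST:
    case N_VAR:
        return n->value;
    case N_ADD:
        return eval(n->lhs) + eval(n->rhs);
    case N_MUL:
        return eval(n->lhs) * eval(n->rhs);
    }
    return 0.0;
}

/* One frame of a toy game loop: store the result of the update expression
   (e.g. velocity + gravity * dt) back into the velocity variable. */
void step(Node *velocity, const Node *update)
{
    velocity->value = eval(update);
}
```

Changing the `value` of the gravity node between two calls to `step` is the "adjust the physics constants while it's running" case; no rebuild is involved because the editor and the interpreter share the same nodes.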

Re: Structure editors

Posted: Mon Aug 10, 2015 2:17 pm
by SpyderTL
MessiahAndrw wrote:As far as text serialization of a non-textual language we could either store it in the form of data structures (XML, JSON, etc)
AHEM!... http://forum.osdev.org/viewtopic.php?f= ... 15#p235888

Not exactly the solution you are talking about, but I've been "promoting" using XML as a "storage" format for a front-end language for quite a while now.

Re: Structure editors

Posted: Mon Aug 10, 2015 2:38 pm
by Rusky
MessiahAndrw wrote:A binary format is obviously smaller and you can compress the data. Or we could store the source code as text, which doesn't have to look anything like what the user sees in the editor, only because it would be nice to keep it compatible with existing revision control software - merging two people's changes to a single file, for example.
This doesn't mean you can't have a text serialization for compatibility, but diffing and merging at the structural level has the potential to be much nicer for version control systems that are capable of it (perhaps with pluggable diff/merge tools for your format(s)).
MessiahAndrw wrote:Some kind of text serialization would be nice for other reasons - pasting code onto this forum or paste bin.

As far as text serialization of a non-textual language we could either store it in the form of data structures (XML, JSON, etc) or alternatively because I want my editor to be keyboard based, we could serialize it in the form of the keystrokes you'd press to reconstruct it. Pasting merely replays the keystrokes.

Does this mean I'll also need a web-based code viewer, that you can paste code into from forums and pastebins to see what it actually looks like in the editor?
Maybe a serialization like LaTeX for mathematical notation- math-centric websites tend to use JavaScript to display that but it's still mostly-decipherable by humans without that. Another comparison that comes to mind is Markdown, where the unprocessed form tries to match patterns people already use in plain-text communication.

Re: Structure editors

Posted: Mon Aug 10, 2015 3:16 pm
by xenos
By the way, for LaTeX there exists a structure editor named LyX. I have never used it (I write LaTeX using Kile, which is a text editor adapted for LaTeX), and I don't know how LyX works under the hood, but it might be worth having a look.

Re: Structure editors

Posted: Mon Aug 10, 2015 7:47 pm
by Brendan
Hi,
MessiahAndrw wrote:
Brendan wrote:I've been thinking similar things for a while now. Mostly; there are benefits from structured editors (even with "plain text" as the source code file format), and there are benefits to using a binary source code file format (even with a normal/non-structured editor); but by combining both (a structured editor with a binary source code file format) it's more than just the sum of the parts - you get the flexibility to add many more benefits.
I'm tossing between a binary or text format for storage.

A binary format is obviously smaller and you can compress the data. Or we could store the source code as text, which doesn't have to look anything like what the user sees in the editor, only because it would be nice to keep it compatible with existing revision control software - merging two people's changes to a single file, for example.
Most of the data is naturally a hierarchical tree, and requires "conversions" to store as a sequence of bytes (a file). The main problem with text is that these conversions (between "hierarchical tree" and "sequence of bytes") are very slow.

For example, I stored my AST (in memory) using a generic structure for each node, sort of like this:


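#include <stdint.h>

typedef struct AST_node {
    uint32_t type;                     /* node kind */
    struct AST_node *parentNode;       /* NULL for the root */
    struct AST_node *firstChildNode;   /* NULL if the node has no children */
    struct AST_node *nextSiblingNode;  /* NULL for the last child */
    int dataLength;                    /* number of bytes in data */
    uint8_t *data;                     /* node-specific payload */
} AST_NODE;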
To serialise it, each node got stored as a "length, type, data" record, and to rebuild the tree structure I added 2 flags to the type field - one for "this node is first child" and one for "this node is last child". This meant I could have generic conversion code (that doesn't care what any node is) that's fast.
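A sketch of how that could look, not the original code: the flag bit positions, the 4-byte native-endian length and type fields, and the function names `ast_write` and `ast_read` are all assumptions, since the post only gives the record layout in words. The struct is repeated so the block stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct AST_node {
    uint32_t type;
    struct AST_node *parentNode;
    struct AST_node *firstChildNode;
    struct AST_node *nextSiblingNode;
    int dataLength;
    uint8_t *data;
} AST_NODE;

/* Assumed flag positions; the post does not say which bits were used. */
#define FLAG_FIRST_CHILD 0x80000000u
#define FLAG_LAST_CHILD  0x40000000u
#define FLAG_MASK        (FLAG_FIRST_CHILD | FLAG_LAST_CHILD)

/* Write n and its subtree in pre-order as "length, type, data" records.
   Returns the number of bytes written, or 0 if buf is too small. */
size_t ast_write(const AST_NODE *n, uint8_t *buf, size_t cap)
{
    assert(n != NULL && buf != NULL);
    uint32_t len = (uint32_t)n->dataLength;
    uint32_t type = n->type & ~FLAG_MASK;
    if (n->parentNode != NULL) {
        if (n->parentNode->firstChildNode == n) type |= FLAG_FIRST_CHILD;
        if (n->nextSiblingNode == NULL) type |= FLAG_LAST_CHILD;
    }
    size_t used = 8 + (size_t)len;
    if (used > cap) return 0;
    memcpy(buf, &len, 4);
    memcpy(buf + 4, &type, 4);
    if (len > 0) memcpy(buf + 8, n->data, len);
    for (const AST_NODE *c = n->firstChildNode; c != NULL; c = c->nextSiblingNode) {
        size_t w = ast_write(c, buf + used, cap - used);
        if (w == 0) return 0;
        used += w;
    }
    return used;
}

/* Read one record at *pos, then its children if the next record is flagged
   "first child". The node's own flags are returned through *flags. */
static AST_NODE *read_node(const uint8_t *buf, size_t size, size_t *pos,
                           AST_NODE *parent, uint32_t *flags)
{
    uint32_t len, type, next = 0, childFlags = 0;
    if (size - *pos < 8) return NULL;
    memcpy(&len, buf + *pos, 4);
    memcpy(&type, buf + *pos + 4, 4);
    if (len > size - *pos - 8) return NULL;
    AST_NODE *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    if (len > 0) {
        n->data = malloc(len);
        if (n->data == NULL) { free(n); return NULL; }
        memcpy(n->data, buf + *pos + 8, len);
    }
    n->type = type & ~FLAG_MASK;
    n->parentNode = parent;
    n->dataLength = (int)len;
    *flags = type & FLAG_MASK;
    *pos += 8 + (size_t)len;

    if (size - *pos >= 8) memcpy(&next, buf + *pos + 4, 4);
    if (next & FLAG_FIRST_CHILD) {
        AST_NODE **link = &n->firstChildNode;
        do {
            AST_NODE *c = read_node(buf, size, pos, n, &childFlags);
            if (c == NULL) break;  /* truncated input: keep what was read */
            *link = c;
            link = &c->nextSiblingNode;
        } while (!(childFlags & FLAG_LAST_CHILD));
    }
    return n;
}

/* Rebuild a tree from size bytes. Returns the root, or NULL if nothing could be read. */
AST_NODE *ast_read(const uint8_t *buf, size_t size)
{
    size_t pos = 0;
    uint32_t flags;
    return read_node(buf, size, &pos, NULL, &flags);
}
```

Neither function looks at what a node means, which is the "generic conversion code" property described above. The sketch does not free the tree or handle byte order, both of which a real file format would need.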

The other problem is that for text you can't index it in a sane way. In memory; I had a "list of types" (with a pointer to the top AST node for the type's definition), and a "list of symbols" (with a pointer to the top AST node for the symbol's definition). In the file I had the same, except the pointers were "offset in file for top AST node" instead. This means that if you want to find the function "foo" in the file you search the symbol table and find the "offset in file for function foo's top AST node" and can just decode the nodes for "function foo" alone (without loading most of the file from disk and without parsing all of the data for all of the nodes).
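The lookup half of that could be as small as the following sketch. The `SymbolEntry` layout and the `symbol_find` name are hypothetical, and a linear search is used only to keep it short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;     /* symbol name, e.g. "foo" */
    uint32_t fileOffset;  /* offset in the file of the symbol's top AST node */
} SymbolEntry;

/* Returns 1 and stores the offset if name is in the index, otherwise 0. */
int symbol_find(const SymbolEntry *table, size_t count, const char *name,
                uint32_t *offset)
{
    assert(name != NULL && offset != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *offset = table[i].fileOffset;
            return 1;
        }
    }
    return 0;
}
```

Once the offset is known, the caller can seek straight to it (for example with `fseek`) and decode only the records belonging to that one definition.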
MessiahAndrw wrote:Some kind of text serialization would be nice for other reasons - pasting code onto this forum or paste bin.
Yes; but that can be done in multiple different ways (e.g. having an "export/import as text" feature in the IDE, having a separate stand-alone utility to convert between file formats and/or extract specific pieces by name, etc). Note that if you do use "plain text" it won't be useful for cut & paste anyway, because it will be littered with things people don't want to see. For example, instead of seeing this in the text file:


int foo (int y) {
    if(y == 0) return 1;          // Just in case
    return y/2;
}
The text file might contain this:


<function><hint_inlined><hint_no_side_effects>int foo (int y)<new_child>if<hint_likely_not_taken><new_child><condition>(y == 0)<last_child><new_child><statement>return 1<comment>Just in case<last_child><last_child><new_child><statement>return y/2<last_child><last_child>
MessiahAndrw wrote:The precheck is a great idea. I don't want to prevent type errors during editing (it would make it difficult to write code if your code always had to be in a compilable state), but the editor could certainly underline type errors during type-time.
For me, an error was mostly the same as a comment (e.g. just a string of "anything" characters) except it had a different value for the node's "type". Of course it makes sense to index the errors so you can quickly find the locations of all errors; and so that (e.g.) when the user defines a new symbol (function name, variable name) you search the list of errors and determine if any of them aren't errors any more. That also means the compiler can quickly generate a list of errors without loading or decoding most of the file.
MessiahAndrw wrote:I think the pre-optimisation stuff is interesting. My main motivation isn't faster build times, but if this stuff is made simple to implement, I'll take it.

Yes, I want to get rid of tools like make. You should be able to just open a project in the IDE and click 'build', with an equivalent that can be automated from the command line. You should be able to add dependencies easily in the IDE (either another project in source form, or the built binary library), and when you build a project it makes sure that the dependencies are built and up to date first.
I want to stream-line everything. Instead of wasting time learning/writing/maintaining makefiles, having a directory of separate source files and separate object files, having a separate linker, and having slow compile times; I want "single file contains source for single executable" with very fast compile times.
MessiahAndrw wrote:I love the idea of live coding. If we are able to execute the same AST that the editor manipulates, why can't we modify it while it's running in debug/interpret mode? Imagine a game loop and being able to adjust the physics constants as it's running to see what feels right. (Interpreting the AST is in the editor only, the language will be compiled when you actually build it.)
There's no reason you can't have a virtual machine that executes/interprets the AST and does allow live coding (in addition to a compiler that generates fast code that can't support live coding).


Cheers,

Brendan

Re: Structure editors

Posted: Tue Aug 11, 2015 7:19 am
by AndrewAPrice
Brendan wrote:I want to stream-line everything. Instead of wasting time learning/writing/maintaining makefiles, having a directory of separate source files and separate object files, having a separate linker, and having slow compile times; I want "single file contains source for single executable" with very fast compile times.
I see no reason why we couldn't do a single file containing an entire project (namespaces and all). Can you think of a reason? The only problem I see is that it won't work easily with version control software like SVN/GIT because they see it as a single binary blob. I'd need to make my own tools for diff, patch, etc.
Brendan wrote:
MessiahAndrw wrote:I love the idea of live coding. If we are able to execute the same AST that the editor manipulates, why can't we modify it while it's running in debug/interpret mode? Imagine a game loop and being able to adjust the physics constants as it's running to see what feels right. (Interpreting the AST is in the editor only, the language will be compiled when you actually build it.)
There's no reason you can't have a virtual machine that executes/interprets the AST and does allow live coding (in addition to a compiler that generates fast code that can't support live coding).
This would be relatively easy to implement if the IDE already has the entire syntax tree in memory.

You probably know what I'm trying to accomplish here. I've already had successful experiments making my own language, compiler, and virtual machine. When I talk about getting rid of threads and using my own task/RPC based execution model, it's something that could be done simpler in a language that natively has these constructs.

Since I'm already doing my own thing (language, compiler) and am deep down the rabbit hole, I may as well have my own much simpler build system. If I'm doing all that, I should integrate it into "Perception Studio" that comes bundled with my OS - a full editor, build system, dependency system, deployment system, etc. that all libraries and applications for my OS are built in. I don't know if I'm crazy, but I can dream.

Re: Structure editors

Posted: Tue Aug 25, 2015 12:56 am
by onlyonemac
I was thinking of writing something like this for blind users such as myself, as editing structured code varies from "difficult" to "challenging" when you can only hear one line at a time and have to keep all the other lines of code in memory and remember which nested block you are in. So something like: use the arrow keys to navigate, press one key to enter a block and another to exit, press another key to edit the current line, another to read the opening statement of the current block (e.g. if you're in a "while" loop it will read out the "while" statement at the start of the block), and more keys to work "intelligently" with things like "if" statements, e.g. to edit an "if" statement not just as a line of code but in terms of its actual conditional parts, if that makes sense.
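The navigation commands could map onto a tree cursor roughly like this. It is only a sketch: the `Node` type and the `nav_*` names are made up, and the speech output itself is left out (each node just carries the text that would be spoken).

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    const char *text;          /* what a screen reader would speak for this node */
    struct Node *parent;
    struct Node *firstChild;
    struct Node *nextSibling;
} Node;

/* Each command returns the new cursor, or the old one if the move is impossible. */
Node *nav_enter(Node *cur) { return cur->firstChild ? cur->firstChild : cur; }
Node *nav_exit(Node *cur)  { return cur->parent ? cur->parent : cur; }
Node *nav_next(Node *cur)  { return cur->nextSibling ? cur->nextSibling : cur; }

/* Previous line in the same block; found by scanning from the first sibling. */
Node *nav_prev(Node *cur)
{
    assert(cur != NULL);
    if (cur->parent == NULL) return cur;
    Node *n = cur->parent->firstChild;
    if (n == cur) return cur;
    while (n != NULL && n->nextSibling != cur) n = n->nextSibling;
    return n ? n : cur;
}

/* "Where am I?": the opening statement of the enclosing block. */
const char *nav_context(const Node *cur)
{
    return cur->parent ? cur->parent->text : "(top level)";
}
```

Because a move that is impossible returns the same cursor, the editor can detect it and play a "boundary" sound instead of silently doing nothing.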

Re: Structure editors

Posted: Tue Aug 25, 2015 8:25 am
by AndrewAPrice
@onlyonemac - What sort of feedback would your ideal structure editor give you? Could you write out an example of what you'd like to hear as you're editing and navigating your code?

Re: Structure editors

Posted: Tue Aug 25, 2015 9:56 am
by iansjack
What about pictures? How do you convert them to audio so that a blind person can recognize them?