OSDev.org

Posted: **Fri Jun 13, 2014 3:29 am**

Hi,

SpyderTL wrote:I'm not writing an IDE. There are plenty out there that are very good at editing XML files. If I can use Visual Studio, or Eclipse, or XCode, or XmlPad, and they already have intellisense, autocomplete, tooltips and can transform XML files using XSLT, why not just use one of them?

Intellisense, autocomplete and tooltips are only minor conveniences; while hideously unusable syntax is an extreme inconvenience. The advantages don't justify the disadvantages in the slightest.

SpyderTL wrote:The whole XML idea is based on the idea that code is actually data, and XML happens to be really good at storing data.

Let's talk about plain text and efficiency.

For software to "understand" (decipher) plain text you need a scanner and a parser (and it should be no surprise that scanner and parser are the first 2 pieces of a typical compiler). This isn't just for compilers though - any decent IDE will also need a scanner and parser to do "fancy features" like simple syntax highlighting (and intellisense, autocomplete, tooltips, etc). By storing source code in a binary form (e.g. as "tokens") you don't need a scanner and can improve the efficiency of both IDE and compiler. In addition, (at least in my experience) this reduces source code file sizes to about 80% of the "plain text" equivalent.
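As a rough illustration of the tokens idea, here is a minimal Python sketch. The token vocabulary, the tag bytes and the function names are all invented for this example; it is not a description of any real source file format. It scans a line of text once, stores the tokens as bytes, and then reads them back by following the tags, with no scanner involved on the reading side.

```python
import re
import struct

# Hypothetical token vocabulary; a real format would cover the whole language.
FIXED = ["if", "(", ")", "{", "}", "==", "=", ";"]
IDENT, NUMBER = 0xF0, 0xF1  # assumed tag bytes for variable-length tokens

SCANNER = re.compile(r"\s*(==|[(){};=]|[A-Za-z_]\w*|\d+)")

def scan(text):
    """The scanner step: split plain text into token strings."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = SCANNER.match(text, pos)
        if not m:
            raise ValueError(f"bad character at offset {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out

def encode(tokens):
    """Store tokens as bytes: one byte for fixed tokens, tag + payload otherwise."""
    buf = bytearray()
    for tok in tokens:
        if tok in FIXED:
            buf.append(FIXED.index(tok))
        elif tok.isdigit():
            buf += struct.pack("<BI", NUMBER, int(tok))
        else:
            raw = tok.encode("utf-8")
            buf += struct.pack("<BB", IDENT, len(raw)) + raw
    return bytes(buf)

def decode(data):
    """Read tokens back without scanning: just follow the tags."""
    out, i = [], 0
    while i < len(data):
        tag = data[i]
        if tag == IDENT:
            n = data[i + 1]
            out.append(data[i + 2:i + 2 + n].decode("utf-8"))
            i += 2 + n
        elif tag == NUMBER:
            out.append(str(struct.unpack_from("<I", data, i + 1)[0]))
            i += 5
        else:
            out.append(FIXED[tag])
            i += 1
    return out

source = "if (foo == bar) { x = 42; }"
tokens = scan(source)
blob = encode(tokens)
assert decode(blob) == tokens
```

Note that a half-written line such as `if (foo == bar) {` scans and encodes just as well, because a token stream does not need to be a valid program.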

Note: You can go one step further and store source code as an abstract syntax tree (in binary form). This is even more efficient (less processing in IDE and compiler) and more flexible (as it's language syntax independent). However; it turns out that when programmers are writing source code there's a lot of temporary/intermediate states where the source code isn't valid (or representable in AST); and storing source code as AST ends up being inconvenient. For a simple example, a programmer might write "if (foo == bar) { " (without any '}') and then save the file and go to bed; and then add more code (including the missing '}') the next day.
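The "invalid intermediate state" problem can be seen with any parser. The sketch below uses Python's own `ast` module purely as a stand-in (the example in the note is C-like, so the Python snippets here are an assumed equivalent): a half-written `if` has no syntax tree at all, while the finished version does.

```python
import ast

def has_ast(source: str) -> bool:
    """Return True if the text can be represented as a syntax tree."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False

half_written = "if foo == bar:\n"          # saved before the body was typed
finished = "if foo == bar:\n    baz()\n"   # completed the next day

print(has_ast(half_written))  # False
print(has_ast(finished))      # True
```

A format that stores only the tree would have nothing to save for the first string, which is the inconvenience described above.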

Basically, plain text is bad. In 1960 someone was too lazy to define a proper source file format and provide a suitable IDE, so they made it work with generic text editors. This became "the way it's done"; and we've been cursed with poor tools and poor efficiency ever since due to a combination of cowardice and laziness. Ironically, tools programmers create for other people (e.g. word-processors, spreadsheets, image file editors, etc) all have suitable file formats and editors (and don't use "plain text") - it amazes me that programmers can't do the same for their own tools.

Now; the difference between plain text and XML is that XML is worse for everything. It's less human readable for people, and less efficient (more expensive for scanning and parsing, and larger file sizes) for computers. The only "benefit" that XML has is that it allows the developer to sacrifice quality to reduce development time. Basically, XML/XSLT are tools that allow lazy/incompetent people to create crap.

SpyderTL wrote:And XSLT just happens to be really good at "transforming" a single XML node into one or more different XML nodes, which just happens to be really useful when you want to "compile" a program into a bunch of byte codes.

XSLT might be good at transforming something neither humans nor computers want into something else that neither humans nor computers want. This is completely and utterly useless for when you want to "compile" a program into a bunch of byte codes (or any other case that involves transforming something humans want into something computers want).

Cheers,

Brendan

Posted: **Fri Jun 13, 2014 6:35 am**

Brendan wrote:For an accurate descriptive name it should be:
Code: Select all
<cpu:CopyByteFromAddressInDS:SIOrDS:ESIDependingOnAddressingModeToALWithoutEffectingFLAGSAndThen.../>

Brendan, you are really good at it!

But to the discussion:

Brendan wrote:Before a beginner can use an instruction they have to know it exists and what it does

Yes, of course. And as long as Intel's manual and most examples are written in plain old assembly, everyone will be confused by a new syntax.

But if, for example, SpyderTL teaches his children to write low-level code without Intel's manual, then his approach can help a bit in the beginning. Unfortunately, a bit later the children will want to google some new feature for their OS and will find only assembly examples.

Or, if a new processor appears that documents only its opcodes, then SpyderTL's approach could be used.

But we all know that these are not situations we expect to see anywhere on earth. So I recommend that SpyderTL think about the great legacy of documentation and examples in assembly, and not forget the great pain of anyone already acquainted with those docs and samples.

Posted: **Fri Jun 13, 2014 6:37 am**

SpyderTL wrote:Your XSLT would have to be awfully smart in order to pick the right one.

So, you trade quality for development speed. Do you want to produce a low-quality OS?

Posted: **Fri Jun 13, 2014 6:45 am**

SpyderTL wrote:
embryo wrote:...And it seems you have decided to split uniformity of names into something more close to the actual machine code.
That is precisely what I did.

But then you ask others to learn the processor's internal commands. It means you lower the abstraction level of assembly to the lowest possible one, that of machine code. But is your XML OS a high-level thing?

SpyderTL wrote:
embryo wrote:But why not to have plain text result in hexadecimal form and then to convert it into raw bytes using C# or anything else? Then you can have all build stages in XML, except the last and simplest - hex to byte translation.
That is precisely what I'm doing.

Here I was talking about minimizing C# usage. If C# works only as a hex-to-binary converter, then you can call your OS a pure XML OS (except for the last conversion).
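To show how small that last stage can be, here is a minimal sketch of a hex-to-byte translation step, written in Python rather than C#. The function name and the sample listing are assumptions for the example; the only x86 facts used are that `cli` encodes as `FA` and `hlt` as `F4`.

```python
def hex_text_to_bytes(hex_text: str) -> bytes:
    """Convert whitespace-separated hex digits into raw bytes."""
    digits = "".join(hex_text.split())  # drop spaces and newlines
    return bytes.fromhex(digits)        # raises ValueError on bad input

listing = "FA F4\n"  # assumed sample: the opcodes for cli and hlt
image = hex_text_to_bytes(listing)
print(image)  # b'\xfa\xf4'
```

Everything before this step could stay in plain text (or XML), and the result can be written out with an ordinary binary file write.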

Posted: **Fri Jun 13, 2014 6:49 am**

SpyderTL wrote:
embryo wrote:For a beginner without any assembly background it is somewhat viable. But for any more experienced programmer it is a pain.
I think the same could be said of any programming language.

No. There is no useful language without examples and documentation. It means you are denying a low-level developer working in XML all the existing documentation and examples (which are in assembly).

Posted: **Fri Jun 13, 2014 6:55 am**

SpyderTL wrote:I could not find any tools that would let me write ASM code and that provided any sort of modern development assistance (intellisense, context sensitive help, etc.) So I wrote my own.

It's a bit of advertising for me, but you can use the jEmbryoAssembler project to get all those "Autocomplete, Intellisense, Documentation, View Definition, Inline Functions...". It is in Java, but you can rewrite it in C#; it's really easy. And one more point - the names there are pure assembly, without any complaints about translation complexity.

Posted: **Fri Jun 13, 2014 7:00 am**

Brendan wrote:Let's talk about plain text and efficiency.

It seems you are talking about code visualization. But XML storage for code that can be visualized in any desirable manner is very close to your proposal. Of course, storage efficiency is out of scope here, but we have terabyte-sized disks for that.

Posted: **Fri Jun 13, 2014 8:41 am**

Hi,

embryo wrote:
Brendan wrote:Let's talk about plain text and efficiency.
It seems you are talking about code visualization. But XML storage for code that can be visualized in any desirable manner is very close to your proposal. Of course, storage efficiency is out of scope here, but we have terabyte-sized disks for that.

I wasn't really talking about visualisation. Let's look at all the use cases - for any kind of source file format:

A compiler has to read it and check it (e.g. first few steps of compiling). For this case XML is worse than plain text, and plain text is worse than other alternatives.
It has to be stored on disk, transferred between computers, etc. For this case XML is worse than plain text, and plain text is worse than other alternatives.
It has to be displayed to humans (who don't read "binary bytes", regardless of whether those binary bytes represent characters or not), and entered/edited by humans; typically in a kind of "feedback loop" (e.g. press some keys, see the changes, repeat until it looks like what you want). For this case XML is worse than plain text, and plain text is worse than other alternatives.
It needs to be read and understood by miscellaneous tools (e.g. source level debugger, possibly tools to extract documentation from the source, revision control systems, etc). For these cases XML is worse than plain text, and plain text is worse than other alternatives.

Please note that where I say "worse than" above, I mean less efficient and worse for the end user. I don't mean worse for the person developing the tools. If you are the person creating the tools and only care about how long it takes (and hate the potential users of your tools so much that you don't care how much they suffer), then XML isn't "worse" at all. However, this is not how the design of tools (or the design of anything, ever) should be done.

Cheers,

Brendan

Posted: **Fri Jun 13, 2014 9:14 am**

I work with big data and with people who have never used XML before. In my experience XML a) is very hard to get your head around if you've never encountered it before, and b) takes up a lot of storage (and compression is often out of the question because throughput performance is a requirement).

I've often found myself in situations having to walk people through the XML syntax and layout (namespaces, tags, attributes, children) and explain how XSL and XSD files work. It's less intuitive for first timers than JSON, S-expressions, or YAML for representing tree data. Thanks to the explosion of interest in AJAX in the last 6+ years, there are plenty of great JSON tools out there too.

Also, I haven't found a good XML IDE yet. Even the ones with autocomplete, intellisense, and such still show the whole XML file (closing tags and all), so it's still very verbose. And the ones that try to help by automatically writing the closing tags irritate me: I end up writing the closing tags twice, they try to handle "TAB" (I'm either trying to indent my code or jump out of the existing tab, and the IDE usually does the opposite or autocompletes whatever my cursor was on), and it takes a lot of cursor movement to move around the code. Generally, all of these problems come from the fact that XML makes you close your tags, which means retyping (or jumping over) the entire tag name (a problem that gets worse if you use tags like PackAndLoad32BytesFromEAXIntoStack), compared to simply typing ), }, or ].
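A quick way to see the closing-tag cost is to serialize the same name/value pair both ways. This is only a sketch using Python's standard `xml.etree.ElementTree` and `json` modules; the payload string `"data"` is made up for the example.

```python
import json
import xml.etree.ElementTree as ET

LONG_TAG = "PackAndLoad32BytesFromEAXIntoStack"

def as_xml(tag, text):
    """Serialize one element; the tag name appears in both open and close tags."""
    node = ET.Element(tag)
    node.text = text
    return ET.tostring(node, encoding="unicode")

def as_json(tag, text):
    """Serialize the same pair as a JSON object; the name appears once."""
    return json.dumps({tag: text})

xml_form = as_xml(LONG_TAG, "data")
json_form = as_json(LONG_TAG, "data")

print(xml_form)
print(json_form)
print(xml_form.count(LONG_TAG), json_form.count(LONG_TAG))  # 2 1
```

The XML form carries the long name twice, once to open and once to close, while the JSON form closes with a single `}`.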

About the best feature is being able to expand/collapse XML tags, although it can get annoying when everything is either expanded or collapsed by default and there's a whole bunch of clicking to find something. Most IDEs support this anyway - for example Visual Studio allows you to collapse C# functions and preprocessor regions.

Brendan wrote:I don't mean worse for the person developing the tools.

Just as easy for someone using: YAML, JSON (halfway down), Xupl.

Posted: **Fri Jun 13, 2014 9:30 am**

Wow...

Then how do you compile it? The code

Code: Select all

<program>
    <con:WriteCharacters>abc</con:WriteCharacters>
    <cpu:cli />
    <cpu:hlt />
</program>

must be compiled into something like this, right?

Code: Select all

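[bits 16]
[org 0x7c00]
mov ax, 0xb800              ; segment of colour text-mode video memory
mov es, ax
xor di, di                  ; ES:DI -> first character cell
cld                         ; make stosw advance DI
mov ax, 'a' | (0x07 << 8)   ; character in low byte, attribute 0x07 in high byte
stosw
mov ax, 'b' | (0x07 << 8)
stosw
mov ax, 'c' | (0x07 << 8)
stosw
cli                         ; disable interrupts
hlt                         ; halt the CPU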

Do you use your own "compiler"? Or is there any easy and simple way to do this?
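For what it's worth, a translation like the one guessed at above can be sketched in a few lines. The Python below is not SpyderTL's toolchain (that is described in this thread as XSLT plus a small C# step); it is only an assumed, minimal mapping from the XML elements to the NASM lines shown. The namespace URIs are hypothetical, since the snippet does not declare any, and the rule "a cpu: element is named after its mnemonic" is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the snippet in the post does not declare any.
CPU = "urn:example:cpu"
CON = "urn:example:con"

SOURCE = f"""<program xmlns:cpu="{CPU}" xmlns:con="{CON}">
    <con:WriteCharacters>abc</con:WriteCharacters>
    <cpu:cli />
    <cpu:hlt />
</program>"""

def translate(xml_text):
    """Turn the XML program into a list of NASM source lines."""
    lines = ["[bits 16]", "[org 0x7c00]"]
    video_ready = False
    for node in ET.fromstring(xml_text):
        # ElementTree tags look like "{namespace-uri}localname".
        namespace, _, name = node.tag[1:].partition("}")
        if namespace == CPU:
            # Assumed rule: a cpu: element is named after its mnemonic.
            lines.append(name)
        elif namespace == CON and name == "WriteCharacters":
            if not video_ready:
                # Point ES:DI at text-mode video memory once.
                lines += ["mov ax, 0xb800", "mov es, ax", "xor di, di", "cld"]
                video_ready = True
            for ch in node.text or "":
                lines += [f"mov ax, '{ch}' | (0x07 << 8)", "stosw"]
        else:
            raise ValueError(f"no translation for {node.tag}")
    return lines

print("\n".join(translate(SOURCE)))
```

The printed listing can then be fed to NASM as usual; going straight to bytes instead would need an opcode table in place of the mnemonic strings.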

Posted: **Fri Jun 13, 2014 9:56 am**

Brendan wrote:Basically, plain text is bad. In 1960 someone was too lazy to define a proper source file format and provide a suitable IDE, so they made it work with generic text editors. This became "the way it's done"; and we've been cursed with poor tools and poor efficiency ever since due to a combination of cowardice and laziness. Ironically, tools programmers create for other people (e.g. word-processors, spreadsheets, image file editors, etc) all have suitable file formats and editors (and don't use "plain text") - it amazes me that programmers can't do the same for their own tools.

I kind of agree with you. I once had a really big interest in graphical programming languages.

In fact, graphical programming is useful for 'flow based programming' where you have modules with inputs and outputs, and you're able to link them together by drawing lines, even though in these systems the modules themselves are actually textual scripting languages or programs.

However, the problem I've noticed with graphical languages is that, except in domain-specific contexts (like flow based programming), they don't tend to make things simpler. In a general purpose language, you need to deal with type systems, defining objects, and control flow, and then sometimes you run out of room (literally, spatial room) to fit something in, so you spend 10 minutes dragging around your other 'control blocks' to make it fit. By that time, the whole thing is messy. But at least it makes compiler development easier because the editor can pass the preloaded data structure to the compiler and there's no parsing or lexing.

Graphical languages can make it impossible to use invalid syntax (they'll only let you draw and place what makes a valid program, compared to free-form text), but once I've learned a programming language, I'm thinking more about the design of the software I'm writing and its algorithms than about figuring out the syntax. If you're spending most of your development time figuring out syntax, then I'd blame the language.

General purpose graphical programming languages are a fun novelty for simple things, but imagine trying to write a serious algorithm in it.

I think there is a lot of interesting research in IDEs going on.

One example is probabilistic parsers for dynamic languages. In a static language like C++ you can parse the file and, without running it, you pretty much know every identifier, class type, where it's defined, etc. In dynamic languages, you usually can't be guaranteed to know the datatype until runtime (and even then it can change) and properties are dynamically assigned to objects.

However, probabilistic parsing depends on detecting 'patterns', such as:

Code: Select all

a = []; // it's probable that 'a' is an array
b = function(a, b) {
}; // it's probable that 'b' is a function that takes two parameters
c.someHandler = function() {
}; // it's probable that 'c' has a member called 'someHandler' that is a function that takes no parameters
d = b; // it's probable that 'd' is the same as 'b'

There are some cases where it fails (example: your object's property's name is dependent on some value generated by an algorithm that varies at runtime) but you can make a pretty good 'guess' that works for 95% of typical use cases just by detecting patterns in source code.
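A toy version of that kind of pattern detection might look like the sketch below. It is plain regular-expression matching over the four assignments from the snippet above, written in Python; the pattern set, the guess labels and the function names are all invented for illustration, and a real IDE would do far more than this.

```python
import re

# Assumed patterns, one per kind of guess made in the snippet.
ARRAY = re.compile(r"^(\w+)\s*=\s*\[")
FUNC = re.compile(r"^(\w+)\s*=\s*function\s*\(([^)]*)\)")
MEMBER = re.compile(r"^(\w+)\.(\w+)\s*=\s*function\s*\(([^)]*)\)")
ALIAS = re.compile(r"^(\w+)\s*=\s*(\w+)\s*;")

def count_params(text):
    """Count comma-separated parameter names, ignoring empty lists."""
    return len([p for p in text.split(",") if p.strip()])

def guess_types(source):
    """Return a dict mapping each name to a probable description."""
    guesses = {}
    for line in source.splitlines():
        line = line.strip()
        if m := ARRAY.match(line):
            guesses[m.group(1)] = "array"
        elif m := FUNC.match(line):
            guesses[m.group(1)] = f"function/{count_params(m.group(2))}"
        elif m := MEMBER.match(line):
            params = count_params(m.group(3))
            guesses[m.group(1)] = f"object with {m.group(2)}: function/{params}"
        elif m := ALIAS.match(line):
            # Probably the same kind of thing as the right-hand side.
            guesses[m.group(1)] = guesses.get(m.group(2), "unknown")
    return guesses

SAMPLE = """
a = [];
b = function(a, b) {
};
c.someHandler = function() {
};
d = b;
"""

for name, guess in guess_types(SAMPLE).items():
    print(name, "->", guess)
```

It fails in exactly the cases mentioned above: anything whose name or shape is only decided at runtime never matches a pattern.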

There are other useful things IDEs can do to complement textual code: offer features like self-documentation, refactoring, finding all references, jumping to declarations, fixing indentation, collapsing code blocks, resolving dependencies, GUI designers and class designers that generate code, debugging, edit-and-continue, and database browsing.

Posted: **Fri Jun 13, 2014 1:12 pm**

Brendan wrote:Intellisense, autocomplete and tooltips are only minor conveniences;

Whoa! That's quite a statement.

Brendan wrote:while hideously unusable syntax is an extreme inconvenience. The advantages don't justify the disadvantages in the slightest.

I'm, obviously, going to go ahead and, sort of disagree with you there...

Brendan wrote:By storing source code in a binary form (e.g. as "tokens") you don't need a scanner and can improve the efficiency of both IDE and compiler. In addition, (at least in my experience) this reduces source code file sizes to about 80% of the "plain text" equivalent.

Yes, byte code is physically smaller than ASM text files. And .NET IL is physically smaller than C# text files. Binary files are smaller than text files. But no one uses a keyboard to type in binary files. You need a significant user interface in order to "translate" the binary data into something that humans can easily manipulate. Significant user interfaces were pretty difficult to come by in the 50's.

Brendan wrote:Basically, plain text is bad. In 1960 someone was too lazy to define a proper source file format and provide a suitable IDE

That may have been due to the fact that you were limited to about 2-4KB of memory...

Brendan wrote:, so they made it work with generic text editors. This became "the way it's done"; and we've been cursed with poor tools and poor efficiency ever since due to a combination of cowardice and laziness. Ironically, tools programmers create for other people (e.g. word-processors, spreadsheets, image file editors, etc) all have suitable file formats and editors (and don't use "plain text")

That's probably because that code was written in 2014, not 1965.

Brendan wrote: - it amazes me that programmers can't do the same for their own tools.

Same here. That's kind of why I decided to go back and revisit all of those design decisions. But I'll admit that I've tried to come up with a non-text development strategy, and it turns out that it's not quite as easy as you would think. Did you have something specific in mind as a user interface?

Brendan wrote:Now; the difference between plain text and XML is that XML is worse for everything.

Oh boy.

Brendan wrote:It's less human readable for people

... than what? 3 character instruction mnemonics?

Brendan wrote:, and less efficient (more expensive for scanning and parsing, and larger file sizes) for computers.

Yeah, at build time. Who cares if builds take 500 ms longer?

Brendan wrote:The only "benefit" that XML has is that it allows the developer to sacrifice quality to reduce development time. Basically, XML/XSLT are tools that allow lazy/incompetent people to create crap.

Developers are perfectly capable of writing crap code in any language that I'm aware of. Preventing that at the language level would be pretty impressive.

I've already listed numerous "benefits" to using XML in this thread. You'll have to go back and find them... But suffice to say that your statement that XML only has one benefit (reduce development time) has not convinced me.

Brendan wrote:XSLT might be good at transforming something neither humans nor computers want into something else that neither humans nor computers want. This is completely and utterly useless for when you want to "compile" a program into a bunch of byte codes (or any other case that involves transforming something humans want into something computers want).

Well, I've been using it for about 5 years. And I'll admit that I'm a "lazy developer". (Which is actually a good trait when it comes to business development, IMO. Lazy developers aren't going to waste time writing code that isn't strictly necessary...)

As a lazy developer, writing Assembly in Notepad does not appeal to me. If I could use C# and Visual Studio to write an operating system, I would. I agree that text files aren't the way to go and that, assuming there was another feasible solution 50 years ago, they were a poor design choice. That's why I have been trying to find an alternative. XML is the best I've found, so far. But if you have a better idea (than text files), I'd love to see it.

Thanks.

Posted: **Fri Jun 13, 2014 1:15 pm**

embryo wrote:So I recommend that SpyderTL think about the great legacy of documentation and examples in assembly, and not forget the great pain of anyone already acquainted with those docs and samples.

So, don't create a new programming language, because Assembly is so well documented?

Maybe "programming language" is a misnomer, at the level we are talking about. Maybe "XML-based Processor Instruction Enumeration and Sequence Definition" makes more sense. XPIES?

Posted: **Fri Jun 13, 2014 1:22 pm**

embryo wrote:
SpyderTL wrote:Your XSLT would have to be awfully smart in order to pick the right one.
So, you trade quality for development speed. Do you want to produce a low-quality OS?

Well, no. But how many "high quality" pure Assembly Operating Systems can you name off of the top of your head?

Quality has nothing to do with the tools you use. Or, more specifically, good tools don't necessarily produce good products.

Posted: **Fri Jun 13, 2014 1:33 pm**

embryo wrote:
SpyderTL wrote:
embryo wrote:...And it seems you have decided to split uniformity of names into something more close to the actual machine code.
That is precisely what I did.
But then you ask others to learn the processor's internal commands. It means you lower the abstraction level of assembly to the lowest possible one, that of machine code. But is your XML OS a high-level thing?

The OS isn't an XML OS. It's an Object Oriented OS, just like yours. (Kind of like yours...) But the code that I wrote to "create" the OS is (mostly) XML. That's why I said that this thread should have been split up into two separate threads: one about the low level XML "language", and one about the high-level Object Oriented OS.

embryo wrote:But why not to have plain text result in hexadecimal form and then to convert it into raw bytes using C# or anything else? Then you can have all build stages in XML, except the last and simplest - hex to byte translation.
SpyderTL wrote:That is precisely what I'm doing.
Here I was talking about minimizing C# usage. If C# works only as a hex-to-binary converter, then you can call your OS a pure XML OS (except for the last conversion).

I assure you, if I could figure out a way to get rid of C# altogether, I would. But I don't know of any other way to write out bytes from XML. Also, I don't feel like running all of the XSLT's by hand, every time I make a change to a single file.
Plus, what you are suggesting is sort of like saying "Why are you using NASM to compile your OS? Why not just use Assembly?"

My XSLT's can't run themselves.

Object Oriented OS
