Porting to an architecture with no MMU (iPod/GBA ARM)

omin0us · Post by **omin0us** » Fri Nov 17, 2006 4:24 pm

So, I've been working on my OS for a while, and I just bought a new iPod, so I thought it might be fun to try porting my OS to it, or possibly to the GBA I have lying around. I am aware that there is no MMU on these two ARM devices. So my question is: how does one port an OS that depends on paging/virtual memory to run on a device that has no such support?

Brynet-Inc · Post by **Brynet-Inc** » Fri Nov 17, 2006 5:29 pm

Here is the FAQ for iPod Linux; they ported uClinux - http://ipodlinux.sourceforge.net/faq.shtml

Looking through that source might help...

omin0us · Post by **omin0us** » Fri Nov 17, 2006 5:31 pm

Brynet-Inc wrote:Here is the FAQ for iPod Linux; they ported uClinux - http://ipodlinux.sourceforge.net/faq.shtml

Looking through that source might help, The GBA seems like a less likely possible target.

Why do you say a less likely target? The two have nearly the same CPU, and the GBA is very well documented for programming. I just wanted ideas on how one would make up for having no MMU in a kernel that has so far been dependent on one. :]
But I will check out that link. Thanks.

Brynet-Inc · Post by **Brynet-Inc** » Fri Nov 17, 2006 5:39 pm

Alright, I retract my statement.

http://wwwhsse.fh-hagenberg.at/Studiere ... 06/uclgba/
http://wwwhsse.fh-hagenberg.at/Studiere ... index.html

omin0us · Post by **omin0us** » Fri Nov 17, 2006 6:07 pm

Brynet-Inc wrote:Alright, I retract my statement.

http://wwwhsse.fh-hagenberg.at/Studiere ... 06/uclgba/
http://wwwhsse.fh-hagenberg.at/Studiere ... index.html

Haha, thanks for the links. I was aware of uClinux, but knowing the beast of a source tree that Linux is today, I was hoping for a simpler explanation of how I might move from MMU to no MMU without too many problems.

Dex · Post by **Dex** » Fri Nov 17, 2006 6:56 pm

I am porting my OS to ARM, including the GBA, but as my OS does not use an MMU, it's not a problem for me.

Colonel Kernel · Post by **Colonel Kernel** » Fri Nov 17, 2006 8:38 pm

omin0us wrote:i was hoping for a more simple explanation of how i might move from MMU to no MMU without too many problems.

What sort of problems do you anticipate?

AFAIK, there are two schools of thought on running without an MMU. One is practical and in use today, while the other is still mostly a research topic and is probably not something you want to take on if you're in any kind of a hurry.

The practical approach is to have two versions of your OS -- one that is used for application development and runs on hardware with an MMU, and the "production" version that runs on the MMU-less hardware. This is a somewhat common approach for embedded development. In terms of the OS itself, your MMU-less version can still have the concept of a "process" from the point of view of resource ownership (file handles, etc.), but everything runs in the same (physical) address space, so it is obviously less stable than the alternative.

The research approach is to implement your applications (and possibly the OS itself) in a type-safe language, as Singularity does. However, this sounds like it would be overkill for what you're doing.

Brendan · Post by **Brendan** » Fri Nov 17, 2006 9:12 pm

Hi,

There are some interesting notes in the uClinux FAQ:

Q. Does uClinux support multitasking? What limitations are imposed by not
having a MMU?

A. uClinux absolutely DOES support multi-tasking, although there are a
few things that you must keep in mind when designing programs...

1. uClinux does not implement fork(); instead it implements vfork().
This does not mean no multitasking, it simply means that the
parent blocks until the child does exec() or exit(). You can still
get full multitasking.

2. uClinux does not have autogrow stack and no brk(). You need to use
mmap() to allocate memory (which most modern code already does).
There is a compile time option to set the stack size of a program.

3. There is no memory protection. Any program can crash another
program or the kernel. This is not a problem as long as you are
aware of it, and design your code carefully.

4. Some architectures have various code size models, depending on how
position independence is achieved.
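Point 1 of the FAQ can be sketched in C on a POSIX-like host. This is only an illustration of the vfork()-then-exec() pattern, not code from uClinux; the helper name `run_command` and the use of `/bin/sh` are assumptions made for the example.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare vfork() in strict ISO C modes */
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `/bin/sh -c cmd` and returns its exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_command(const char *cmd)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it borrows the parent's memory, so it should do nothing
         * except call exec() or _exit(). */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: suspended until the child has called exec() or _exit(),
     * then it continues here and can wait for the child as usual. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_command("exit 7")` should return 7 on a system where `/bin/sh` exists.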

The most interesting part is part 1, where it says there is multi-tasking while describing a system that I'd call single-tasking (i.e. only the most recently created task that hasn't terminated will be able to use the CPU).

Cheers,

Brendan

omin0us · Post by **omin0us** » Fri Nov 17, 2006 9:29 pm

Hmmm, that does make sense, I suppose, because on something like an iPod or GBA you don't really need to be running more than one task at a time.

Colonel Kernel · Post by **Colonel Kernel** » Fri Nov 17, 2006 10:03 pm

omin0us wrote:Hmmm, that does make sense, I suppose, because on something like an iPod or GBA you don't really need to be running more than one task at a time.

Why not? I don't see the point in having such an arbitrary limitation. It is quite possible to have full support for multi-threading even in the absence of memory protection. It's not clear to me why uClinux has this limitation.

I forgot to mention in my first post that you have to be a lot more deliberate about how you manage memory in the absence of an MMU. The quote posted by Brendan points out some of these issues -- position independent code is more important, you generally have to limit the amount of memory that can be used by each thread's stack, etc. Just some things to keep in mind.
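As a rough illustration of limiting each thread's stack, here is a minimal sketch of a fixed-size stack pool in C. The names (`stack_alloc`, `stack_free`) and the sizes are made up for the example; a real kernel would pick them to suit its target.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SIZE  1024u /* bytes per thread, fixed at build time (assumed) */
#define MAX_THREADS 4u    /* assumed upper bound on threads */

static _Alignas(8) uint8_t stack_pool[MAX_THREADS][STACK_SIZE];
static uint8_t stack_in_use[MAX_THREADS];

/* Returns the initial stack pointer for a new thread (one past the highest
 * byte of its slot, since ARM stacks normally grow downwards), or NULL if
 * every slot is taken. */
void *stack_alloc(void)
{
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (!stack_in_use[i]) {
            stack_in_use[i] = 1;
            return stack_pool[i] + STACK_SIZE;
        }
    }
    return NULL;
}

/* Releases a slot given the pointer previously returned by stack_alloc(). */
void stack_free(void *top)
{
    size_t offset = (size_t)((uint8_t *)top - &stack_pool[0][0]);
    size_t i = offset / STACK_SIZE - 1; /* wraps to a huge value if offset is 0 */
    if (i < MAX_THREADS)
        stack_in_use[i] = 0;
}
```

Without an MMU there is no guard page, so nothing in this sketch stops a thread from running off the bottom of its slot into its neighbour's; the fixed size is a budget, not a protection.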

omin0us · Post by **omin0us** » Fri Nov 17, 2006 10:17 pm

Well, one concern I had with multitasking in an environment such as that is that when compiling binaries (say ELF), at this point I only compile them statically to run from a certain address. I would have a lot more work getting them to run from whatever address they were loaded at. Although I'm thinking there is a flag to compile code as position-independent? Maybe I'm wrong.

Solar · Post by **Solar** » Tue Nov 21, 2006 7:08 am

Harr-Harr! It's back!

(Candy, take cover, you'll know this.)

It is perfectly possible to have a fully multitasking OS in the absence of an MMU, if your OS is designed to support it (and your hardware can fire a timer interrupt for the scheduler). It makes certain things trickier: you need support for PIC (position-independent code); a binary format that has proper relocation tables, and ideally allows splitting up a binary into smaller parts so you don't need a chunk of contiguous memory as big as your complete binary; a loader that does relocations while loading a binary; and offset tables for your library functions so they can still be shared.
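To make "a loader that does relocations while loading a binary" a bit more concrete, here is a minimal sketch in C. It assumes a made-up flat image format (`struct image`) that was linked to run at address 0 and carries a list of byte offsets of 32-bit words holding absolute addresses; it is not the layout of ELF, HUNK or any other real format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat image, linked as if it would be loaded at address 0. */
struct image {
    const uint8_t  *bytes;       /* code and data */
    size_t          size;        /* length of `bytes` */
    const uint32_t *relocs;      /* byte offsets of words to fix up */
    size_t          reloc_count;
};

/* Copies the image to `dest` and adds `load_addr` to every word listed in
 * the relocation table. Returns 0 on success, or -1 if a relocation points
 * outside the image. On the real target `load_addr` would simply be the
 * address of `dest`; it is passed separately so the sketch also runs on a
 * 64-bit host. */
int load_image(const struct image *img, uint8_t *dest, uint32_t load_addr)
{
    memcpy(dest, img->bytes, img->size);

    for (size_t i = 0; i < img->reloc_count; i++) {
        uint32_t off = img->relocs[i];
        if (off > img->size || img->size - off < sizeof(uint32_t))
            return -1;

        uint32_t word;
        memcpy(&word, dest + off, sizeof word); /* avoids unaligned access */
        word += load_addr;
        memcpy(dest + off, &word, sizeof word);
    }
    return 0;
}
```

A word that held 8 in the file (an address 8 bytes into the image) would hold `load_addr + 8` after loading, which is the whole point of the table.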

Case in point: AmigaOS (didn't use an MMU, multitasks like a charm).

JJeronimo · Post by **JJeronimo** » Tue Nov 21, 2006 9:03 am

Solar wrote: It is perfectly possible to have a fully multitasking OS in the absence of an MMU, if your OS is designed to support it (and your hardware can fire a timer interrupt for the scheduler).

I was thinking about the timer and its absence... perhaps even without an internal timer, provided that the architecture supports interrupts and you have an interrupt-driven serial port, you could wire up an external timer that sends a signal to the serial port to interrupt the processor...

Another solution is of course making the kernel interpret the programs (written in some special language or not) and implement all the necessary things in the interpreter... But it's like creating an emulator, cause your program wouldn't be really running in the native architecture!

And... An alternative solution that would solve many of these problems would be asking the processor to do single-step execution and emulating memory access... (another poor solution!)

It makes certain things trickier - you need support for PIC (position-independent code); a binary format that has proper relocation tables,

I think that's the classical way to load programs in archs without MMU...

If the architecture doesn't support executing PIC efficiently, then perhaps the best choice is relocation tables...

and ideally allows splitting up a binary into smaller parts so you don't need a chunk of contiguous memory as big as your complete binary;

I'm not quite sure there is an executable format designed to do this!

a loader that does relocations while loading a binary; and offset tables for your library functions so they can still be shared.

In my (purely theoretical) opinion, address space sharing makes it much simpler to implement shared libraries than address space separation does...
That's because if you have everything in the same address space then you only need to do load-time linking, because it isn't necessary to find free memory in the address space of each process that uses the library...

For example (and the sizes are examples too!), if your Linux app uses glibc (1 KB) and GTK+ (5 KB), the kernel will map and relocate one after the other (if the kernel maps shared libraries starting at 0x10000000, glibc would be from 0x10000000 to 0x10000400 and GTK+ from 0x10001000 (page aligned...) to 0x10002400)...
If you execute another program that needs the same libraries... the kernel can reuse the pages with the relocated libraries already loaded in them...
Now suppose that there is another program that doesn't need glibc, but instead uses another library that consumes 6 KB (a page and a half), and also GTK+... the kernel will need to either reload and relocate GTK+ one page forward from its previous location or relocate and map the 6 KB library at 0x10003000...
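The address arithmetic in this example can be checked with a few lines of C. This is only a sketch of page-aligned placement with 4 KB pages; `LIB_PAGE_SIZE`, `page_align_up` and `place_library` are names invented here, not anything a real kernel exposes.

```c
#include <stdint.h>

#define LIB_PAGE_SIZE 0x1000u /* 4 KB pages, as in the example */

/* Rounds `addr` up to the next multiple of the page size (a power of two). */
uint32_t page_align_up(uint32_t addr)
{
    return (addr + LIB_PAGE_SIZE - 1u) & ~(LIB_PAGE_SIZE - 1u);
}

/* Places a library of `size` bytes at the first page boundary at or after
 * *cursor, moves *cursor to the library's end, and returns its start. */
uint32_t place_library(uint32_t *cursor, uint32_t size)
{
    uint32_t start = page_align_up(*cursor);
    *cursor = start + size;
    return start;
}
```

Starting the cursor at 0x10000000 and placing 0x400 bytes and then 0x1400 bytes gives starts of 0x10000000 and 0x10001000, with the second library ending at 0x10002400. Placing 0x1800 bytes first instead pushes GTK+ to 0x10002000, one page further on, and `page_align_up(0x10002400)` gives 0x10003000 for the alternative placement.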

Though, I think it's not a very common problem...

JJ

PS: I'm too lazy right now to review my post, so don't be surprised if you find a lot of errors!

Solar · Post by **Solar** » Tue Nov 21, 2006 9:38 am

João Jerónimo wrote:
Solar wrote:It makes certain things trickier - you need support for PIC (position-independent code); a binary format that has proper relocation tables,
I think that's the classical way to load programs in archs without MMU...

If the architecture doesn't support executing PIC efficiently, then perhaps the best choice is relocation tables...

I would say you need both. PIC allows you to access data relative to the instruction pointer, which is fine for your binary image, but you need relocation tables once you want to access e.g. libraries - which will be at a different offset from your binary at each startup.

No deep thinking there, just a gut opinion.

and ideally allows splitting up a binary into smaller parts so you don't need a chunk of contiguous memory as big as your complete binary;
I'm not quite sure there is an executable format designed to do this!

Amiga HUNK format, for one. (Which made using GCC on AmigaOS so hard, since GCC doesn't support that format...) Dunno about others.

In my (purely theoretical) opinion, address space sharing makes it much simpler to implement shared libraries than address space separation does...

I wouldn't move a library around once it has been loaded. You will always have some programmers caching a library's address locally, either due to laziness or claims of efficiency. When the library moves, and programs access the old location because they don't go through the proper system services to get the current library address... booom.

Why did I say you need offset tables? Easy: library versions. If your apps were jumping to library functions directly, you would have to relink all applications using a certain library once you update that library. Functions change length, new functions get inserted between the old ones...

AmigaOS solved it quite smartly: When you request a library from the system via OpenLibrary(), the system would give you the "base pointer" to that library - either the position where it was already loaded previously, or the address it was just loaded to due to your request.

At a negative offset from that base pointer was the "jump table". The start address of functionX() was found at -4, start of functionY() was at -8, start of functionZ() at -12 and so on. Those negative offsets never changed; when new functions were added, the jump table was extended "downwards".

And when a new version of that library was compiled, with the start addresses of functions changing all over the place, you wrote the new start addresses into that jump table.

Applications using that library would translate a call to functionX() to "a call to the function whose address could be found at offset -4 from the library base".
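A simplified model of that scheme in C might look like the following. It follows the description above (addresses stored at fixed negative slots below the base pointer); slots -1 and -2 are counted in pointer-sized units, which corresponds to byte offsets -4 and -8 on a 32-bit machine. All names (`open_library`, `update_library`, `call_slot`) are invented for the sketch. Real AmigaOS vectors differ in detail (as far as I recall each entry was a short jump instruction rather than a bare address), so treat this purely as a model of the idea.

```c
#include <stddef.h>

typedef int (*lib_fn)(int);

/* Slot numbers are the stable part of the interface: they never change. */
enum { SLOT_FUNCTION_X = 1, SLOT_FUNCTION_Y = 2 };

#define MAX_SLOTS 8

/* The jump table sits directly below the library base pointer. */
static lib_fn jump_table[MAX_SLOTS];

static int function_x_v1(int a) { return a + 1; }
static int function_y_v1(int a) { return a * 2; }
static int function_x_v2(int a) { return a + 100; } /* rebuilt version */

/* Hypothetical stand-in for a system call that loads a library: fills the
 * table and returns the base pointer (one past the end of the table). */
lib_fn *open_library(void)
{
    lib_fn *base = jump_table + MAX_SLOTS;
    base[-SLOT_FUNCTION_X] = function_x_v1;
    base[-SLOT_FUNCTION_Y] = function_y_v1;
    return base;
}

/* A new library build only rewrites table entries; callers are untouched. */
void update_library(lib_fn *base)
{
    base[-SLOT_FUNCTION_X] = function_x_v2;
}

/* What an application does: call whatever address is stored in the slot. */
int call_slot(lib_fn *base, int slot, int arg)
{
    return base[-slot](arg);
}
```

After `update_library()`, the same `call_slot(base, SLOT_FUNCTION_X, 1)` call reaches the new function without the caller having been relinked.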

Note how this mechanism allows for old (compatible) and new (improved) versions of a functionX() co-existing in the same library binary - you pass the library version you'd like to OpenLibrary() (or "0" for "latest"), and get a different library base pointer if you are "v2 aware"...

Candy · Post by **Candy** » Tue Nov 21, 2006 10:42 am

Solar wrote:
and ideally allows splitting up a binary into smaller parts so you don't need a chunk of contiguous memory as big as your complete binary;
I'm not quite sure there is an executable format designed to do this!
Amiga HUNK format, for one. (Which made using GCC on AmigaOS so hard, since GCC doesn't support that format...) Dunno about others.

You can do that in ELF, but it isn't designed for it and doesn't have special overlay notation or anything like it. Examples of it abound, though.

At a negative offset from that base pointer was the "jump table". The start address of functionX() was found at -4, start of functionY() was at -8, start of functionZ() at -12 and so on. Those negative offsets never changed; when new functions were added, the jump table was extended "downwards".

What's the significance of the negative base pointer?