Single address space in Long mode
Posted: Wed Jan 06, 2010 6:46 am
Since long mode requires paging to be enabled, I guess I will have to identity map the whole system memory. What do you think? Good or bad idea?
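For anyone wondering what identity mapping costs in practice, here is a minimal sketch, assuming 2 MiB pages and a 4-level long mode layout. The helper names (`identity_pde`, `fill_identity_pd`, `page_directories_needed`) are made up for illustration, and the code only builds the entries in an ordinary array; it does not load CR3 or touch real page tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_HUGE      (1ULL << 7)      /* PS bit: this PD entry maps a 2 MiB page */
#define HUGE_PAGE     (2ULL << 20)     /* 2 MiB */
#define PD_ENTRIES    512              /* one page directory covers 1 GiB */

/* Page-directory entry that maps the 2 MiB page at phys_addr onto itself. */
uint64_t identity_pde(uint64_t phys_addr)
{
    assert((phys_addr & (HUGE_PAGE - 1)) == 0);   /* must be 2 MiB aligned */
    return phys_addr | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE;
}

/* Fill one page directory so that virtual == physical for the first
 * `bytes` bytes of the 1 GiB region starting at phys_base. */
void fill_identity_pd(uint64_t pd[PD_ENTRIES], uint64_t phys_base, uint64_t bytes)
{
    for (size_t i = 0; i < PD_ENTRIES; i++) {
        uint64_t offset = (uint64_t)i * HUGE_PAGE;
        pd[i] = (offset < bytes) ? identity_pde(phys_base + offset) : 0;
    }
}

/* How many page directories (4 KiB each) are needed to cover ram_bytes. */
uint64_t page_directories_needed(uint64_t ram_bytes)
{
    const uint64_t gib = 1ULL << 30;
    return (ram_bytes + gib - 1) / gib;
}
```

With 2 MiB pages, mapping 4 GiB of RAM this way takes four page directories plus one PDPT and one PML4, so the table overhead of mapping only the installed memory is small.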
MessiahAndrw wrote: All you have to do is identity map the amount of memory you have in the system (to avoid memory mapping the entire range and wasting space with unnecessary entries).
That's what I mean! LOL. In the first post, I said "the whole *system* memory", not the whole range of virtual memory.
I recently found an interesting point in AMD64 Manual - Vol.2:
As a result, SYSCALL and SYSRET can take fewer than
one-fourth the number of internal clock cycles to complete than the legacy CALL and RET
instructions. SYSCALL and SYSRET are particularly well-suited for use in 64-bit mode, which
requires implementation of a paged, flat-memory model.
MessiahAndrw wrote: All you have to do is identity map the amount of memory you have in the system (to avoid memory mapping the entire range and wasting space with unnecessary entries).
quanganht wrote: That's what I mean! LOL. In the first post, I said "the whole *system* memory", not the whole range of virtual memory.
I'd assume MessiahAndrw meant "contiguously mapped" (e.g. RAM from 0x000000 to EBDA mapped to virtual addresses 0x000000 to "x", RAM from 0x00100000 to the first hole mapped to virtual addresses "x" to "x + y", etc).
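To make the "contiguously mapped" idea concrete, here is a rough sketch: each usable RAM region from the firmware memory map is placed directly after the previous one in virtual space, so the holes disappear. The region list `example_map` and both function names are hypothetical; a real kernel would take the regions from the BIOS/UEFI memory map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ram_region {
    uint64_t phys_base;   /* first physical address of usable RAM */
    uint64_t length;      /* size of the region in bytes */
};

/* Hypothetical memory map: low RAM up to the EBDA, RAM above 1 MiB up
 * to the first hole, and RAM above 4 GiB. */
static const struct ram_region example_map[] = {
    { 0x000000000ULL, 0x0009F000ULL },
    { 0x000100000ULL, 0x7EF00000ULL },
    { 0x100000000ULL, 0x80000000ULL },
};

/* Virtual address at which region `index` starts when all regions are
 * packed back to back starting at virtual address 0. */
uint64_t contiguous_virt_base(const struct ram_region *map, size_t count, size_t index)
{
    assert(index < count);
    uint64_t virt = 0;
    for (size_t i = 0; i < index; i++)
        virt += map[i].length;
    return virt;
}

/* Translate a physical address to its contiguous virtual address, or
 * return UINT64_MAX if the address falls into a hole between regions. */
uint64_t phys_to_contiguous_virt(const struct ram_region *map, size_t count, uint64_t phys)
{
    uint64_t virt = 0;
    for (size_t i = 0; i < count; i++) {
        if (phys >= map[i].phys_base && phys - map[i].phys_base < map[i].length)
            return virt + (phys - map[i].phys_base);
        virt += map[i].length;
    }
    return UINT64_MAX;
}
```

Note that with this scheme virtual and physical addresses no longer match past the first hole, which is the difference from a true identity mapping.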
quanganht wrote: So, any pointer made by any process will be globally correct, right?
Yes. The problem is getting multiple processes to share an address space (without segmentation), which means using position independent code. If all RAM is contiguous in the virtual address space then there's also virtual address space fragmentation issues and problems implementing certain optimisations (swap space, memory mapped files, "copy on write", etc); and no easy way to implement protection/isolation (without something complex like software isolation; any process can trash any other process, and there's no way for a process to prevent other processes from accessing sensitive data like the user's passwords, etc).
quanganht wrote: I recently found an interesting point in AMD64 Manual - Vol.2:
As a result, SYSCALL and SYSRET can take fewer than one-fourth the number of internal clock cycles to complete than the legacy CALL and RET instructions. SYSCALL and SYSRET are particularly well-suited for use in 64-bit mode, which requires implementation of a paged, flat-memory model.
Owen wrote: I presume it means ones through call gates?
I'd assume it means call gates.
quanganht wrote: Are they really that fast? The manual also mentions a "paged, flat memory model" in 64-bit mode.
Yes, and no.
Brendan wrote: If all RAM is contiguous in the virtual address space then there's also virtual address space fragmentation issues and problems implementing certain optimisations (swap space, memory mapped files, "copy on write", etc); and no easy way to implement protection/isolation
Well, I agree about the position independent code thing, but for now it shouldn't be a big problem. Modern CPUs are said to support it and to run it just as fast as position dependent code.
quanganht wrote: any pointer made by any process will be globally correct
Similar to this http://forum.osdev.org/viewtopic.php?f= ... 93#p170267
Brendan wrote: and no easy way to implement protection/isolation
I doubt it. IMO we only have to watch out for the page tables, and mark pages with permissions corresponding to applications.
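For reference, a small sketch of the permission bits an x86-64 page-table entry actually offers. The helper names are made up for illustration; the bit positions are the architectural ones (NX only takes effect when EFER.NXE is set). Note that these bits are per page, not per application: every ring 3 process running on the same set of page tables sees the same User/Writable/NX settings.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_USER      (1ULL << 2)    /* 0 = supervisor only, 1 = ring 3 allowed */
#define PTE_NX        (1ULL << 63)   /* no-execute, needs EFER.NXE */
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* Build a 4 KiB page-table entry for the frame at phys_addr. */
uint64_t make_pte(uint64_t phys_addr, int writable, int user, int no_execute)
{
    assert((phys_addr & ~PTE_ADDR_MASK) == 0);   /* 4 KiB aligned, within 52 bits */
    uint64_t pte = phys_addr | PTE_PRESENT;
    if (writable)   pte |= PTE_WRITABLE;
    if (user)       pte |= PTE_USER;
    if (no_execute) pte |= PTE_NX;
    return pte;
}

/* True if ring 3 code may write to the page described by pte. */
int user_can_write(uint64_t pte)
{
    const uint64_t need = PTE_PRESENT | PTE_WRITABLE | PTE_USER;
    return (pte & need) == need;
}
```

There is no field in the entry that names an owning process, so "permissions corresponding to applications" would have to come from somewhere else, for example by switching page tables per process.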
quanganht wrote: Fragmentation can be solved by using some kind of block (slab?). So, upon an application's request, the system allocator will give it a chunk of memory, say 1 MB, or even 1 GB. Then the application is free to do anything with it.
That works perfectly fine for a normal OS, because a normal OS allocates space (and not RAM). For "contiguously mapped RAM" you can't give every process a large slab because you end up wasting heaps of RAM. For example, the computer I'm using now isn't doing too much, but there's about 50 processes running. If you give each process 1 GiB then you'll need to use significant amounts of swap space (but you can't implement swap space for "contiguously mapped RAM" either).
quanganht wrote: Another optimization that only SAS can have is shared memory. IPC, RPC, shared data is *very* efficient in SAS because data is placed in one place only, then the owner can pass its pointer to anyone
Boring old fashioned OSs (e.g. Unix clones) have been using shared memory (without SAS) for several decades.
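As a minimal illustration of shared memory between processes that each have their own address space, here is a sketch assuming a Linux/BSD-style system where `mmap` supports `MAP_SHARED | MAP_ANONYMOUS`. The function name `share_value_with_child` is made up for the example.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes `value` into a shared page; the parent reads it back.
 * Returns the value seen by the parent, or -1 on failure. */
int share_value_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {           /* child: separate address space, same physical page */
        *shared = value;
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0); /* parent: wait until the child has written */
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

The data still lives in one place only; the two processes simply have page-table entries pointing at the same physical page. In this sketch the mapping is inherited through `fork`, so the address happens to be the same in both processes; with independently created mappings the addresses may differ.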
quanganht wrote: any pointer made by any process will be globally correct
quanganht wrote: Similar to this http://forum.osdev.org/viewtopic.php?f= ... 93#p170267 Plus, SYSCALL/SYSRET requires a flat, paged memory model. That is double the speed of a call gate/SW interrupt. Well, 20 cycles times some million calls is something really different.
SYSCALL/SYSRET doesn't require a flat paged memory model - for a 32-bit OS it'll work without paging and you can still use a limited amount of segmentation (e.g. for data segments). Long mode requires a flat paged memory model (therefore a flat paged memory model is required for SYSCALL/SYSRET in long mode).
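Part of the reason SYSCALL/SYSRET assume flat segments is that the selectors are not loaded from descriptor tables at call time but derived from the STAR MSR. A small sketch of how that value is composed, assuming a hypothetical GDT layout (0x08 kernel code, 0x10 kernel data, 0x18 32-bit user code, 0x20 user data, 0x28 64-bit user code); the helper names are made up, and actually writing the MSR needs WRMSR in ring 0, which is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_STAR    0xC0000081u   /* selector bases for SYSCALL/SYSRET */
#define MSR_LSTAR   0xC0000082u   /* 64-bit SYSCALL entry point (RIP) */
#define MSR_SFMASK  0xC0000084u   /* RFLAGS bits cleared on SYSCALL */

/* STAR[47:32] is the kernel CS for SYSCALL (SS is that value + 8).
 * STAR[63:48] is the base for SYSRET (SS = base + 8, 64-bit CS = base + 16). */
uint64_t make_star(uint16_t kernel_cs, uint16_t sysret_base)
{
    return ((uint64_t)sysret_base << 48) | ((uint64_t)kernel_cs << 32);
}

/* Selectors the CPU derives from STAR; SYSRET forces RPL to 3. */
uint16_t syscall_kernel_ss(uint64_t star)
{
    return (uint16_t)(((star >> 32) & 0xFFFF) + 8);
}

uint16_t sysret_user_ss(uint64_t star)
{
    return (uint16_t)(((star >> 48) + 8) | 3);
}

uint16_t sysret_user_cs64(uint64_t star)
{
    return (uint16_t)(((star >> 48) + 16) | 3);
}
```

With the layout above, `make_star(0x08, 0x18)` gives kernel SS 0x10, user SS 0x23 and 64-bit user CS 0x2B, which is why the GDT entries have to be laid out in that fixed order.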
Brendan wrote: and no easy way to implement protection/isolation
quanganht wrote: And about this: I doubt it. IMO we only have to watch out for the page tables, and mark pages with permissions corresponding to applications.
"Mark pages with permissions corresponding to applications."???
Brendan wrote: For a specific example (and a big warning), as far as I can tell it's possible for user level code to set RSP to "kernel space" before doing SYSCALL and trick the kernel into trashing its own data. For example, try doing "mov rsp,0xFEDCBA9876543210" or "mov rsp,8" before SYSCALL...
Of course, no sane OS will put kernel data on the user stack, and a switch to the kernel stack will always be the first thing the SYSCALL handler does. SYSENTER has an explicit ESP stored in an MSR, but that means the scheduler must update the MSR when switching tasks.
jal wrote: but that means the scheduler must update the MSR when switching tasks.
Not necessarily. You can just have it contain a pointer to a valid temporary stack, then update ESP manually (instead of finding out the value for ESP, then bothering the slow WRMSR with it).