OSDev.org

Posted: **Fri May 05, 2017 11:58 pm**

I understand how paging works and what is used for, but I don't really understand what the GDT does. When does the GDT apply? Is it replaced by paging? Do I pick one or implement both?

*EDIT* To anyone from the future, Intel's IA-32 System Programming Guide answered all of my questions and more.

Posted: **Sat May 06, 2017 12:18 am**

Not much learning, huh? What did you read to understand the GDT/LDT and where did you fail to?

Posted: **Sat May 06, 2017 12:46 am**

The GDT describes memory segments. Segmentation is another way of managing memory.

Segmentation in real mode is a bit different from protected mode. In real mode, physical addresses are calculated as "Segment * 0x10 + Offset", and the segment registers (cs, ds, es, fs, gs, ss) hold the segment part. Each real-mode segment has a limit of 0xFFFF. For example, segment:offset 0xB800:0x0 resolves to physical address 0xB8000.
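To make the real-mode formula concrete, here is a minimal sketch in C. The function name `real_mode_physical` is made up for illustration, and it ignores A20 wrap-around behaviour.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 0x10 + offset.
 * The sum can exceed 20 bits (0xFFFF:0xFFFF gives 0x10FFEF),
 * so the result is returned in a 32-bit type. */
static uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With this, `real_mode_physical(0xB800, 0x0)` gives 0xB8000, matching the example above.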

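Segmentation in protected mode is handled through the GDT. The GDT is a table of segment descriptors, each holding a segment's base, limit, and access rights. In protected mode a segment register holds a selector: the upper 13 bits are an index into the GDT, and the low 3 bits hold the table indicator (GDT or LDT) and the requested privilege level. As each GDT entry is 8 bytes long, the byte offset of a descriptor in the GDT is "index * 8", which for a ring 0 GDT selector is the same number as the selector itself.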
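A small sketch of how a 16-bit selector breaks down may help here. The helper names are hypothetical; the bit layout (RPL in bits 0-1, table indicator in bit 2, index in bits 3-15) is the one documented for protected mode.

```c
#include <stdint.h>

/* Fields of a protected-mode segment selector:
 *   bits 0-1   RPL (requested privilege level)
 *   bit  2     TI  (0 = GDT, 1 = LDT)
 *   bits 3-15  index into the descriptor table */
static unsigned selector_rpl(uint16_t sel)   { return sel & 0x3; }
static unsigned selector_ti(uint16_t sel)    { return (sel >> 2) & 0x1; }
static unsigned selector_index(uint16_t sel) { return sel >> 3; }

/* Byte offset of the descriptor inside its table: index * 8. */
static unsigned selector_table_offset(uint16_t sel)
{
    return selector_index(sel) * 8;
}

/* Build a selector from its three fields. */
static uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((ti & 0x1) << 2) | (rpl & 0x3));
}
```

So selector 0x08 is index 1 in the GDT at RPL 0, and a ring 3 selector for index 3 would be 0x1B, which still points at table offset 0x18.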

A flat model GDT would look like this:

GDT offset 0x0: Base=0x0, Limit=0x0 (Null descriptor)
GDT offset 0x8: Base=0x0, Limit=0xFFFFFFFF, PL=0, EX=1, R=1 (Ring 0 code segment)
GDT offset 0x10: Base=0x0, Limit=0xFFFFFFFF, PL=0, EX=0, RW=1 (Ring 0 data segment)

EX is executable, PL is privilege level, R is readable, and RW is readable and writable. Code segments can never be written through; they can only be executed and, optionally, read. That's why I used R instead of RW. Code segments are executable, but data segments are not.

As both the code and data segments have base 0x0 and limit 0xFFFFFFFF (stored in the descriptor as a 20-bit limit of 0xFFFFF with 4 KiB granularity), memory is untranslated. That's why it is called the flat model.

With that GDT, a far jump to address 0x100000 uses the selector:offset pair 0x8:0x100000.

Using paging instead of segmentation is usually a better solution.

Posted: **Sat May 06, 2017 12:47 am**

I've read the OSDev wiki page, the wikipedia page, and this os dev book.

So if I understand correctly, the GDT splits the physical memory into segments which the processor then references through special registers (cs, ds, ss, ...). What I don't understand is how this works with paging. Is the GDT set up before paging is enabled or are they two separate approaches to memory segmentation? If they are used in unison, after paging is enabled aren't the addresses to the different segments incorrect?

Posted: **Sat May 06, 2017 1:58 am**

MuchLearning wrote:I've read the OSDev wiki page, the wikipedia page, and this os dev book.

So if I understand correctly, the GDT splits the physical memory into segments which the processor then references through special registers (cs, ds, ss, ...). What I don't understand is how this works with paging. Is the GDT set up before paging is enabled or are they two separate approaches to memory segmentation? If they are used in unison, after paging is enabled aren't the addresses to the different segments incorrect?

Segmentation is legacy cruft left in for backwards compatibility. Unfortunately you need to set up the GDT, but luckily you can just make all of your segments span the entire address space.

Posted: **Sat May 06, 2017 2:03 am**

MuchLearning wrote:I've read the OSDev wiki page, the wikipedia page, and this os dev book.

So if I understand correctly, the GDT splits the physical memory

Not physical, if page translation is enabled. And it doesn't really split anything.

MuchLearning wrote:into segments which the processor then references through special registers (cs, ds, ss, ...). What I don't understand is how this works with paging. Is the GDT set up before paging is enabled or are they two separate approaches to memory segmentation? If they are used in unison, after paging is enabled aren't the addresses to the different segments incorrect?

; // I'll be talking about the 80386 unless otherwise specified.

If page translation is disabled, a segment is basically a window through which you can access physical memory.
So, for example, the mov eax, [es:ebx] instruction will use the value of the es register (AKA segment selector) as an index into the GDT (or LDT) to specify one of the many windows that the GDT (or LDT) can describe (up to 8192 windows per table AFAIR). So it gets the start address of the window (AKA segment base address) and the window size from the table. It then counts ebx bytes (AKA segment offset) up from the start address and gets the physical address of where to read from. It also checks that the read at this address doesn't go outside of the window.
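As a toy model of that lookup (not how you would program real hardware), the base-plus-offset step and the limit check might look like this in C. `struct segment` and `segment_translate` are invented names, and the sketch only covers ordinary expand-up segments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of what the CPU caches from a descriptor. */
struct segment {
    uint32_t base;   /* start address of the window */
    uint32_t limit;  /* highest valid offset inside the window */
};

/* Translate segment:offset for an access of `size` bytes.
 * Returns false where real hardware would raise a fault
 * (#GP or #SS) because the access leaves the window. */
static bool segment_translate(const struct segment *seg, uint32_t offset,
                              uint32_t size, uint32_t *address)
{
    if (size == 0 || offset > seg->limit || size - 1 > seg->limit - offset)
        return false;
    *address = seg->base + offset;  /* wraps modulo 2^32, like the CPU */
    return true;
}
```

For a 4 KiB window at 0x10000, a 4-byte read at offset 0xFFC succeeds, while the same read at offset 0xFFD is rejected because its last byte falls outside the window.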

The windows (er, segments) are completely independent and can start anywhere and their size can be almost anything (see segment size granularity), from 1 byte to 4GB. They can overlap or coincide if desired.

Segmentation is always present. You can't turn it off, but you can disempower it by using segments of the largest size possible and starting at address 0, that is, viewing all of the memory through the windows.

If page translation is enabled, you have the same thing as above, but the address that you get by using the segment selector and segment offset (and going through the GDT/LDT) is not the final physical one. It's a virtual (AKA linear) one, which undergoes conversion into the physical one according to the page tables.

IOW, when page translation is enabled, segmentation provides windowed view of virtual rather than physical memory.
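The two stages can be sketched as a toy model in C. This assumes classic 32-bit, non-PAE paging with 4 KiB pages. All names are invented, the page directory holds C pointers instead of physical frame addresses, and the segment limit check and all permission bits are left out to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u

/* Toy structures: a real page directory entry holds the physical
 * address of a page table plus flags, not a C pointer. */
struct page_table     { uint32_t entry[1024]; };
struct page_directory { const struct page_table *table[1024]; };

/* Stage 2: linear -> physical through the page tables. */
static bool page_translate(const struct page_directory *pd,
                           uint32_t linear, uint32_t *physical)
{
    uint32_t dir_index   = linear >> 22;            /* bits 31-22 */
    uint32_t table_index = (linear >> 12) & 0x3FF;  /* bits 21-12 */
    uint32_t page_offset = linear & 0xFFF;          /* bits 11-0  */

    const struct page_table *pt = pd->table[dir_index];
    if (pt == NULL)
        return false;                               /* page fault */
    uint32_t pte = pt->entry[table_index];
    if (!(pte & PTE_PRESENT))
        return false;                               /* page fault */
    *physical = (pte & 0xFFFFF000u) | page_offset;
    return true;
}

/* Stage 1 then stage 2: segment base + offset gives the linear
 * address, which the page tables turn into a physical address. */
static bool logical_to_physical(uint32_t seg_base, uint32_t offset,
                                const struct page_directory *pd,
                                uint32_t *physical)
{
    uint32_t linear = seg_base + offset;
    return page_translate(pd, linear, physical);
}
```

With a flat segment (base 0) the offset is the linear address, which is why flat-model kernels can mostly forget about the first stage.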

Because of this virtual to physical conversion, at the points where you enable (or disable) page translation, the instruction that enables/disables it needs to have its virtual and physical addresses the same (AKA identity or 1:1 mapping). Otherwise the CPU will go fetching the following instructions from the wrong place. (It's likely a bit more nuanced, I haven't touched this bit for a long time)
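A minimal sketch of what such an identity mapping looks like, assuming 32-bit non-PAE paging with 4 KiB pages: one page table whose entry i points at physical frame i covers the first 4 MiB 1:1. The function and macro names are made up, and a real kernel would also have to install the table in a page directory and load CR3.

```c
#include <stdint.h>

#define PAGE_PRESENT  0x1u
#define PAGE_WRITABLE 0x2u

/* Fill one page table so that linear address X maps to physical
 * address X for the first 4 MiB (1024 pages of 4 KiB each). */
static void fill_identity_table(uint32_t table[1024])
{
    for (uint32_t i = 0; i < 1024; i++)
        table[i] = (i << 12) | PAGE_PRESENT | PAGE_WRITABLE;
}
```

If the code that flips the paging bit lives inside this range, the next instruction fetch lands on the same bytes before and after the switch.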

When page translation is enabled, segmentation is useless for most things, and therefore segments of 4GB size are in use.
However, there are a few common special uses of segmentation even when page translation is on.

One is thread-local storage (TLS).

Another is non-executable memory. You can size your code segment to cover exactly the code of your program, which prevents erroneous (or malicious) execution of data. However, there are ways around this, namely return-oriented programming (ROP).
Memory can be made non-executable through page tables on modern CPUs as well, but originally only segmentation provided a way for doing this.

Qs?

Posted: **Sat May 06, 2017 2:53 am**

alexfru wrote:Segmentation is always present. You can't turn it off, but you can disempower it by using segments of the largest size possible and starting at address 0, that is, viewing all of the memory through the windows.

So essentially when people suggest doing something like this from the GDT Tutorial

    create_descriptor(0, 0, 0);
    create_descriptor(0, 0x000FFFFF, (GDT_CODE_PL0));
    create_descriptor(0, 0x000FFFFF, (GDT_DATA_PL0));
    create_descriptor(0, 0x000FFFFF, (GDT_CODE_PL3));
    create_descriptor(0, 0x000FFFFF, (GDT_DATA_PL3));

The idea is to basically ignore the segmentation functionality provided by the GDT and instead use paging to implement memory protections?

I also found this website that helped me to understand how GDT and paging work together.

Posted: **Sat May 06, 2017 3:27 am**

MuchLearning wrote: So essentially when people suggest doing something like this from the GDT Tutorial
    create_descriptor(0, 0, 0);
    create_descriptor(0, 0x000FFFFF, (GDT_CODE_PL0));
    create_descriptor(0, 0x000FFFFF, (GDT_DATA_PL0));
    create_descriptor(0, 0x000FFFFF, (GDT_CODE_PL3));
    create_descriptor(0, 0x000FFFFF, (GDT_DATA_PL3));
The idea is to basically ignore the segmentation functionality provided by the GDT and instead use paging to implement memory protections?

Correct. By creating BASE 0, LIMIT 4GiB segments for ring 0 and ring 3 (code and data for each), you've effectively "disabled" segmentation, or rather made it something you can ignore. People often call this the "flat memory model".
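In case it's unclear how a limit of 0x000FFFFF ends up meaning 4GiB: the limit field is only 20 bits wide, and the granularity (G) bit in the descriptor decides whether it counts bytes or 4 KiB pages. Here is a small sketch, with the function name invented and assuming the tutorial's flag macros set the G bit (I can't confirm that from this thread alone).

```c
#include <stdint.h>

/* Highest valid offset described by a 20-bit limit field.
 * With G = 0 the limit is in bytes; with G = 1 it is in 4 KiB
 * units and the low 12 bits of the offset are always allowed. */
static uint32_t effective_limit(uint32_t limit20, int granularity)
{
    limit20 &= 0xFFFFF;
    return granularity ? (limit20 << 12) | 0xFFF : limit20;
}
```

So 0xFFFFF with G set covers offsets up to 0xFFFFFFFF, the whole 32-bit address space, while the same field with G clear would only cover 1 MiB.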

Posted: **Sat May 06, 2017 3:31 am**

From Intel's System Programming Guide

3.2.5 Paging and Segmentation
Paging can be used with any of the segmentation models described in Figures 3-2, 3-3, and 3-4. The processor’s paging mechanism divides the linear address space (into which segments are mapped) into pages (as shown in Figure 3-1). These linear-address-space pages are then mapped to pages in the physical address space. The paging mechanism offers several page-level protection facilities that can be used with or instead of the segment protection facilities. For example, it lets read-write protection be enforced on a page-by-page basis. The paging mechanism also provides two-level user-supervisor protection that can also be specified on a page-by-page basis

Figure 3-1 in that guide is particularly helpful for visualizing the relationship between segmentation and paging.
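As I understand the page-level protection described there, it boils down to a few bits per page table entry. Below is a simplified sketch of the check for 32-bit paging. The names are mine, it looks at a single entry only (real hardware combines the page directory and page table entries), and it ignores no-execute and other later extensions.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u  /* bit 0: page is mapped */
#define PTE_WRITABLE 0x2u  /* bit 1: R/W, writes allowed */
#define PTE_USER     0x4u  /* bit 2: U/S, user-mode access allowed */

/* Would this access be allowed by the page-level checks?
 * cr0_wp models CR0.WP: when set, supervisor code also
 * faults on writes to read-only pages. */
static bool page_access_allowed(uint32_t pte, bool user_mode,
                                bool is_write, bool cr0_wp)
{
    if (!(pte & PTE_PRESENT))
        return false;
    if (user_mode && !(pte & PTE_USER))
        return false;
    if (is_write && !(pte & PTE_WRITABLE) && (user_mode || cr0_wp))
        return false;
    return true;
}
```

This is the per-page read-write and user-supervisor protection the quoted section mentions, and it is what a flat-model kernel relies on instead of segment limits.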

Posted: **Sat May 06, 2017 4:28 am**

MuchLearning wrote:*EDIT* To anyone from the future, Intel's IA-32 System Programming Guide answered all of my questions and more.

If only you could convince people to actually bother to download and read it! And, perhaps, travel to the past and do it yourself before asking here.

Seriously, this stuff has been documented for 30 years now. And the 80386 manual had nice ASCII art diagrams in it. It's not like only PDF files can present stuff graphically.

Posted: **Sat May 06, 2017 5:34 am**

No joke, I'm pretty new and have been searching for good resources. I feel like I've hit a goldmine after discovering this documentation. I'm still here reading it almost 3 hours later.

Posted: **Sat May 06, 2017 6:28 am**

MuchLearning wrote:No joke, I'm pretty new and have been searching for good resources. I feel like I've hit a goldmine after discovering this documentation. I'm still here reading it almost 3 hours later.

Just so you know, AMD has one too, and it's worth reading as well if you intend to _fully_ support both vendors' CPUs. On the surface they are very similar, but there are differences, and having something explained by two different documents can sometimes give you a better understanding.

You may prefer to start with the "original" 386 manual because it's a lot shorter (I think around 300 pages, much of which can be skipped as it's not OS related: general application programming, instruction reference, etc.) and easier to understand. Given backwards compatibility it shouldn't give you issues, and you can then progress to the newest document to take care of all the extra stuff there is.

Also note that there are about a trillion versions of these documents, so if you are going to read the newer ones, make sure to get the newest. I once had trouble finding the SYSCALL instruction in AMD's manual until I realized it was quite an old version of the spec, so the instruction wasn't there at all.

Finally, there are specs available for almost all of this stuff (GPUs sometimes not), so get used to reading them. The Wiki is there only to give tips and help a bit; you should almost always refer to the specs/docs instead.

GDT and Paging
