Why code and data segment starts from same address

0xnlbs · Post by **0xnlbs** » Tue Apr 22, 2014 12:02 am

I've checked a lot of guides on both bootloaders and kernels. Everywhere, they create two GDT entries, one for the code segment and another for the data segment, and load them into the CS and DS registers. But both descriptors specify the same base address and the same limit. Why? If they are supposed to be the same, why are there two different segment registers?

alexfru · Post by **alexfru** » Tue Apr 22, 2014 12:22 am

You must have at least 2 distinct segment descriptors in protected mode because code and data segments are described differently in those descriptors and you can't use a code segment descriptor for data or a data segment descriptor for code (generally, there are a few exceptions). So, you create 2 descriptors and set CS to point to the one designated for code and set DS, ES and SS to point to the other one, designated for data.

As for why they usually have the same base address (often 0) and limit (often equivalent of 4G-1), it simply means that they don't want to mess with segmentation, they just want to use all memory however they please and locate code and data anywhere in the entire address space.

Combuster · Post by **Combuster** » Tue Apr 22, 2014 12:37 am

> If they are supposed to be the same

In principle, they are not. It is simply a very common special case, and even modern OSes, while using it as a basis, don't strictly do it that way.

One of the fundamental security principles says that something should not be both writeable and executable. In the GDT design you see that as separate kinds of entries: code is always executable, never writeable, and possibly readable; data is always readable, never executable, and possibly writeable.

Then it so happens that x86 has two forms of protection: segmentation and paging. People disliked segmentation so they mostly went for paging. But since segmentation is always enabled they wanted to make sure it didn't get in the way, so what people typically do is create an entry that maps whatever address comes in to the same address going out so that the only visible effect is paging. Then they realize they can't have both executable and writeable permissions with just one entry, so they make two, stick one in CS and the other in the other segment registers.

0xnlbs · Post by **0xnlbs** » Tue Apr 22, 2014 12:51 am

Still not very clear to me. What will happen if I put two different base addresses for the code segment and the data segment, e.g. so that they don't overlap? I understand I may not be able to use the full 4 GB of virtual memory.
The question may sound very naive, but I should ask: executables have their code and data in the same file, and while executing, instructions don't get copied anywhere into different segments. So why two segments? The CPU will keep reading the instructions sequentially. What is the problem with that?

Combuster · Post by **Combuster** » Tue Apr 22, 2014 1:23 am

> executables have their code and data in the same file

And this is relevant how? Do you want to have to download several .exe files to be able to run a program?

> while executing, instructions don't get copied

You have to copy the instructions from storage to RAM before you can execute them.

And because you are making hardly any sense now, let's check on the basics:
- what registers and table entries are used to determine what instruction is executed next? Describe how.
- what registers and table entries are used to actually execute an instruction? Describe how this works for "push [eax]"

OSDev.org
