I think some additional explanation may be in order, even though the Wiki page referenced earlier covers most of this.
First, to understand segment:offset addressing, you need a bit of historical context. When Intel designed the original 8086 CPU, they wanted to extend the earlier 8080 design from 8 bits to 16 bits. Since the 8080 already had a 16-bit address space (the PC and SP registers were 16-bit; the others were all 8-bit, but the B and C, D and E, and H and L registers could be paired to act as three 16-bit registers), they also wanted to enlarge the address space. However, a 32-bit address space was deemed impractical, as it would need double the number of address pins on the chip package, and in any case it was never anticipated that memories larger than 1 MiB would become available so soon afterwards (Intel still saw their CPUs mainly as microcontrollers, and didn't take the home computer market very seriously yet). So they devised a compromise, similar to one used by IBM twelve years earlier on the System/360: they set up the memory addressing as a series of overlapping segments, in which two 16-bit registers are combined to form an address space a little larger than 1 MiB (an effective 20-bit address range - there actually was a small amount of additional addressing capacity, but they only gave the chips 20 address pins, so that was effectively lost).
The memory segmentation works like this: the segment register holds a value that determines where in memory a 64K segment begins. Each segment begins at a physical address that is a multiple of 16, so each potential segment starts 16 bytes after the previous one. So, if you have a segment value of, say, 000A, it gives you a segment that starts at physical address 000A0, while a segment value of 000B would start 16 bytes higher in memory, at 000B0, and end 16 bytes after the end of the first segment.
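To make the arithmetic concrete, here is a small Python sketch of where a segment starts and ends in physical memory. The function names are my own, made up purely for illustration, and the sketch ignores what happens when a segment runs past the top of the 20-bit address range.

```python
SEGMENT_SIZE = 0x10000  # every segment spans 64 KiB


def segment_base(segment):
    """Physical address at which the given 16-bit segment value begins."""
    return (segment & 0xFFFF) << 4  # same as multiplying by 16


def segment_last(segment):
    """Physical address of the last byte of the segment (no wraparound handling)."""
    return segment_base(segment) + SEGMENT_SIZE - 1


for seg in (0x000A, 0x000B):
    print(f"segment {seg:04X}: {segment_base(seg):05X} .. {segment_last(seg):05X}")
```

The two segments printed here begin 16 bytes apart and end 16 bytes apart, so all but 16 bytes of each one is shared with the other.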
The offset register holds an address within a given segment, and as long as the segment register stays constant, it can be used just as if it were an address in a plain 16-bit address space. To get the physical address from the segment and offset, you multiply the segment value by 16 (that is, shift it left by 4 bits) and add the offset. So for the first segment value given above (000A) and an offset of 00A4, you would get 000A0 + 00A4 = 00144.
Note that the segments overlap; if we had a segment value of 0010 and an offset of 0044, it would map to the same location in physical memory, since 00100 + 0044 = 00144.
This means that you have to be careful when you have segments that overlap.
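As a quick sanity check on the overlap, the following Python sketch computes the physical address for the two segment:offset pairs mentioned above (000A:00A4 and 0010:0044) and then lists every pair that lands on that same byte. The helper names are hypothetical, and the sketch does not model the 20-bit wraparound at the top of memory.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset: segment * 16 + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


def all_aliases(physical):
    """Every (segment, offset) pair that maps to the given physical address."""
    pairs = []
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


first = physical_address(0x000A, 0x00A4)
second = physical_address(0x0010, 0x0044)
print(f"{first:05X} {second:05X}")  # prints "00144 00144"

for segment, offset in all_aliases(first):
    print(f"{segment:04X}:{offset:04X}")
```

Low addresses like this one have only a handful of aliases, because few segments start at or below them; a byte in the middle of memory can be reached through thousands of different segment:offset pairs.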
Now, the IP register is in fact a 16-bit register, which holds an offset. The corresponding segment register is CS, the Code Segment. Thus, changing CS would in effect jump you to the same offset but in a different segment, which is why the MOV instruction cannot be used to load CS. Only a far control transfer, such as a FAR JMP or FAR CALL, can change CS (the conditional jump instructions can only use relative jumps - that is to say, a SHORT Jxx instruction can move IP by -128 to +127 bytes, while a NEAR Jxx instruction, available on the 80386 and later, can move IP by -32,768 to +32,767 bytes).
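The difference between the two kinds of jump can be modelled roughly as follows. This is only a sketch with made-up function names, and it assumes the usual convention that a relative displacement is added to the IP of the instruction following the jump; the CS and IP values in the usage lines are arbitrary examples.

```python
def sign_extend8(byte):
    """Interpret an 8-bit value as a signed displacement (-128..+127)."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def short_jump(cs, next_ip, rel8):
    """SHORT jump: CS is untouched, IP moves by a signed 8-bit displacement."""
    return cs, (next_ip + sign_extend8(rel8)) & 0xFFFF


def far_jump(new_cs, new_ip):
    """FAR jump: both CS and IP are replaced outright."""
    return new_cs & 0xFFFF, new_ip & 0xFFFF


# Example values only: a short jump two bytes backwards, then a far jump.
print(short_jump(0x07C0, 0x0010, 0xFE))
print(far_jump(0x0000, 0x7C00))
```

Note that the short jump can never leave the current code segment, since it only ever touches the 16-bit offset.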
The SS is, of course, the Stack Segment, and by default the stack grows down. The Stack Pointer, SP, is an offset from SS, so each time a value is pushed onto the stack, SP is decremented by 2 (4 in 32-bit modes on later processor models), while a pop increments it by 2 (or 4). Thus, if you set SP to 0000, then push a value, it will be inserted into memory at SS:FFFE.
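The SP wraparound in that last example can be checked with a toy model of a 16-bit stack. This is just an illustrative sketch: the class name and the SS value of 9000 are my own assumptions, and memory is modelled as a sparse dictionary of bytes rather than real RAM.

```python
class Stack16:
    """Toy model of a 16-bit real-mode stack addressed through SS:SP."""

    def __init__(self, ss, sp):
        self.ss = ss & 0xFFFF
        self.sp = sp & 0xFFFF
        self.memory = {}  # sparse physical memory: address -> byte

    def _phys(self, offset):
        # The offset wraps within the 64K segment before SS * 16 is added.
        return (self.ss << 4) + (offset & 0xFFFF)

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF  # SP is decremented first
        self.memory[self._phys(self.sp)] = value & 0xFF             # low byte
        self.memory[self._phys(self.sp + 1)] = (value >> 8) & 0xFF  # high byte

    def pop(self):
        lo = self.memory.get(self._phys(self.sp), 0)
        hi = self.memory.get(self._phys(self.sp + 1), 0)
        self.sp = (self.sp + 2) & 0xFFFF  # SP is incremented afterwards
        return lo | (hi << 8)


stack = Stack16(ss=0x9000, sp=0x0000)  # hypothetical SS value
stack.push(0x1234)
print(f"SP after push: {stack.sp:04X}")  # prints "SP after push: FFFE"
```

Starting with SP at 0000 therefore gives the stack the full 64K of the segment to grow down into.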
The Data Segment (DS) is the default segment for data. In small programs - such as a boot loader - it is often set to map to the same segment as the code segment, as a way of saving memory and simplifying addressing. Whenever you access a memory location with a data instruction, such as
MOV AX, [8] ; move data from DS:0008 to AX
if there is no explicit segment override, then DS is assumed. The Extra Segment (ES) is used as an additional data segment, but has to be explicitly specified: