Peter, you might want to try and find a copy of
Assembly Language Step by Step by Jeff Duntemann; the second edition is probably the best starting book on both DOS and Linux assembly programming I know of. Among other things, it actually uses NASM for all its examples, which is unique in published books on the subject as far as I know. The explanation of segment registers and segment offsets is the best I've seen to date. The only thing I disagree with him on is the time he spent explaining NASM-IDE, but that's only because a better (though slower) editor, NASM-Edit, is now available. In any case, since he explains and uses EMACS in the Linux section, he'd have been better off simply using the DOS or Windows version of EMACS as well for consistency across the board, IMAO.
More immediately, you may want to look at a
posting about segments and segment registers that I wrote on another forum a while back. It's not great, and the formatting in particular is rather confusing, but I think it might help. Duntemann's book was a major source for it, of course.
Let me recap one more time, to make it clearer (I hope): when they first designed the 8086, the goal was to allow a total address space of 1M, while still letting the programmer ignore the higher bits and just use the 16 bits in the given segment for NEAR addressing. This was meant to make it easier for programmers to port code from the older 8080 CPUs, which had only a 16-bit address space total. Each segment would be equal to an 8080 address space, and the programmer only needed to know about the 16-bit addresses; data, stack and code could all be treated as one 64K area. This is what was called the 'tiny' memory model. However, if a programmer needed more memory, he could use the segments to add more space, still without worrying about the segment registers directly, by giving fixed segments for data (DS) and stack (SS) spaces; data references would always be relative to the DS register, while stack references would always be relative to the SS register. In fact, this was even true in the tiny model, but there, the CS, DS, and SS registers were set to the same value, so that the segments overlapped each other exactly. They also wanted it to be possible to have segments overlap each other less exactly, for various reasons relating to system programming and memory management.
They still needed a way to get the addresses to equal 20 bits without being more than that by a substantial amount (otherwise they would be wasting addresses). The solution was to add the two values together, but with the value of the segment register shifted by 4 bits to the left first. In hex, this is the same as multiplying the segment register by 0x10 (or more simply, adding a zero after the former units place) and adding the offset to it. Thus, the address 4AF1:3114 becomes
  4AF10   Segment register <--- note the extra zero
+  3114   Segment offset
--------
  4E024   Absolute address
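The same arithmetic can be checked with a few lines of Python. This is only a sketch: the function name linear_address is my own, and it does not model what happens when the sum goes past 20 bits.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: segment shifted left 4 bits, plus the offset.

    Illustrative helper (the name is made up for this post); it does not
    model sums that exceed 20 bits.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits is the same as multiplying by 0x10.
    return (segment << 4) + offset


print(f"{linear_address(0x4AF1, 0x3114):05X}")  # 4E024
```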
One consequence of this is that segments can overlap each other; or to put it another way, the same absolute address may have several segment:offset addresses. For example, 4E024 can be expressed not only as 4AF1:3114, but also as 4E02:0004, 4E00:0024, 4000:E024, or even 3F75:E8D4, to give just a few examples. The major limitation is the size of the segment and offset registers: in real mode, they are never more than 16 bits each, hence the 64K segment limit and the 1M total address space. In 32-bit protected mode, the offsets (but not the segment registers) are extended to 32 bits, for a maximum segment size of 4G, but the segments are still there. Also, the segment registers no longer hold the segment address directly; instead they hold selectors that pick out segment descriptors in the Global Descriptor Table and the various Local Descriptor Tables, which are normally modified only while the CPU is in the kernel state (ring 0).
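To get a feel for how many real-mode segment:offset pairs land on the same absolute address, here is a rough Python sketch that lists them all. The function name aliases is hypothetical, and it only considers addresses that fit in 20 bits.

```python
def aliases(address: int):
    """Yield every real-mode (segment, offset) pair for an absolute address.

    Illustrative helper (the name is made up for this post).
    """
    if not 0 <= address <= 0xFFFFF:
        raise ValueError("address must fit in 20 bits")
    # The segment only contributes multiples of 16, so the low 4 bits of
    # the offset must match the low 4 bits of the address.
    for offset in range(address & 0xF, 0x10000, 16):
        if offset > address:
            break  # the segment would have to be negative
        segment = (address - offset) >> 4
        if segment <= 0xFFFF:
            yield segment, offset


pairs = list(aliases(0x4E024))
print(len(pairs))  # 4096
for seg, off in pairs[:3]:
    print(f"{seg:04X}:{off:04X}")
```

So for an address like 4E024, each of the pairs mentioned above is just one of 4096 equivalent spellings.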
One thing to remember is that the segment register only defines the beginning of a segment, not the segment itself. In real mode, the segment size is never explicitly defined; rather, it is implicit in the highest offset used by the program. In a program, the instruction pointer is always an offset from the code segment (CS) register; all instruction addresses and labels are relative to the segment value in CS. Similarly, as stated before, data addresses which are not explicitly referenced with a segment value are always offsets from the data segment (DS) register, and the stack pointer is always an offset from the stack segment (SS) register.
I'm rambling a bit now, so I think I'll quit. I hope that this makes some kind of sense to you, as I know it isn't very clear...