I'll address the first question last, since answering the others first will help it make more sense.
The first thing you have to realize is that a label isn't really like a function name or a goto label in C or Pascal. A label is nothing more than a name for a particular address, which the assembler calculates during its first pass. As an example, I've assembled the printst.asm file so that it generated a list file (using the command line "nasm printst.asm -l printst.lst"), which shows the details of the assembly. Here's the part of it where the code generation begins:
16 entry:
17 00000000 EA[0500]0000 jmp base:start ; make sure that the CS is 0000
18
19 ; the real start of the code
20 start:
21 00000005 8CC8 mov ax, cs
22 00000007 8ED8 mov ds, ax ; set DS == CS
23 00000009 B80090 mov ax, stackseg ; set the stack to an arbitrary free area
If you look at the generated code on line 17, it shows that the actual output for address 00000000 is "EA[0500]0000", which is equivalent to
JMP 0000:0005
(Since the x86 is a little-endian processor, the assembler stores the low byte of each word first, and the offset before the segment, which is why 0005 shows up as 0500 in the listing.) Now if you look at address 00000005, it's at program line 21, which is the first line of code immediately after the label 'start'. In other words, the value of 'start' was automatically computed to be 0005, and that is what the assembler put in its place.
When you use a D[B|W|D] or RES[B|W|D] directive, the 'variable names' are actually labels, no different from the ones in the code; indeed, you can have a DB statement without any label at all. All that RESB and its relatives do is set aside a certain number of memory locations; DB and so forth are the same, except that they also initialize the memory to whatever you happen to put in it. All the label does is give a name to the address of the first of those locations. The label and the storage are completely separate, even though they are logically a single unit.
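To make that concrete, here is a small sketch of labeled and unlabeled data. The names msg, buf and after are made up for illustration, and it assumes a flat binary built with something like "nasm -f bin" with an origin of 0:

```nasm
        bits 16
        org 0

msg:    db 'Hi', 0      ; 'msg' names offset 0, the address of the 'H'
        db 0x55, 0xAA   ; no label at all: two more initialized bytes (offsets 3 and 4)
buf:    resb 8          ; 'buf' names offset 5; 8 bytes set aside, nothing stored in them
                        ; (in a flat binary NASM may warn that it is zero-filling this space)
after:                  ; a label on a line by itself just names the next address (offset 13)
```

Deleting any of the three labels would not change a single byte of the output; only the names would disappear.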
BTW, these should not be confused with equates, which are names for the specific values they are assigned. The names created with the EQU directive are not labels, and don't allocate memory. The same goes for macro and struc definitions as well (in fact, in NASM struc is itself implemented as a set of macros). Don't worry too much about this point for now; what you already need to know is confusing enough. ::)
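Here is a rough side-by-side of an equate and a label. The value 0x9000 for stackseg is taken from the listing above (B80090); the name count and its contents are made up, and the last MOV assumes DS points at the segment this code is loaded into:

```nasm
        bits 16
        org 0

stackseg equ 0x9000         ; equate: a name for the number 0x9000, emits no bytes

        mov ax, stackseg    ; AX = 0x9000, the equate's value
        mov bx, count       ; BX = the address that 'count' names
        mov cx, [count]     ; CX = the word stored at that address (0x1234)
        hlt

count:  dw 0x1234           ; label: names the spot where these two bytes live
```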
The point I'm trying to make is that there is no direct connection between a label and how it is used. Any label can be used anywhere an address can go, or rather, anywhere a 16-bit (or 32-bit, in p-mode or unreal mode) immediate value can go. The assembler doesn't try to warn you or prevent you from doing that, or otherwise try to save you from yourself. The assembler assumes that if you want to multiply a 16-bit address by the first two bytes of a string variable, then, well, you're the human being, by gosh, and who is it to stand in your way? (This is what's called "giving you enough rope to hang yourself with.")
It usually only stops if there is something it simply cannot make sense of, such as trying to mov a 16-bit register value into an 8-bit one. It will often warn you of really foolish things, and you can set option switches to have it give you more or fewer warnings, but it won't give an error for anything that it can actually assemble.
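As an illustration of how much rope you get, the sketch below should assemble without complaint even though one instruction in it is nonsense. The labels msg and msg_end are made up, and a flat binary with an origin of 0 is assumed:

```nasm
        bits 16
        org 0

        mov ax, msg             ; the address of msg, used as a plain 16-bit immediate
        mov cx, msg_end - msg   ; labels are just numbers, so this is the string length (5)
        mul word [msg]          ; AX (an address) times the first two bytes of the string:
                                ; meaningless, but perfectly legal
        ; mov al, bx            ; this one NASM rejects: 16-bit source, 8-bit destination
        hlt

msg:    db 'Hello'
msg_end:
```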
One other thing you should understand is indirect addressing, which is more or less like pointer dereferencing. When an argument in NASM has square brackets around it, then the argument is treated as an address which is used to look up a value. So, for one example,
mov al, [es:di]
means "get the value at the address held in ES:DI and put it in AL". Keeping track of which values to use as immediate arguments and which to use as reference arguments can be a real pain.
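A quick sketch of the difference, using a made-up label called value. It assumes DS points at the segment this code is loaded into, and that ES:DI has already been set up by earlier code:

```nasm
        bits 16
        org 0

        mov si, value       ; no brackets: SI = the address that 'value' names
        mov cx, [value]     ; brackets: CX = the word stored at that address (0x1234)
        mov bx, [si]        ; same word again, reached through the pointer in SI
        mov al, [es:di]     ; the byte at ES:DI, with ES overriding the default DS
        hlt

value:  dw 0x1234
```

A rule of thumb that seems to help: no brackets means "the number itself", brackets mean "whatever is stored at that number".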