
Re:Where do i start??

Posted: Wed Oct 09, 2002 9:29 am
by Tom
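You can use my boot sector; just put it on a floppy using partcopy.

To use partcopy, download it from John Fine's home page (search Google for "John Fine's home page").

Save and extract partcopy to C:\
Open a DOS command window (in XP) and type cd C:\

Press Enter.

Type partcopy followed by the name of my boot sector's pre-compiled boot.bin.

e.g. (I think this is the right command)
partcopy boot.bin A:\
<enter>

If partcopy rejects A:\ as a destination, the form usually quoted is "partcopy boot.bin 0 200 -f0" (source file, source offset, length in hex, first floppy drive); check the usage text partcopy prints before relying on it.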

Re:Where do i start??

Posted: Thu Oct 10, 2002 3:49 am
by grey wolf
i noticed some of the questions weren't answered in a way that satisfied me, so i'll throw in my own 3 or 4 cents.

there are three distributions of GCC you can get for windows:
http://www.delorie.com/djgpp/ - DJGPP, the most common version. it comes with everything you need to do 32-bit p-mode development; DOS or Win32.
http://mingw.sf.net/ - MinGW32, a "minimum" compiler for win32. it includes only the bare necessities to write software, so i like it best for developing my kernel.
http://www.cygwin.com/ - CygWin, a full POSIX layer for win32 applications, intended for porting unix apps to windows.

for assemblers, there are really 2 free options:
http://nasm.sf.net/ - NASM, a free Intel-syntax assembler. a lot of people like it because it uses the Intel syntax, and shun GAS (like i do).
GAS comes with all forms of GCC. it is mainly a 32-bit assembler that uses the AT&T syntax by default. however, it can use Intel syntax. put this line at the beginning of your assembly file and you can use the more familiar Intel syntax:

Code: Select all

.intel_syntax noprefix

your boot sector should begin in 16-bit real-mode assembler, because the CPU is still in real mode when the BIOS hands control to it. 32-bit code at this point may cause problems. once you have switched to protected mode, you can jump to and begin using 32-bit code if you need to.

your boot sector must then load and call a bootloader (stored on disk) or go on and load and call the kernel. the bootloader or the kernel is the best place to switch to p-mode (since you don't need BIOS ints as badly by then), then jump to the main function. to do this, create an asm file that enables the A20 line, switches to pmode, then jumps to your kernel's main function, and link that as the FIRST file in your kernel (important). then your kernel can do the fun stuff.

that's where i've begun, and i'm still working (hard) on the boot sector.
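
On the A20 point above, here is a rough C sketch of why the line has to be enabled before the kernel uses memory above 1 MiB. The function names are made up for illustration; the only things taken as given are the standard real-mode arithmetic (segment * 16 + offset) and the fact that, while A20 is off, address bit 20 is forced to zero.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
   (Hypothetical helper name, used only for this illustration.) */
static uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The address that actually reaches memory. With A20 disabled, bit 20
   is forced to zero, so anything in the first 64 KiB above 1 MiB wraps
   around to the bottom of memory. */
static uint32_t bus_address(uint32_t physical, int a20_enabled)
{
    if (a20_enabled)
        return physical;
    return physical & ~(UINT32_C(1) << 20);
}
```

For example, FFFF:0010 translates to 0x100000, exactly 1 MiB. With A20 off that access lands on address 0 instead, which is why code that runs above 1 MiB needs A20 on first.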

Re:Where do i start??

Posted: Sun Oct 13, 2002 12:25 am
by Berserk
Ohhh man, i am prepared to work hard on an OS (so don't think i'm lazy) but Assembler...DA*N! i don't know any of it, (GO C++!!!) i find it bloody confusing and it contains a fair bit of math (not my strong point) and i just DO NOT GET HEX (Hexadecimal) do you know Assembler??

If you do, do you know where i can get a good tutorial, a 'FREE' tutorial, but not with faulty info, this means that everything in it is true (i found a tutorial that was totally fake)

P.S. I got NASM.

And also here is a C++ question (a bit off the topic, but please answer if you can):

I am learning C++, I know quite a lot, but i only know how to make Windows Console apps. Do the functions and information that i learn for Console apps apply to Windows (Window) apps??

Please help.....(and excuse the off topic question, but please if you can answer it...DO SO!!!)

Ciao;)

Re:Where do i start??

Posted: Sun Oct 13, 2002 9:18 am
by Tom
Hex is not hard, just another num system.

0-9 is normal numbers just like the 0, 1, 3.... and A = 10
B = 12
C = 13
D = 14
E = 15
F = 16

you might think that F = 15 but... in hex 0 is a number

And...you are right about the C/C++ functions for windows, they work in Win32 only.

You'll need to make your own lib for your OS, or use someone's code.

I'd make an asm boot sector for you, but i'm using Linux to program my OS ( I use win too, but not for OS dev )

Hope that helps

Re:Where do i start??

Posted: Sun Oct 13, 2002 10:41 am
by Whatever5k
A = 10
B = 12
C = 13
D = 14
E = 15
F = 16
Not right, I think you had a weak hand, there ;)
0
1
...
...
9
A = 10
B = 11
C = 12
D = 13
E = 14
F = 15

Total: 16 numbers

Re:Where do i start??

Posted: Sun Oct 13, 2002 11:28 am
by Schol-R-LEA
Nope, I'm afraid 0x0F does equal 15. You made a mistake at "b= 12"; you must have skipped 11 by mistake. Here's a longer version of that table, with octal and binary conversions for comparison:

[tt]
Conversions
-------------
Decimal   Hexadecimal   Octal   Binary
0         0             0       0
1         1             1       1
2         2             2       10
3         3             3       11

4         4             4       100
5         5             5       101
6         6             6       110
7         7             7       111

8         8             10      1000
9         9             11      1001
10        A             12      1010
11        B             13      1011

12        C             14      1100
13        D             15      1101
14        E             16      1110
15        F             17      1111

16        10            20      10000
17        11            21      10001
18        12            22      10010
19        13            23      10011

20        14            24      10100
21        15            25      10101
22        16            26      10110
23        17            27      10111

24        18            30      11000
25        19            31      11001
26        1A            32      11010
27        1B            33      11011

28        1C            34      11100
29        1D            35      11101
30        1E            36      11110
31        1F            37      11111

32        20            40      100000
[/tt]

I know that this is a really long table, and that octal is rarely used these days, but I wanted to make sure that the patterns - the relationships between hex, octal and binary numbers - were clear.

You'll probably note quite a few numerically interesting patterns. The important relationship in programming is that one octal digit represents the same range as three bits, while a hex digit is equivalent to four bits (one nybble, or half a byte). The latter is particularly useful, as it means that a full byte covers exactly the range 00 through FF hex (0 through 255 decimal). This is what makes hex nearly universal in certain usages today: it can exactly represent any number of whole bytes very compactly, simply by adding two hex digits for each byte.
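
To make the "one hex digit is one nybble" relationship concrete, here is a small C sketch. The helper names are invented for this example and are not from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Value of one hex digit, 0 through 15; returns -1 for anything else. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Split a byte into its two nybbles and write them as two hex digits. */
static void byte_to_hex(uint8_t byte, char out[3])
{
    static const char digits[] = "0123456789ABCDEF";
    out[0] = digits[byte >> 4];    /* high nybble: top four bits */
    out[1] = digits[byte & 0x0F];  /* low nybble: bottom four bits */
    out[2] = '\0';
}

/* Rebuild a byte from two hex digits: high nybble * 16 + low nybble. */
static uint8_t hex_to_byte(char high, char low)
{
    int h = hex_digit_value(high);
    int l = hex_digit_value(low);
    assert(h >= 0 && l >= 0);      /* both characters must be hex digits */
    return (uint8_t)((h << 4) | l);
}
```

Since each digit maps to exactly four bits, converting between hex and binary never needs real arithmetic, just a lookup per digit.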

However, to address Berserk's concern, this is about all the math you actually need for assembly programming; just about everything else is numeric constants which should be given symbolic names anyway, or else simple calculations which the assembler will make for you in most cases. Really complex math is almost never needed, and a four-function calculator which can convert between hex, decimal and binary should be sufficient (dc will do this, as will the 'desktop calculator' apps for Windows, Gnome and KDE). The hardest part is that occasionally you'll need to count out the number of bytes in an offset (e.g., for retrieving arguments from a C function frame), as it's easy to make fencepost errors. Keep a close eye on that, and you should do OK.
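
As an illustration of that offset counting, here is a sketch of where arguments sit in a C function frame. It assumes the common 32-bit cdecl layout with the usual "push ebp / mov ebp, esp" prologue and 4-byte arguments; other calling conventions differ, and the names below are made up for the example.

```c
#include <assert.h>

enum {
    SLOT_SIZE   = 4,  /* one 32-bit stack slot */
    SAVED_SLOTS = 2   /* saved EBP at [ebp+0], return address at [ebp+4] */
};

/* Offset from EBP of argument number `index`, counting from zero.
   The classic fencepost error is forgetting one of the two saved slots. */
static int arg_offset(int index)
{
    assert(index >= 0);
    return (SAVED_SLOTS + index) * SLOT_SIZE;
}
```

Under these assumptions the first argument is at [ebp+8], not [ebp+4], and the second is at [ebp+12].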

Re:Where do i start??

Posted: Sun Oct 13, 2002 11:39 am
by Schol-R-LEA
As for assembler being confusing, well, that can happen. Like with any new language, getting familiar with it can take time. Is there anything particular that you have trouble with? Perhaps someone can help you with it. Also, I recommend Duntemann's book (again) as the best place to start learning assembly; if you can get a copy, it should clarify a lot.

I will tell you this: the syntax is not nearly so unusual as it seems. Each instruction is really simple (well, most of them are), and each one always has either zero, one or two arguments. The main problem with learning the syntax is the different addressing modes, that is, how to tell the assembler to treat the argument. The really confusing parts come later, in trying to understand the memory model, but by the time you're ready to deal with that, you should know quite a bit about assembly overall.

If all else fails, perhaps you might like TERSE, which is an abstract assembler that strongly resembles C. I never thought much of it personally, but some people swear by it and I don't know of any grave flaws in it (though I think it tends to mask the real structure of a program as it exists in memory, which a conventional assembly program matches very closely). I suggest that you take the puffery on the TERSE website with a certain quantity of sodium chloride; similar tools do exist, and which of them, if any, appeals to you is your call.

As far as all the complaining about AT&T syntax that some of the others have been doing - well, I find Intel syntax more familiar, and use NASM myself, but I still recommend learning how to work with AT&T assemblers like gas. It is the standard on nearly every platform except the PC, and is found with nearly every version of Unix or its descendants. If you ever need to write assembly code on something other than a PC, that's what you'll need to deal with. Also, if you have to use inline code in gcc, that's what you'll need to use, too. Preferences are one thing, but it doesn't hurt to be familiar with the other alternatives.

Re:Where do i start??

Posted: Sun Oct 13, 2002 3:56 pm
by Tom
I DO know that F = 15, just did a typo

Re:Where do i start??

Posted: Sun Oct 13, 2002 8:41 pm
by Tom
ok, maybe it wasn't a typo ( half asleep while typing the hex things ;) )

ty for correcting ;D

Re:Where do i start??

Posted: Mon Oct 14, 2002 12:34 am
by Pype.Clicker
Something that might seem confusing when starting with assembler is the fact that you have to handle a few things that are automatic in C/Pascal/C++ ... programming:
  • Think about your computer as a system that has a given state (the registers) on which instructions will apply. In an HLL, you just type "y=x*x - 2*x + 1" and the compiler will hide almost everything but the result of your computation, which is stored in "y".
    In asm, you'll have to go through a series of small steps that gently lead to that final state (using the fact that x*x - 2*x = x*(x-2)), like
    "1. load X into working register A
    2. subtract 2 from the value of A and store the result into working register B
    3. multiply A by B and keep the result in A
    4. add 1 to A
    5. store the result in memory at location Y.
    "

    or

    Code: Select all

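    mov eax,[X]        ; eax = X
    lea ebx,[eax-2]    ; ebx = X - 2 (lea only computes the value, no memory access)
    imul eax, ebx      ; eax = X * (X - 2)
    inc eax            ; eax = X*X - 2*X + 1
    mov [Y],eax        ; store the result in Y (labels are case-sensitive in NASM)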
    
    of course, before writing this, you need to know a few things about what your CPU is able to do ...
  • remember that in low-level programming you also have the task of telling the assembler what memory will be used for variable storage. This means that the code above won't work if you don't
    have something like

    Code: Select all

    X: resd 1          ; reserve one doubleword (4 bytes) for X
    Y: resd 1          ; reserve one doubleword (4 bytes) for Y
    
    in your code. And this is only for global variables. Such variables are the easiest to use (if you want local variables, then you'll have to allocate them by hand on the stack or in registers) ...
  • it is not recommended to start coding from a blank screen even in C/Pascal: you had better have thought a bit about your solution before implementing it. In assembler, this is *mandatory*: you absolutely *can't* just sit down, open a notepad and write down code if you want to perform a slightly complicated function, because there are many things to be taken into account. i *strongly* suggest you first draw a structure diagram and a flowchart of your solution, identify what data is to be kept (i.e. what's your program state), etc. before you type a "nop" ...
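
For readers coming from C or C++, the register-by-register steps from the first point can be mirrored in C, one statement per instruction. This is only a sketch: the local variables a and b stand in for EAX and EBX, and the globals X and Y play the role of the reserved doublewords.

```c
#include <stdint.h>

static int32_t X;   /* global storage, like "X: resd 1" */
static int32_t Y;   /* global storage, like "Y: resd 1" */

/* Computes Y = X*X - 2*X + 1 the way the assembly does it,
   using the rewrite X*X - 2*X = X * (X - 2). */
static void compute_y(void)
{
    int32_t a = X;       /* mov  eax, [X]     */
    int32_t b = a - 2;   /* lea  ebx, [eax-2] */
    a = a * b;           /* imul eax, ebx     */
    a = a + 1;           /* inc  eax          */
    Y = a;               /* mov  [Y], eax     */
}
```

A compiler given "Y = X*X - 2*X + 1;" is free to pick a different instruction sequence; the point is only that every intermediate value has to live somewhere explicit.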

Re:Where do i start??

Posted: Wed Oct 16, 2002 5:50 am
by Berserk
Thank you soooo much,

i read a bit about hex in my C++ programming book(s)
Explains it quite well, but your info helped as well. as for octal - there is no way i am even going to try and understand it (unless COMPLETELY,
i repeat, unless COMPLETELY! necessary) and for binary, well - i get some of it, i'll understand more as time goes by!

For assembler, show me the basic layout of a program, like when you start in C++, something like this:

#include <iostream>

int main(int arg, char* args[])
{
    std::cout << "Hello World" << std::endl;
    return 0;
}

if it is not too much trouble, but could you write the source code for a simple assembler program, so i can start to learn other stuff (BUILD ON THIS FRAME OF A PROGRAM) improve my skills. You see - I NEED something to start with, if you get what i'm saying ;)

please help.

ciao.

Re:Where do i start??

Posted: Wed Oct 16, 2002 8:52 pm
by Schol-R-LEA
[attachment deleted by admin]

Re:Where do i start??

Posted: Wed Oct 16, 2002 9:12 pm
by Schol-R-LEA
[attachment deleted by admin]

Re:Where do i start??

Posted: Wed Oct 16, 2002 9:54 pm
by Schol-R-LEA
I forgot to discuss jumps, the stack, function calls and interrupts in the last message, so here it is:
  • I overlooked a few registers earlier, including one very important one: the instruction pointer, or IP. IP holds the address of the next instruction that is to be performed. Every time the system finishes an operation, it fetches the opcode pointed to by IP, and increments IP by the size of the opcode; if the opcode expects an argument, it then fetches the following bytes as the argument, and increments IP again by the size of the argument(s).

    This means that flow control is nothing more than resetting the IP to a new address. The instruction for this, JMP foo, is the equivalent of a GOTO; IP is set to whatever the address of foo is, and the program continues from there. The conditional jumps (JC for Jump on Carry, JNZ for Jump when Not Zero, JLE for Jump when Less than or Equal, etc.) reset it depending on the value of specific flag bits in the FLAGS register - such as Carry, Zero, Overflow, etc., which are set or cleared by different operations (e.g., "SUB Foo, Foo" sets the Zero flag). Fortunately, most of these are fairly easy to understand once you get the basic concept.
  • Another register I forgot to mention, SP, has a different special purpose: it points to the top of the stack, a data structure that the CPU has built-in support for. The hardware stack is nothing more than a space in memory (defined by each program, or by the operating system, depending) and a pointer to a particular address in it (the SP). The SP is manipulated by two instructions, PUSH and POP (there are other instructions that affect it, but these are the two basic ones, and all the others are special cases of these). On the x86 the stack grows downward, toward lower addresses. When a 'PUSH foo' instruction is performed, it decrements the SP by the size of foo, and copies the value of foo to the address that SP now points to. The "POP foo" instruction does the reverse: it dereferences SP, copies the value into foo, and increments SP by the size of foo.
  • One of the special cases of PUSH/POP is the function call instruction, CALL, and the function return, RET. "CALL foo" pushes the address of the instruction after it, then jumps to address foo, while RET pops an address off the stack and jumps to it. This allows functions to return to a caller anywhere, or even call themselves recursively.
  • Interrupts are events which cause the CPU to stop what it is doing, look up the program (called an interrupt handler) that is called when that specific interrupt happens, run it, and then return to where it was when the interrupt occurred. In real mode, each interrupt has a number, 00h through FFh, and the CPU uses a special lookup table called the Interrupt Vector Table (IVT) which has the addresses of all the interrupt handlers. This table is partially initialized at bootup by the BIOS, but the operating system usually modifies it later to add additional facilities. Interrupts have a special version of RET, IRET, for returning to the original program state.

    There are two types of interrupts, hardware interrupts and software interrupts. Hardware interrupts are used to signal events occurring outside of the CPU (e.g., a key press on the keyboard). Software interrupts are mostly used to trigger BIOS routines or system calls; because these do not have a fixed address, the BIOS and the OS use the interrupts as a 'handle' so that user programs can always call them through the same interrupt without knowing where they actually are in memory. The BIOS, and MS-DOS, often use one interrupt to vector several different functions. Which function is actually called is usually determined by the arguments passed in the AH and AL registers.

    HTH. Comments and corrections welcome, as always.
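
A toy model of the stack, CALL and RET points above, written in C for anyone more comfortable there. It is a sketch, not an emulator: the Machine type and all function names are invented for the example, values are 16 bits wide as in real mode, and the stack is assumed to grow downward as it does on the x86. The return address is passed to call() by hand because the model does not decode instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

typedef struct {
    uint8_t  memory[STACK_BYTES];  /* the space set aside for the stack */
    uint16_t sp;                   /* offset of the current top of stack */
    uint16_t ip;                   /* instruction pointer */
} Machine;

static void machine_init(Machine *m)
{
    memset(m->memory, 0, sizeof m->memory);
    m->sp = STACK_BYTES;           /* empty stack: SP just past the end */
    m->ip = 0;
}

/* PUSH: move SP down by the operand size, then store at the new SP. */
static void push16(Machine *m, uint16_t value)
{
    assert(m->sp >= 2);            /* no room left: stack overflow */
    m->sp -= 2;
    m->memory[m->sp]     = (uint8_t)(value & 0xFF);  /* little-endian */
    m->memory[m->sp + 1] = (uint8_t)(value >> 8);
}

/* POP: read the value at SP, then move SP back up. */
static uint16_t pop16(Machine *m)
{
    assert(m->sp + 2 <= STACK_BYTES);  /* nothing to pop: underflow */
    uint16_t value = (uint16_t)(m->memory[m->sp] |
                                (m->memory[m->sp + 1] << 8));
    m->sp += 2;
    return value;
}

/* CALL: push the address of the following instruction, then jump. */
static void call(Machine *m, uint16_t target, uint16_t next_instruction)
{
    push16(m, next_instruction);
    m->ip = target;
}

/* RET: pop the saved address back into IP. */
static void ret(Machine *m)
{
    m->ip = pop16(m);
}
```

Pushing 0x1234 and then 0xBEEF and popping twice returns 0xBEEF first, and SP ends up where it started. Nested calls work for the same reason: each RET picks up the most recently pushed return address.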

Re:Where do i start??

Posted: Wed Oct 16, 2002 10:21 pm
by Schol-R-LEA
Oh, hell.....

I forgot the most important principle, the one that, if you don't get from the start, you won't make any sense out of the rest of this:

In NASM, when an operation has two values, and produces a result, the result is always stored in the first argument.

This means that
[tt]
NASM            C              Pascal
--------------------------------------------------------------
MOV FOO, BAR    FOO = BAR;     FOO := BAR;

ADD FOO, BAR    FOO += BAR;    FOO := FOO + BAR;

SUB FOO, BAR    FOO -= BAR;    FOO := FOO - BAR;
[/tt]

Note that this is one of the places Intel syntax and AT&T syntax part company, as gas writes the source first and the destination second, so the result always ends up in the last argument. The Intel syntax more closely matches what the opcodes are actually like, but AT&T syntax is the standard for most other computer systems. Which to use is a matter of preference, but you'd be better off getting used to the Intel syntax as it is what is most common on this message board. Knowing both is an even better idea, but trying to learn both at once will just confuse you :)