Writing a Interpreter

Programming, for all ages and all languages.
beyondsociety

Writing a Interpreter

Post by beyondsociety »

I am starting to write a compiler, and someone on this forum suggested creating a bytecode interpreter to help with building it.

How would I go about writing a simple interpreter?
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

a nice big switch statement? :)

you need to decide if you're going to output pcode or real cpu code.

a register based system is the easiest to write an interp for.

when i get home from work I'll post some info on my interp I have.
-- Stu --
AGI1122

Re:Writing a Interpreter

Post by AGI1122 »

Sarien? Or are you guys talking about something completely different?
Perica
Member
Posts: 454
Joined: Sat Nov 25, 2006 12:50 am

Re:Writing a Interpreter

Post by Perica »

..
Last edited by Perica on Sun Dec 03, 2006 9:12 pm, edited 1 time in total.
Beyond Infinity lazy

Re:Writing a Interpreter

Post by Beyond Infinity lazy »

Nay, Perica, they are talking about compilers and interpreters for HLLs (high-level languages). This can easily be inferred from beyondsociety's inquiry.
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

below is an old core from one of my interpreters.
it was a register-based machine.
basically each instruction was a 32-bit number:

the top 8 bits were the opcode, the next 8 were register 1, then the next 8 were register 2. the remaining bits were some status bits.

all opcodes ran register to register, with the exception of the load/store opcode.

it ran 'compiled' scripts in a 64kb data block (all code/text etc. had to sit inside 64kb. that was my restriction, since this version didn't have a VM doing memory interfacing in it, which my current one has).

anyway, this gives you an idea of what code my compiler output. (I cut some commands out, since my message was too long)

Code: Select all

void run_opcode(vContext *vCPU)
{
   UINT32   op;
   UINT32   r1, r2, r3, r4;

   /* fetch the whole 32-bit instruction word; REG_IP is a byte offset */
   op = ((UINT32*)vCPU->ptrMem)[ vCPU->cpu.reg[ REG_IP ] / sizeof(UINT32) ];

   switch(GET_OPCODE(op))
   {
      case op_NULL:
         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;

      case op_ADD:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ];
         r2 = vCPU->cpu.reg[ GET_REGB(op) ];
         vCPU->cpu.reg[ GET_REGA(op) ] = r1 + r2;
         
         if( r1 > vCPU->cpu.reg[ GET_REGA(op) ])
            vCPU->cpu.reg[ REG_FLAGS ] |= FLAG_OVERFLOW;
         else
            vCPU->cpu.reg[ REG_FLAGS ] &= ~FLAG_OVERFLOW;

         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;

      case op_SUB:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ];
         r2 = vCPU->cpu.reg[ GET_REGB(op) ];
         vCPU->cpu.reg[ GET_REGA(op) ] = r1 - r2;
         
         if( r1 < vCPU->cpu.reg[ GET_REGA(op) ])
            vCPU->cpu.reg[ REG_FLAGS ] |= FLAG_OVERFLOW;
         else
            vCPU->cpu.reg[ REG_FLAGS ] &= ~FLAG_OVERFLOW;

         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;

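      case op_MUL:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ];
         r2 = vCPU->cpu.reg[ GET_REGB(op) ];
         vCPU->cpu.reg[ GET_REGA(op) ] = r1 * r2;
         /* compute if overflow */

         /* advance to the next instruction, as every other opcode does */
         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;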

      case op_DIV:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ];
         r2 = vCPU->cpu.reg[ GET_REGB(op) ];
         vCPU->cpu.reg[ GET_REGA(op) ] = r1 / r2;
         vCPU->cpu.reg[ GET_REGB(op) ] = r1 % r2;
         /* compute if overflow; note a zero divisor (r2 == 0) is not checked here */

         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;

      case op_CMP:
         if( vCPU->cpu.reg[ GET_REGA(op) ] == vCPU->cpu.reg[ GET_REGB(op) ] )
            vCPU->cpu.reg[ REG_FLAGS ] |= FLAG_EQUAL;
         else
            vCPU->cpu.reg[ REG_FLAGS ] &= ~FLAG_EQUAL;

         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         break;

      case op_MOV:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ]; // dest (register contents, used as an address)
         r2 = vCPU->cpu.reg[ GET_REGB(op) ]; // source

         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);

         // source
         // NOTE: the encoding gives each operand its own z/q bits, but as posted
         // both switches test GET_MOD1(op); the macros themselves are not shown.
         // Addresses and REG_IP are byte offsets, hence the / sizeof(UINT32).
         switch( GET_MOD1(op) )
         {
            // mov r1, r2
            case 0:
               r4 = r2;
               break;

            // mov r1, [r2]
            case MOD_MEM:
               r4 = ( ((UINT32*)vCPU->ptrMem)[ r2 / sizeof(UINT32) ] );
               break;
            
            // mov r1, 0xDEADBEEF
            case MOD_NUM:
               r4 = ( ((UINT32*)vCPU->ptrMem)[ vCPU->cpu.reg[ REG_IP ] / sizeof(UINT32) ] );
               vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
               break;

            // mov r1, [0xDEADBEEF]
            case MOD_MEM+MOD_NUM:
               r4 = ( ((UINT32*)vCPU->ptrMem)[ vCPU->cpu.reg[ REG_IP ] / sizeof(UINT32) ] );
               vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);

               r4 = ( ((UINT32*)vCPU->ptrMem)[ r4 / sizeof(UINT32) ] );
               break;
         }

         // move r4 into { r1|[r1]|[xx] }


         // destination
         switch( GET_MOD1(op) )
         {
            // mov r1, r4
            case 0:
               // dont change reg0! (it always reads as zero)
               if( GET_REGA(op) != 0 )
                  vCPU->cpu.reg[ GET_REGA(op) ] = r4;
               break;

            // mov [r1], r4
            case MOD_MEM:
               ((UINT32*)vCPU->ptrMem)[ r1 / sizeof(UINT32) ] = r4;
               break;
            
            // mov 0xDEADBEEF, r4
            // illegal!
            case MOD_NUM:
               //r3 = ( ((UINT32*)vCPU->ptrMem)[ vCPU->cpu.reg[ REG_IP ] ] );
               //vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
               // signal illegal!
               break;

            // mov [0xDEADBEEF], r4
            case MOD_MEM+MOD_NUM:
               r3 = ( ((UINT32*)vCPU->ptrMem)[ vCPU->cpu.reg[ REG_IP ] / sizeof(UINT32) ] );
               vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);

               ((UINT32*)vCPU->ptrMem)[ r3 / sizeof(UINT32) ] = r4;
               break;
         }
         break;
         
      case op_JE:
         r1 = vCPU->cpu.reg[ GET_REGA(op) ];
         vCPU->cpu.reg[ REG_IP ] += sizeof(UINT32);
         
         // test with & (not &=) so the other flag bits are left untouched
         if( (vCPU->cpu.reg[ REG_FLAGS ] & FLAG_EQUAL) == FLAG_EQUAL )
             vCPU->cpu.reg[ REG_IP ] = r1;
         break;
   }
}
-- Stu --
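The "compute if overflow" comments in op_MUL and op_DIV are left open in the post. A minimal sketch of how the ADD check works and how a MUL check could be done, assuming 32-bit unsigned registers as in the code above. The function names are hypothetical and not part of df's interpreter:

```c
#include <stdint.h>

/* Adds a and b, storing the wrapped 32-bit sum in *out.
   Returns nonzero if the unsigned addition wrapped around.
   This is the same test the op_ADD case uses (r1 > result). */
static int add_overflows(uint32_t a, uint32_t b, uint32_t *out)
{
    *out = a + b;       /* unsigned arithmetic wraps modulo 2^32 */
    return *out < a;    /* wrapped iff the sum is smaller than an operand */
}

/* Multiplies in 64 bits so the upper half reveals overflow.
   Stores the low 32 bits in *out, as the op_MUL case would. */
static int mul_overflows(uint32_t a, uint32_t b, uint32_t *out)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    *out = (uint32_t)wide;
    return (wide >> 32) != 0;
}
```

In the interpreter, the return value would decide whether FLAG_OVERFLOW is set or cleared in the flags register.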
beyondsociety

Re:Writing a Interpreter

Post by beyondsociety »

df: what CPU is this snippet of code for? Just wondering, because it looks familiar.

Also, I would like to take a look at all the code for this interpreter you wrote, to get an idea of what an interpreter consists of.

What's the difference between pcode and real CPU code?
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

well, that code is just a virtual cpu i made up. 16 registers. very simple stuff. since it's all register-to-register operations, the implementation is really basic.

i guess there isn't a great deal of difference between pcode and real cpu code in a lot of ways. the original 'pcode' was for a pascal compiler back in the 70's.

you could have code in your pcode for list->next, list->prev, or encode really complex stuff into an operand, etc.

i don't know if there are any hard and fast rules for pcode. VB4 and earlier used to compile to pcode.

these kinds of 'cpus' are really simple to construct.
-- Stu --
beyondsociety

Re:Writing a Interpreter

Post by beyondsociety »

Do you have an opcode list or instruction set for this made up virtual pc?
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

yeah I have an opcode list.

Code: Select all

/*

opcodes

00 - null
01 - add
02 - sub
03 - mul
04 - div
05 - cmp
06 - mov
07 - je
08 - xor
09 - and
10 - or
11 - not
12 - neg

16 regs

'' all reg, reg

push =
mov   [r13],r1
sub r13, 4

pop =
add r13, 4
mov r1,[r13]


xxxxxxxx 00zq1111 00zq2222 bbbbbbbb

x = opcode   
1 = reg1
2 = reg2
z = numeric flag
    0 - no num
    1 - 32bit num follows
q = memory flag
   0 - reg
   1 - memory
b = undefined.   

reg0 is always ZERO
reg13 is SIP
reg14 is flags
reg15 is IP

*/
-- Stu --
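The GET_OPCODE, GET_REGA, GET_REGB and GET_MOD1 macros used in the earlier code are not shown in the thread. A minimal sketch of a decoder for the `xxxxxxxx 00zq1111 00zq2222 bbbbbbbb` layout, assuming the opcode sits in the top 8 bits of the word as described earlier. The `Decoded` struct and `decode` function are hypothetical names, not df's:

```c
#include <stdint.h>

/* Hypothetical decoded form of one 32-bit instruction word. */
typedef struct {
    uint8_t opcode;       /* x bits */
    uint8_t reg1, reg2;   /* register numbers 0..15 */
    uint8_t num1, mem1;   /* z (numeric) and q (memory) flags for operand 1 */
    uint8_t num2, mem2;   /* z (numeric) and q (memory) flags for operand 2 */
} Decoded;

static Decoded decode(uint32_t word)
{
    Decoded d;
    uint8_t f1 = (uint8_t)(word >> 16);   /* 00zq1111 */
    uint8_t f2 = (uint8_t)(word >> 8);    /* 00zq2222 */

    d.opcode = (uint8_t)(word >> 24);     /* top 8 bits */
    d.reg1 = f1 & 0x0F;
    d.mem1 = (f1 >> 4) & 1;
    d.num1 = (f1 >> 5) & 1;
    d.reg2 = f2 & 0x0F;
    d.mem2 = (f2 >> 4) & 1;
    d.num2 = (f2 >> 5) & 1;
    return d;                             /* low 8 bits (b) are undefined and ignored */
}
```

Under these assumptions the word 0x06012000 decodes as a mov into r1 with a 32-bit number following, and 0x061D0100 decodes as `mov [r13], r1`, the first half of the push sequence in the list.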
beyondsociety

Re:Writing a Interpreter

Post by beyondsociety »

How does the code you posted look up the opcodes?

Do I have to put a list of opcodes into a buffer or table and load it first, before I run the get_Opcode function?
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

what??

i load my 'binary' into memory, set up my cpu registers...
and run it?

i don't get your question. i know opcode 1 is add.. so I run the add function when I get to it...

i allocate, say, 64kb

Code: Select all

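unsigned char *x;        // unsigned, so opcodes above 0x7F don't go negative

x = malloc(1024*64);     // 64kb holds all code and data
read_into_file(x);       // helper (not shown) that loads the compiled binary into x

// ok, buffer x contains my code..

// 'reg' rather than 'register', which is a reserved word in C
opcode = x[ reg[instruction_pointer] ];

switch(opcode)
{
   // one case per opcode, e.g. opcode 1 runs the add code
}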
maybe i'm reading your question wrong, I dunno.
-- Stu --
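To make the "load the binary, set up the registers, run it" idea concrete, here is a minimal self-contained sketch of such a loop. It only assumes the opcode numbers from the list (0 = null, 1 = add, 2 = sub), reg15 as IP, and the 64kb block. The `Vm` struct, the function names, and the choice to store instruction words most significant byte first are all assumptions of this sketch, not df's code:

```c
#include <stdint.h>
#include <string.h>

enum { OP_NULL = 0, OP_ADD = 1, OP_SUB = 2 };   /* numbering from the opcode list */
enum { REG_IP = 15, MEM_SIZE = 64 * 1024 };

typedef struct {
    uint32_t reg[16];
    uint8_t  mem[MEM_SIZE];   /* code and data share one 64kb block */
} Vm;

/* Read one 32-bit word stored most significant byte first (an assumption). */
static uint32_t fetch(const Vm *vm, uint32_t at)
{
    const uint8_t *p = &vm->mem[at];
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Copy the program into VM memory and run until IP reaches its end. */
static void run(Vm *vm, const uint8_t *code, uint32_t len)
{
    if (len > MEM_SIZE)
        return;                          /* program must fit in the 64kb block */
    memcpy(vm->mem, code, len);
    vm->reg[REG_IP] = 0;

    while (vm->reg[REG_IP] + 4 <= len) {
        uint32_t op = fetch(vm, vm->reg[REG_IP]);
        uint32_t a  = (op >> 16) & 0x0F; /* register 1 */
        uint32_t b  = (op >> 8)  & 0x0F; /* register 2 */

        switch (op >> 24) {              /* the "nice big switch statement" */
        case OP_ADD: vm->reg[a] += vm->reg[b]; break;
        case OP_SUB: vm->reg[a] -= vm->reg[b]; break;
        default:     break;              /* OP_NULL and unknown opcodes do nothing */
        }
        vm->reg[REG_IP] += 4;            /* every instruction here is one word */
    }
}
```

There is no opcode table to load: the "lookup" is just the switch on the top byte of each fetched word. A program of the bytes `01 01 02 00` followed by `00 00 00 00` is `add r1, r2` and then a null instruction.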
df
Member
Posts: 1076
Joined: Fri Oct 22, 2004 11:00 pm

Re:Writing a Interpreter

Post by df »

I uploaded all the code for the interpreter I used in my examples above.

There are probably lots of bugs in it, but it's fairly simple.
unrar it into a directory and run virt.exe; it will load and run x1.bin (a small test file).

ftp://ftp.mega-tokyo.com/pub/my_stuff/temp/int.rar

it's designed to work regardless of the endianness of its host cpu.

I compiled it under VC6, but there is nothing fancy in it, so it should compile under pretty much any 32-bit compiler...
-- Stu --
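One common way to get that kind of endianness independence is to never cast the byte buffer to a 32-bit pointer, and instead assemble words from individual bytes. A minimal sketch, assuming VM words are stored most significant byte first; whether the uploaded code does it this way, and which byte order it uses, is not stated in the thread. `load32` and `store32` are hypothetical names:

```c
#include <stdint.h>

/* Store v at p, most significant byte first, whatever the host byte order. */
static void store32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load the 32-bit word that store32 wrote at p. */
static uint32_t load32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Because only shifts and byte accesses are used, the same compiled script gives the same result on little-endian and big-endian hosts, and unaligned addresses are not a problem either.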