It’s a stack-based interpreter, and is intended as a (very simple) example of the kind of bytecode interpreter seen in dynamic languages such as Python, Ruby etc.
For the sake of simplicity, our toy virtual machine is very limited:
- The only data type is int
- It can only work on one function at a time (so that the only function call that can be made is to recurse).
- Functions can only take one parameter.
- Functions have a stack of int values.
- We’ll implement function call within the interpreter by calling a function in our implementation, rather than implementing our own frame stack.
- The parser is only good enough to get the examples to work.
Naturally, a real interpreter would be much more complicated that this.
The following operations are supported:
Operation | Meaning | Old Stack | New Stack |
---|---|---|---|
DUP | Duplicate top of stack. | [..., x] | [..., x, x] |
ROT | Swap top two elements of stack. | [..., x, y] | [..., y, x] |
BINARY_ADD | Add the top two elements on the stack. | [..., x, y] | [..., (x+y)] |
BINARY_SUBTRACT | Likewise, but subtract. | [..., x, y] | [..., (x-y)] |
BINARY_MULT | Likewise, but multiply. | [..., x, y] | [..., (x*y)] |
BINARY_COMPARE_LT | Compare the top two elements on the stack and push a nonzero/zero if (x<y). | [..., x, y] | [..., (x<y)] |
RECURSE | Recurse, passing the top of the stack, and popping the result. | [..., x] | [..., fn(x)] |
RETURN | Return the top of the stack. | [x] | [] |
PUSH_CONST arg | Push an int const. | [...] | [..., arg] |
JUMP_ABS_IF_TRUE arg | Pop; if top of stack was
nonzero, jump to
arg . | [..., x] | [...] |
Programs can be interpreted, disassembled, and compiled to machine code.
The interpreter reads .toy
scripts. Here’s what a simple recursive
factorial program looks like, the script factorial.toy
.
The parser ignores lines beginning with a #.
# Simple recursive factorial implementation, roughly equivalent to: # # int factorial (int arg) # { # if (arg < 2) # return arg # return arg * factorial (arg - 1) # } # Initial state: # stack: [arg] # 0: DUP # stack: [arg, arg] # 1: PUSH_CONST 2 # stack: [arg, arg, 2] # 2: BINARY_COMPARE_LT # stack: [arg, (arg < 2)] # 3: JUMP_ABS_IF_TRUE 9 # stack: [arg] # 4: DUP # stack: [arg, arg] # 5: PUSH_CONST 1 # stack: [arg, arg, 1] # 6: BINARY_SUBTRACT # stack: [arg, (arg - 1) # 7: RECURSE # stack: [arg, factorial(arg - 1)] # 8: BINARY_MULT # stack: [arg * factorial(arg - 1)] # 9: RETURN
The interpreter is a simple infinite loop with a big switch
statement
based on what the next opcode is:
static int toyvm_function_interpret (toyvm_function *fn, int arg, FILE *trace) { toyvm_frame frame; #define PUSH(ARG) (toyvm_frame_push (&frame, (ARG))) #define POP(ARG) (toyvm_frame_pop (&frame)) frame.frm_function = fn; frame.frm_pc = 0; frame.frm_cur_depth = 0; PUSH (arg); while (1) { toyvm_op *op; int x, y; assert (frame.frm_pc < fn->fn_num_ops); op = &fn->fn_ops[frame.frm_pc++]; if (trace) { toyvm_frame_dump_stack (&frame, trace); toyvm_function_disassemble_op (fn, op, frame.frm_pc, trace); } switch (op->op_opcode) { /* Ops taking no operand. */ case DUP: x = POP (); PUSH (x); PUSH (x); break; case ROT: y = POP (); x = POP (); PUSH (y); PUSH (x); break; case BINARY_ADD: y = POP (); x = POP (); PUSH (x + y); break; case BINARY_SUBTRACT: y = POP (); x = POP (); PUSH (x - y); break; case BINARY_MULT: y = POP (); x = POP (); PUSH (x * y); break; case BINARY_COMPARE_LT: y = POP (); x = POP (); PUSH (x < y); break; case RECURSE: x = POP (); x = toyvm_function_interpret (fn, x, trace); PUSH (x); break; case RETURN: return POP (); /* Ops taking an operand. */ case PUSH_CONST: PUSH (op->op_operand); break; case JUMP_ABS_IF_TRUE: x = POP (); if (x) frame.frm_pc = op->op_operand; break; default: assert (0); /* unknown opcode */ } /* end of switch on opcode */ } /* end of while loop */ #undef PUSH #undef POP }