Instruction Set

The machine's instruction set is the interface between the physical hardware (CPU core) and the program. It embodies the operations that the core is capable of performing. This article describes the instruction set and how it relates to the operation of the core components in general terms. For the detailed state-by-state description of each instruction see the Instruction Sequences article. For more details on the design and operation of the Instruction Decoder itself, see that article.

There are three basic types of instruction in this machine; stack manipulation is listed separately below but is a form of data transfer:
 * Data Transfer - these instructions move data between data memory, the registers, other internal registers, the Data Stack and/or the Return Stack.
 * Stack manipulation - specialized instructions that directly rearrange the entries in the Data Stack. These instructions are not typically found in a machine instruction set, but this Design Model will benefit greatly from having them. They are sequences that transfer data, so they are grouped with the Data Transfer instructions.
 * ALU operations - these instructions are the commands to the ALU. (The instruction set may allow setting different input and/or output locations but generally the TOS and NOS are the inputs and TOS will hold the result.)
 * Program Flow - these instructions change the flow of instructions by manipulating the contents of the instruction pointer and/or the Return Stack. Other instructions may use the return stack as temporary storage but are not included here because they do not affect program flow (they are listed under Data Transfer instructions).

=Instruction List=

ALU instructions
The ALU instruction may be split into two types - 1 or 2 operand. The ALU function code uses 4 bits in the instruction. The 1 or 2 input selects take another 4 or 8 bits.

The ALU function code allows up to 16 instructions. They are listed below.

1-Operand Instructions

 * Increment
 * Decrement
 * Invert
 * Negate
 * L/Shift
 * R/Shift
 * L/Shift-Carry
 * R/Shift-Carry

2-Operand Instructions
 * AND
 * OR
 * XOR
 * ADD
 * SUB
 * ADD-Carry
 * SUB-Carry
 * BitTest
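
The sixteen functions listed above can be modelled in a few lines. This is only a sketch under stated assumptions: the 16-bit word width, the operand order for SUB (NOS minus TOS, as in Forth), the carry behaviour of the shift and subtract variants, and treating BitTest as an AND are all guesses for illustration and are not taken from the design.

```python
WIDTH = 16                      # assumed word width
MASK = (1 << WIDTH) - 1


def alu(op, tos, nos=0, carry=0):
    """Return (result, carry_out); result is the value that would be left in TOS."""
    if op == "Increment":
        raw = tos + 1
    elif op == "Decrement":
        raw = tos - 1
    elif op == "Invert":
        raw = ~tos & MASK
    elif op == "Negate":
        raw = -tos
    elif op == "L/Shift":
        raw = tos << 1
    elif op == "L/Shift-Carry":
        raw = (tos << 1) | carry            # carry shifts into bit 0
    elif op == "R/Shift":
        return tos >> 1, tos & 1            # bit 0 falls into carry
    elif op == "R/Shift-Carry":
        return (tos >> 1) | (carry << (WIDTH - 1)), tos & 1
    elif op == "AND":
        raw = nos & tos
    elif op == "OR":
        raw = nos | tos
    elif op == "XOR":
        raw = nos ^ tos
    elif op == "ADD":
        raw = nos + tos
    elif op == "ADD-Carry":
        raw = nos + tos + carry
    elif op == "SUB":
        raw = nos - tos                     # assumed Forth order: NOS - TOS
    elif op == "SUB-Carry":
        raw = nos - tos - carry
    elif op == "BitTest":
        raw = nos & tos                     # assumed: an AND used for its flags
    else:
        raise ValueError(f"unknown ALU function: {op}")
    # Bit WIDTH of the raw value is the carry (or borrow) out.
    return raw & MASK, (raw >> WIDTH) & 1
```

For example, `alu("ADD", 0xFFFF, 1)` wraps to zero and reports a carry, while `alu("SUB", 3, 10)` leaves 7 in TOS.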

Data Movement Instructions
These instructions move parameter values into and out of the Data Stack and other locations, either directly between the relevant "registers" or through data memory. The 'Harvard-like' memory architecture also necessitates special instructions for the case where data must be accessed within the instruction space. (General-purpose programs must never use these instructions.)

These may be broken down into 3 types as listed below.

Data Memory
Most data access is done between the data storage memory and the Data Stack, and the instruction set provides flexible addressing of the parameter location. [How to implement addressing modes? And which ones do I want? The RAM_address uses 16 bits of the instruction code. The primary code segment uses another 4. Could the addressing mode use some secondary segment bits?]


 * @RAM (RAM_address, INT_register) Takes the data from data RAM at the location specified by RAM_address and latches it into the specified internal register. This is NOT the register bank but any of several internal registers, typically the "T" register.
 * !RAM (RAM_address, INT_register) Takes the data in the specified internal register (typically the "T" register) and writes it to the data RAM at the location specified by RAM_address.

Data Stack Manipulation
Instructions are provided to directly manipulate entries in the Parameter Stack. The following are included in machine code:
 * NIP
 * DROP
 * OVER
 * DUP
 * >R
 * R>
 * >A
 * A>
 * SWAP
 * PICK(0-15)
 * >rN
 * rN>

Rather than just have the additional 'A' register, this machine will have 16 general purpose fast registers available, in addition to the physical address latch for memory (register). These instructions are included in the Register Bank section below.

This design includes a fast and temporary (fully programmable) +# in the address generator where the # can be up to 15. The decoder will send the proper actual address to the stack memory. This is used in many of the above instructions and makes the last instruction in the list the most powerful (and easy to abuse) stack manipulation word.
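
The pointer-plus-offset idea can be sketched as follows. This is an illustrative model only: the class name `DataStack`, the upward-growing pointer and the 256-word depth are assumptions, and the real decoder would generate these addresses in hardware. The sketch shows how several of the stack words above reduce to a read or write at "stack pointer + #".

```python
class DataStack:
    """Stack memory addressed as stack pointer plus a small offset (0..15)."""

    def __init__(self, depth=256):
        self.mem = [0] * depth
        self.sp = 0                       # number of entries in use

    def _addr(self, offset):
        # Offset 0 is the top of stack, 1 the next entry, and so on.
        if not 0 <= offset <= 15:
            raise ValueError("offset must fit in 4 bits")
        return self.sp - 1 - offset

    def push(self, value):
        self.mem[self.sp] = value
        self.sp += 1

    def drop(self):
        self.sp -= 1

    def pick(self, n):                    # PICK(0-15)
        self.push(self.mem[self._addr(n)])

    def dup(self):                        # DUP is PICK(0)
        self.pick(0)

    def over(self):                       # OVER is PICK(1)
        self.pick(1)

    def nip(self):                        # copy top over next, then drop
        self.mem[self._addr(1)] = self.mem[self._addr(0)]
        self.sp -= 1

    def swap(self):
        a, b = self._addr(0), self._addr(1)
        self.mem[a], self.mem[b] = self.mem[b], self.mem[a]

    def items(self):
        return self.mem[:self.sp]


s = DataStack()
for v in (1, 2, 3):
    s.push(v)
s.over()          # stack is now 1 2 3 2
s.swap()          # stack is now 1 2 2 3
s.nip()           # stack is now 1 2 3
s.pick(2)         # stack is now 1 2 3 1
```

In this model DUP and OVER are simply PICK with a fixed offset, which is why a general PICK is both powerful and easy to abuse.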

Register Bank
Instructions are provided to directly move data into and out of the register bank. The other location may be another file in the register bank, any appropriate dedicated register, or a location in main memory. Memory locations may be computed using several addressing modes, depending on the instruction. The designated register can be used directly or indirectly; double-indirect addressing may even be programmable as a single instruction sequence.

Instructions take the form of
 * GEN1->[GEN2], meaning: get the contents of the RAM location that GP_register_2 points to, then use that value as the RAM address at which to store the data held in GP_register_1.

The instructions that take a 16 bit field and a 4 bit field are limited by the instruction code width (24 bits). So that the same 4 bit field can also select the dedicated registers without needing more bits in the instruction, the number of GP registers that can be addressed in these instructions is limited to 8.
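
As a rough illustration of how a 24 bit instruction might be packed, the sketch below places a 4 bit primary code in the top bits, a 4 bit register select below it, and the 16 bit address in the low bits. The field order, and the use of the top bit of the register select to distinguish dedicated registers from the 8 addressable GP registers, are assumptions made for the example only.

```python
OPCODE_BITS, REG_BITS, ADDR_BITS = 4, 4, 16     # 24 bits in total


def reg_select(index, dedicated=False):
    """Build the 4 bit register field: 0-7 are GP registers, 8-15 dedicated (assumed)."""
    if not 0 <= index < 8:
        raise ValueError("only 8 registers of each kind can be selected")
    return (8 if dedicated else 0) | index


def encode(opcode, reg, address):
    """Pack the three fields into one 24 bit instruction word."""
    for value, bits in ((opcode, OPCODE_BITS), (reg, REG_BITS), (address, ADDR_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value out of range")
    return (opcode << (REG_BITS + ADDR_BITS)) | (reg << ADDR_BITS) | address


def decode(word):
    """Split a 24 bit instruction word back into (opcode, reg, address)."""
    return (word >> 20) & 0xF, (word >> 16) & 0xF, word & 0xFFFF


word = encode(0x3, reg_select(5), 0x1234)
print(hex(word))            # 0x351234
print(decode(word))         # (3, 5, 4660)
```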

Return Stack
The only instructions available to the system which directly affect the return stack are the following. The program flow instructions CALL and RET also alter the return stack but this is done implicitly within the instruction.

 * 0RP - Resets the Return Stack Pointer to '00'
 * @RP - Puts the value of the Return Stack Pointer into the "T" register.

Program Flow instructions
These instructions provide for control of program flow. They take address data from several possible sources and send it to the Program Counter which will latch it into the Instruction Pointer.
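
The effect of the instructions in this section on the Instruction Pointer and the Return Stack can be sketched as below. The class name `FlowUnit`, the 16-entry return stack and the convention that a not-taken JUMP simply advances to IP + 1 are assumptions for illustration; the condition test against the CC register is reduced to a value passed in by the caller.

```python
class FlowUnit:
    """Toy model of the Instruction Pointer plus Return Stack."""

    def __init__(self, depth=16):
        self.ip = 0
        self.rstack = [0] * depth
        self.rp = 0                       # Return Stack Pointer

    def noop(self):
        self.ip += 1

    def goto(self, target):
        self.ip = target

    def jump(self, cc, target):
        # cc stands for the selected condition; non-zero takes the branch.
        self.ip = target if cc else self.ip + 1

    def call(self, target):
        self.rstack[self.rp] = self.ip + 1    # return address is IP + 1
        self.rp += 1                          # the push increments RP
        self.ip = target

    def ret(self):
        self.rp -= 1                          # the pop decrements RP
        self.ip = self.rstack[self.rp]


u = FlowUnit()
u.goto(10)
u.call(100)       # return address 11 is pushed
u.noop()          # IP is now 101
u.call(200)       # return address 102 is pushed
u.ret()           # back to 102
u.ret()           # back to 11
```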


 * NOOP - does nothing
 * GOTO (target address) - unconditional redirect of program flow. Takes the target address and places it into the IP. Starts execution at that address.
 * JUMP (condition, target address) - conditional redirect of program flow. Tests the CC register for the condition specified and if non-zero takes the target address and places it into the IP. Starts execution at that address.
 * CALL (target address) - unconditional subroutine execution. Takes the output of the Program Counter (which should be IP + 1) and pushes it onto the return stack, then places the target address into the IP. The "push" of the return stack increments the return stack pointer. Begins execution at the address in IP.
 * RETURN - returns from subroutine call. Pops the value on the return stack and places it into the IP. The "pop" of the return stack decrements the return stack pointer. Begins execution at the address in IP.

=Notes=


 * One goal of this design is a hardware implementation of the Forth Virtual Machine, the 'inner interpreter'. The threading mechanism implemented here is the 'natural' choice of sub-routine threading. Subroutine calls and returns must therefore be as fast as possible. The Return Stack is tightly linked to the Program Counter so they can swap values very fast.


 * For speed, the stack depth may be limited to 16 'words'. The natural hex-base of the physical system limits the fastest addressing to 4 bits. This is not very deep for many programs. However, due to the fast nature of the 'Memory Bank' in the simulation environment, doubling the depth to 32 words costs as much speed as increasing it 16 times to 256 words. So 256 word deep stacks are as fast as 32 word deep ones.


 * Another goal is to implement enough fast registers to allow for more 'traditional' high level programming styles. The hex-base of the physical system makes for fast exchanges between 2 sets of registers, each 16 or 256 'words' deep. This is more than sufficient for most programming styles.

=Example stack machine instruction set (F21 processor):=
 * A register (T) acts as the top of the data stack. All data are placed in T; its prior contents are pushed onto S.
 * The ALU acts upon T and S, leaves its result in T and pops S for binary operations (+ -or and).
 * A register (A) is used to address data.
 * A program counter (P) is used to address instructions.
 * The return stack stores subroutine return addresses (and occasional data).
 * A configuration register (C) specifies timing and addressing options.

 JMP   unconditional jump ( 3 types, 10 bit, 14 bit, home page)
 T0    branch if TOP of stack =0 ( DUP IF ) ( 3 types)
 C0    branch if CARRY bit not set ( 3 types)
 CALL  subroutine call ( 3 types)
 RET   subroutine return
 @A+   place memory contents pointed to by register A in TOP, increment A
 @R+   place memory contents pointed to by register R in TOP, increment R
 @A    place memory contents pointed to by register A in TOP of stack
 #     literal, place immediate word into top of stack
 !A+   store TOP of stack into memory pointed to by A, increment A
 !R+   store TOP of stack into memory pointed to by R, increment R
 !A    store TOP of stack into memory pointed to by A
 COM   complement TOP of stack
 AND   AND TOP of stack with NEXT and leave result in TOP
 -OR   EXCLUSIVE OR TOP of stack with NEXT and leave result in TOP
 +     ADD TOP of stack to NEXT and leave result in TOP
 2*    left shift TOP
 2/    right shift TOP
 +*    ADD TOP of stack to NEXT and leave result in TOP, NEXT unchanged (perform add only if the least significant bit of T = 1)
 A     copy A to TOP of stack
 A!    move TOP of stack to A
 DUP   duplicate TOP of stack
 DROP  discard TOP of stack
 OVER  duplicate the second item to the TOP of Data stack
 PUSH  TOP of DATA stack to TOP of RETURN stack
 POP   TOP of RETURN stack to TOP of DATA stack
 NOP   No CPU operation

The 27 instruction codes:

 Code  Name  Description                         As Forth (where A is a variable)
 00    else  unconditional jump                  ELSE
 01    T=0   jump if T0-19 zero                  DUP IF
 02    call  push P+1 to R, jump                 :
 03    C=0   jump if T20 zero                    CARRY? IF
 04
 05
 06    ret   pop P from R                        ;
 07
 08    @R+   fetch, address in R, increment R    R @ R> 1+ >R
 09    @A+   fetch, address in A, increment A    A @ @ 1 A +!
 0A    #     fetch 20-bit in-line literal        LIT
 0B    @A    fetch, address in A                 A @ @
 0C    !R+   store, address in R, increment R    R ! R> 1+ >R
 0D    !A+   store, address in A, increment A    A @ ! 1 A +!
 0E
 0F    !A    store, address in A                 A @ !
 10    com   complement T                        -1 XOR
 11    2*    shift T, 0 to T0                    2*
 12    2/    shift T, T20 to T19                 2/
 13    +*    add S to T if T0 one                DUP 1 AND IF OVER + THEN
 14    -or   exclusive-or S to T                 XOR
 15    and   and S to T                          AND
 16
 17    +     add S to T                          +
 18    pop   pop R, push into T                  R>
 19    A@    push A into T                       A @
 1A    dup   push T into T                       DUP
 1B    over  push S into T                       OVER
 1C    push  pop T, push into R                  >R
 1D    A!    pop T into A                        A !
 1E    nop                                       NOP
 1F    drop  pop T                               DROP

Forth macros

 A! @A                @
 A! !A                !
 dup dup -or com      -1
 dup dup -or          0
 over com and -or     OR
 A! push A@ pop       SWAP
 # (com) push ;       long_jump
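
Three of the Forth macros can be checked with a tiny interpreter for just the words they use. This is a sketch under assumptions: the stack is a Python list with T as the last element, values are masked to 20 bits (the carry bit T20 is ignored), and the helper name `run` is hypothetical. It shows that `dup dup -or` pushes 0, that adding `com` gives -1 (all ones), and that `over com and -or` computes an inclusive OR.

```python
BITS = 20                       # assumed data width, carry bit T20 ignored
MASK = (1 << BITS) - 1


def run(words, stack):
    """Execute a space-separated string of F21 words on a copy of stack."""
    s = list(stack)
    for w in words.split():
        if w == "dup":
            s.append(s[-1])             # push T into T
        elif w == "over":
            s.append(s[-2])             # push S into T
        elif w == "com":
            s[-1] = ~s[-1] & MASK       # complement T
        elif w == "and":
            t = s.pop()
            s[-1] &= t                  # and S to T
        elif w == "-or":
            t = s.pop()
            s[-1] ^= t                  # exclusive-or S to T
        elif w == "drop":
            s.pop()
        else:
            raise ValueError(f"word not modelled: {w}")
    return s


print(run("dup dup -or", [5]))                        # [5, 0]
print(run("over com and -or", [0b1100, 0b1010]))      # [14]
```

The OR macro works because `a xor (b and not a)` sets exactly the bits that are set in either operand.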

Often, threaded virtual machines such as implementations of Forth have a simple virtual machine at heart, consisting of three primitives. Those are:

 * nest, also called docol
 * unnest, or semi_s (;s)
 * next

In an indirect-threaded virtual machine, the one given here, the operations are:

 next:
   *ip++ -> w
   jump **w++
 nest:
   ip -> *rp++
   w -> ip
   next
 unnest:
   *--rp -> ip
   next
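
The three primitives can be exercised with a small model. This is a sketch under assumptions: memory is a Python list, a code-field cell holds a Python callable standing in for the address of a machine-code routine, and the trailing `next` of each routine is performed by the `run` loop rather than by the routine itself. The names `VM`, `primitive`, `colon` and `halt` are hypothetical helpers for the demonstration.

```python
class VM:
    def __init__(self):
        self.mem = []          # threads and code fields
        self.rstack = []       # return stack (rp is implicit in its length)
        self.ip = 0            # instruction pointer into the current thread
        self.w = 0             # working register
        self.running = False
        self.trace = []        # filled by the demo primitives

    def next(self):
        self.w = self.mem[self.ip]      # *ip++ -> w
        self.ip += 1
        code = self.mem[self.w]         # jump **w++ (w now points at the body)
        self.w += 1
        code(self)

    def run(self, start):
        self.ip, self.running = start, True
        while self.running:             # stands in for each routine's final "next"
            self.next()


def nest(vm):                  # ip -> *rp++ ; w -> ip
    vm.rstack.append(vm.ip)
    vm.ip = vm.w


def unnest(vm):                # *--rp -> ip
    vm.ip = vm.rstack.pop()


def halt(vm):                  # hypothetical: stops the demo loop
    vm.running = False


def primitive(vm, code):
    """Lay down a one-cell word whose code field is the routine itself."""
    vm.mem.append(code)
    return len(vm.mem) - 1


def colon(vm, exit_word, *body):
    """Lay down a threaded word: code field nest, then the body, then exit."""
    addr = len(vm.mem)
    vm.mem.extend([nest, *body, exit_word])
    return addr


vm = VM()
EXIT = primitive(vm, unnest)
HALT = primitive(vm, halt)
A = primitive(vm, lambda m: m.trace.append("a"))
B = primitive(vm, lambda m: m.trace.append("b"))
inner = colon(vm, EXIT, A, B)
outer = colon(vm, EXIT, inner, A)
boot = len(vm.mem)
vm.mem.extend([outer, HALT])
vm.run(boot)
print(vm.trace)                # ['a', 'b', 'a']
```

Each thread entry is the address of a code field, and the code field in turn holds the routine to run; that double lookup is what makes the scheme indirect-threaded.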