System Architecture

Modern CPU architecture designers have many options available, but many of them are not readily applicable to this design because they rely on features that this environment does not provide.

For example, common RISC designs rely on a very fast system clock. The system timing in this environment is fixed and, by comparison, extremely slow (100 Hz). Although the design requirements specify no particular computational speed, making the system as fast as possible is always a desirable goal.

On the other hand, common CISC systems require a bigger instruction code field and rely on additional hardware. Since this design uses a fixed 24-bit instruction word size, the instruction set cannot be very large. Particular attention must be paid to the instruction set design so that every instruction, together with any immediate parameter, is fetched in a single instruction fetch cycle.
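As an illustration of fitting an instruction and its immediate parameter into one fetch, here is a minimal sketch. The text fixes only the 24-bit word size; the split into an 8-bit opcode and a 16-bit immediate, and the names `encode` and `decode`, are assumptions made for this example.

```python
# Assumed field layout (not specified by the design):
# bits 23..16 = opcode, bits 15..0 = immediate parameter.
OPCODE_BITS = 8
IMM_BITS = 16
WORD_MASK = (1 << (OPCODE_BITS + IMM_BITS)) - 1  # 0xFFFFFF, one 24-bit word


def encode(opcode, imm=0):
    """Pack an opcode and an immediate into one 24-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 8 bits")
    if not 0 <= imm < (1 << IMM_BITS):
        raise ValueError("immediate does not fit in 16 bits")
    return (opcode << IMM_BITS) | imm


def decode(word):
    """Split a 24-bit instruction word into (opcode, immediate)."""
    word &= WORD_MASK
    return word >> IMM_BITS, word & ((1 << IMM_BITS) - 1)


word = encode(0x12, 0xBEEF)
print(hex(word))     # 0x12beef
print(decode(word))  # (18, 48879)
```

Under this assumed layout a full 16-bit immediate rides along with every opcode, at the cost of limiting the instruction set to 256 opcodes.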

=Design Model=

The basic design model for this project uses a modified Harvard/von Neumann memory architecture with 64K cells each for data and instructions. It uses a 16-bit-wide, 0-operand, load/store data processing path with a 256-cell-deep data stack closely bound to a fast 16-bit ALU. The 24-bit-wide instruction path uses a distributed microcoded sequencer and includes a 256-cell-deep return stack, which may be used for subroutine calls and other system needs. A general purpose register bank of 16 cells, each 16 bits wide, is included, and any cell may be used as a pointer into data RAM.
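The storage described by this model can be summarized as a simple state record. This is only a sketch for orientation: the sizes come from the text, while the class name `Machine` and the field names (`pc`, `dsp`, `rsp`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field

CELLS = 64 * 1024   # 64K cells in each memory
STACK_DEPTH = 256   # depth of both hardware stacks
NUM_REGS = 16       # general purpose register bank


def _zeros(n):
    """Return a factory that builds a zero-filled list of n cells."""
    return lambda: [0] * n


@dataclass
class Machine:
    # Separate instruction and data memories.
    code_ram: list = field(default_factory=_zeros(CELLS))    # 24-bit cells
    data_ram: list = field(default_factory=_zeros(CELLS))    # 16-bit cells
    # Dedicated stack and register memories, separate from main RAM.
    data_stack: list = field(default_factory=_zeros(STACK_DEPTH))
    return_stack: list = field(default_factory=_zeros(STACK_DEPTH))
    regs: list = field(default_factory=_zeros(NUM_REGS))
    pc: int = 0    # instruction address (hypothetical name)
    dsp: int = 0   # data stack pointer (hypothetical name)
    rsp: int = 0   # return stack pointer (hypothetical name)


m = Machine()
print(len(m.code_ram), len(m.data_stack), len(m.regs))  # 65536 256 16
```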

The instruction set includes commands to quickly move data between data RAM, the register bank and the top 2 locations on the data stack using at least immediate, direct and indexed memory access modes. Indexed, offset addressing modes are also desired. The TOS (accumulator) is typically used as the memory pointer and will have a very fast access time. Direct hardware support may be provided for the TOS to actually point to a pointer. Additional hardware may be included for automatic INCrement of the address register into data RAM.
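The access modes named above can be sketched as plain functions over a data RAM and a register bank. This is an illustrative model only; the function names are hypothetical, and the behavior of "TOS points to a pointer" is an assumed reading of the text (one extra level of indirection).

```python
MASK16 = 0xFFFF  # 16-bit data path and 16-bit addresses (64K cells)


def load_immediate(imm):
    """Immediate: the value comes from the instruction word itself."""
    return imm & MASK16


def load_direct(ram, addr):
    """Direct: the instruction carries the data RAM address."""
    return ram[addr & MASK16]


def load_indexed(ram, regs, reg, offset=0):
    """Indexed with offset: a register holds the base address."""
    return ram[(regs[reg] + offset) & MASK16]


def load_via_tos(ram, tos):
    """TOS used as the memory pointer."""
    return ram[tos & MASK16]


def load_via_tos_indirect(ram, tos):
    """Assumed meaning of 'TOS points to a pointer': two lookups."""
    return ram[ram[tos & MASK16] & MASK16]


ram = [0] * 65536
regs = [0] * 16
ram[0x0100] = 0x0200   # a pointer stored in data RAM
ram[0x0200] = 42       # the value it points to
regs[3] = 0x01FE       # base address for indexed access

print(load_direct(ram, 0x0200))            # 42
print(load_indexed(ram, regs, 3, 2))       # 42
print(load_via_tos_indirect(ram, 0x0100))  # 42
```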

These are the basic requirements for this design. Additional functions may be included as long as they do not violate the basic requirements.

=Functional Blocks=

Data Memory
This design implements a Harvard style memory architecture. The instruction memory is physically and logically separate from the data memory. That, along with other hardware support and a well-designed instruction set, allows for a more parallel processing stream. Furthermore, the memories used for the stacks, registers, etc. are also separate, dedicated memory blocks.

User I/O
=Design Issues=

Instruction Access
The simplest system has but a single, non-branching process thread, so it just executes a linear stream of instructions and stops when that's done. Such a system is hardly even heard of today and would strain the modern definition of a computer.

The most complex system has multiple, concurrent, pipelined processing streams (multitasking) along with several processor cores (multiprocessing). This requires significant additional hardware and 'software' to control the various pipelines, pre-fetch of instructions, designation of cores, maintaining multiple return stacks, and more. Such a design is far too complex to implement in this environment or, if it were implemented, would run far too slowly.

This design uses a single processor core without instruction pipelining. It has a single return stack in addition to several general-use registers which may be used as instruction address base (or offset) references. This allows for common branching, including subroutine calls. Task switching (multitasking) is also possible with this design but will likely never be utilized.
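To show how a single return stack supports subroutine calls without pipelining, here is a minimal sketch of a fetch/execute loop. The opcode names (`CALL`, `RET`, `JMP`, `EMIT`, `HALT`) and the tuple-based program format are invented for this example and are not the design's instruction set.

```python
RETURN_STACK_DEPTH = 256  # from the design model


def run(program, max_steps=1000):
    """Execute one instruction per step; no prefetch, no pipeline."""
    pc = 0
    return_stack = []
    trace = []
    for _ in range(max_steps):
        op, arg = program[pc]
        if op == "CALL":
            if len(return_stack) == RETURN_STACK_DEPTH:
                raise OverflowError("return stack overflow")
            return_stack.append(pc + 1)  # address of the next instruction
            pc = arg
        elif op == "RET":
            pc = return_stack.pop()
        elif op == "JMP":
            pc = arg
        elif op == "EMIT":               # stand-in for any ordinary instruction
            trace.append(arg)
            pc += 1
        elif op == "HALT":
            break
        else:
            raise ValueError("unknown opcode: " + op)
    return trace


program = [
    ("EMIT", 1),     # 0
    ("CALL", 4),     # 1: call the subroutine at address 4
    ("EMIT", 3),     # 2: execution resumes here after RET
    ("HALT", None),  # 3
    ("EMIT", 2),     # 4: subroutine body
    ("RET", None),   # 5
]
print(run(program))  # [1, 2, 3]
```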

Stacks
This design includes two hardware stacks - a "Return Stack" for storage of return addresses needed during subroutine execution and a "Data Stack" for storage of parameters needed for and between processes. Both stacks are separate from main memory and are tightly linked to the CPU core. Machine instructions are designed to utilize these stacks efficiently.

Each stack has a pointer that software can read but not write. The contents of the Return Stack are not directly accessible by the software, but machine instructions are available to swap items between the two stacks. The Data Stack is tightly linked to the ALU, since the ALU usually uses the top two items on this stack as its inputs. This stack may be directly read and written by the software. The top 3 items on the stack may be accessed directly without needing stack manipulation instructions. Variations of the machine instructions provide this flexibility.
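The stack properties described here (a read-only pointer, direct access to the top 3 items, and transfers between the two stacks) can be modeled roughly as follows. The class `HardwareStack` and the helpers `to_r` and `from_r` are hypothetical names for this sketch, not part of the design.

```python
class HardwareStack:
    """A fixed-depth stack whose pointer is readable but not writable."""

    def __init__(self, depth=256):
        self._cells = [0] * depth
        self._sp = 0  # number of items currently on the stack

    @property
    def pointer(self):
        # Read-only: there is deliberately no setter.
        return self._sp

    def push(self, value):
        if self._sp == len(self._cells):
            raise OverflowError("stack overflow")
        self._cells[self._sp] = value & 0xFFFF
        self._sp += 1

    def pop(self):
        if self._sp == 0:
            raise IndexError("stack underflow")
        self._sp -= 1
        return self._cells[self._sp]

    def peek(self, n=0):
        """Read one of the top 3 items directly (0 = TOS) without popping."""
        if not 0 <= n < min(3, self._sp):
            raise IndexError("only the top 3 live items are directly readable")
        return self._cells[self._sp - 1 - n]


def to_r(data, ret):
    """Move the top of the data stack to the return stack."""
    ret.push(data.pop())


def from_r(data, ret):
    """Move the top of the return stack back to the data stack."""
    data.push(ret.pop())


data, ret = HardwareStack(), HardwareStack()
for v in (1, 2, 3):
    data.push(v)
print(data.pointer, data.peek(0), data.peek(2))  # 3 3 1
to_r(data, ret)
print(data.pointer, ret.pointer)                 # 2 1
```

In this model only the data stack would expose `peek` to software; the return stack is reachable solely through the transfer helpers, mirroring the restriction in the text.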

Data Access
All data access will be performed via the Data Stack. This greatly simplifies the instruction set and the hardware. It gives a data flow similar to accumulator-based models. The top of the Data Stack (TOS) may be thought of as similar to the ACCumulator register. All computations are done with the TOS as an operand. The hardware has the ability to access the stack with PUSHes and POPs as well as direct READs and WRITEs (register transfers) along with direct INC and DEC.
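A short sketch of what 0-operand, TOS-centred computation looks like: binary operations consume the top two stack items and leave the result as the new TOS, while INC and DEC act on the TOS in place. The function names are hypothetical, and wrapping results to 16 bits is an assumption based on the 16-bit ALU.

```python
MASK16 = 0xFFFF


def alu_add(stack):
    """Pop two operands, push their 16-bit sum as the new TOS."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a + b) & MASK16)


def alu_sub(stack):
    """Pop two operands, push (second - top) as the new TOS."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a - b) & MASK16)


def inc_tos(stack):
    """Direct INC: modify the TOS in place, no push or pop."""
    stack[-1] = (stack[-1] + 1) & MASK16


def dec_tos(stack):
    """Direct DEC: modify the TOS in place, no push or pop."""
    stack[-1] = (stack[-1] - 1) & MASK16


stack = [0xFFFF, 2]   # TOS is the last element
alu_add(stack)
print(stack)          # [1]  (0xFFFF + 2 wraps around at 16 bits)
inc_tos(stack)
print(stack)          # [2]
```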

To assist in data stack manipulation, it is often helpful to save the TOS in a temporary location. In many stack machines the Return Stack is used for this. Doing so is valid as long as the item is removed from the Return Stack promptly and properly. This design allows for quick transfer of the TOS (which is a register) to the Return Stack or to any general purpose register, which will help support smooth code.
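As an example of why a temporary home for the TOS is useful, the sketch below swaps the second and third stack items by parking the TOS first, once on the return stack and once in a general purpose register. The function names and the choice of register 0 are assumptions for illustration.

```python
def swap_under_via_return_stack(data, ret):
    """( a b c -- b a c ) using the return stack as temporary storage."""
    ret.append(data.pop())               # park TOS on the return stack
    data[-1], data[-2] = data[-2], data[-1]
    data.append(ret.pop())               # restore it promptly, so the
                                         # return stack stays balanced


def swap_under_via_register(data, regs, reg=0):
    """Same effect, parking TOS in a general purpose register instead."""
    regs[reg] = data.pop()
    data[-1], data[-2] = data[-2], data[-1]
    data.append(regs[reg])


data, ret = [1, 2, 3], []
swap_under_via_return_stack(data, ret)
print(data, ret)   # [2, 1, 3] []

data, regs = [1, 2, 3], [0] * 16
swap_under_via_register(data, regs)
print(data)        # [2, 1, 3]
```

The register variant leaves the return stack untouched, which avoids any risk of a subroutine return popping the parked value instead of a return address.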