6502 FORTH, Part 2: NEXT, EXIT, and DOLIST
Published:
In my previous post, I set up my development environment for creating a Forth interpreter from scratch. The next step is to create the foundational Forth operators next
, exit
, and dolist
.
A Quick Background on Forth
Forth is an entirely stack-oriented language that uses functions called words. Each word executes a distinct set of instructions, and new words can be defined on the fly by programmers at runtime, either in assembly or in Forth (depending on the implementation). All data live in a data stack, and words operate by pushing or pulling values from this stack. Because of this, words operate in reverse Polish notation. As a quick example, to multiply the numbers 10 and 25 together, the internal code does the following:
#> (stack) ->
#> 25 (stack) -> 25
#> 10 (stack) -> 10 -> 25
#> * ...
* stack pops two values *
* stack multiplies values *
* stack pushes result *
ok. (stack) -> 250
#> CR . (stack) -> 250 -> 250
250 (stack) -> 250
We could have written this as a single line with 25 10 * CR .
. The last two commands here, CR .
, start output on a new line and print out the top value of the stack (resp.).
Internally, Forth keeps a dictionary of all the words it recognizes as a linked list, along with pointers to where the code for that routine is stores. When a word is given as input, the interpreter searches the dictionary for that label, then executes the associated code. This linked list is also implemented as a stack, so most recently added words are searched first. As a result, words can be easily redefined at runtime if desired.
The interpreter depends on a small set of core words to make all this execution happen, the most important of which are next
, exit
, and dolist
. dolist
begins the execution of a compiled word, next
is responsible for executing underlying instructions sequentially within a single word, and then exit
returns back to the previous execution context. For any word, the execution proceeds as follows:
(word called)
DOLIST -> NEXT -> EXIT -> NEXT -> ...
next
may spawn child processes that call dolist
as well. This process continues until we’re all out of instructions to execute, at which point the program terminates successfully and prints ok.
Since execution of any word depends on these three words, I’m going to implement them first.
Previous Definitions:
DOLIST
Every time dolist
is called, it has to execute the following instructions:
- Store the instruction pointer on the return stack pointer
- Set the instruction pointer to the new word to be executed
- Call
next
to start executing the word
These are all little-endian 16 bit addresses, so we have to store each byte separately.
NEXT
next
is called immediately after dolist
and exit
.
This .()
syntax was a little new for me coming from vasm
–the parentheses define a local scoping, so we can reuse the continue
label and the program will jump to the label defined in the most recent scope.
EXIT
The job of exit
is to undo all the intermediate pointer manipulation done by next
and dolist
in order to return the execution context to wherever it was when dolist
was initially called. This takes the form of the following steps:
- Remove the last address from the return stack
- Reset the instruction pointer to the value it previously was at
- Execute the next instruction
Next Steps
I’m still a little ways off from being able to test this even on the simulated 65c02, but it is progress! Next steps will include adding the minimal amount of features necessary to be able to test that this part of the code works correctly.