A BASIC token is a single-byte representation of a BASIC keyword. Whenever the user enters or edits a BASIC line, each keyword is replaced by its token; conversely, when the user LISTs the BASIC program, the tokens are displayed as the keywords they represent, in plain text. Thus, the BASIC programmer never needs to deal with these codes directly.
BASIC tokens are not to be confused with BASIC keyword abbreviations.
Tokens are distinguished from "plain" PETSCII characters by the fact that token codes are always greater than or equal to 128/$80; i.e. the most significant bit in a byte representing a token is always set. Commodore BASIC V2 has 68 keywords and 8 operators, which are assigned token codes in the range from 128/$80 through 203/$CB. Since the code 255/$FF is reserved for the "pi" character, this leaves 51 unused token codes in the range 204–254/$CC–$FE. Because the BASIC system makes ample use of vectors, third-party BASIC expansions can hook into it and use these spare codes for additional BASIC commands.
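The counts above can be checked with a short Python sketch. The names `is_token`, `FIRST_TOKEN`, `LAST_TOKEN` and `PI_CODE` are illustrative only and do not come from any Commodore tool. The bit test is meant for bytes outside quoted strings; inside quotes, a byte with bit 7 set is a literal character rather than a token.

```python
FIRST_TOKEN = 0x80  # 128, lowest token code
LAST_TOKEN = 0xCB   # 203, highest token code assigned in BASIC V2
PI_CODE = 0xFF      # 255, reserved for the pi character


def is_token(byte: int) -> bool:
    """Return True if bit 7 is set and the byte is not the pi code.

    Assumption: the byte comes from outside a quoted string.
    """
    return (byte & 0x80) != 0 and byte != PI_CODE


# 68 keywords + 8 operators occupy the codes 128..203.
assigned = LAST_TOKEN - FIRST_TOKEN + 1

# Codes 204..254 remain free for BASIC expansions.
unused = (PI_CODE - 1) - LAST_TOKEN

print(assigned, unused)  # prints: 76 51
```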
If you type NEW, and then enter the BASIC program line
10 PRINT "HELLO WORLD"
the BASIC program memory (starting from 2049/$0801) will contain:
|2049/$0801||21/$15 8/$08||Pointer to beginning of "next" BASIC line, in low-byte/high-byte order|
|2051/$0803||10/$0A 0/$00||BASIC line number "10", in low-byte/high-byte order|
|2053/$0805||153/$99||The token for the PRINT keyword|
|2054/$0806||32/$20 34/$22||SPACE and quote characters following PRINT|
|2056/$0808 ... 2066/$0812||72/$48 69/$45 76/$4C 76/$4C 79/$4F 32/$20 87/$57 79/$4F 82/$52 76/$4C 68/$44||PETSCII codes for the "HELLO WORLD" text|
|2067/$0813||34/$22||Quote at end of PRINTed text.|
|2068/$0814||0/$00||Zero-byte marking the end of the BASIC line.|
|2069/$0815||0/$00 0/$00||Two zero-bytes in place of the pointer to the next BASIC line indicate the end of the program.|
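The memory layout in the table above can be reproduced with a small Python sketch that builds the stored bytes and then follows the link pointer the way a listing routine would. The helper names `build_line` and `walk_lines` are hypothetical, and the sketch assumes that the ASCII codes Python uses for `"HELLO WORLD"` coincide with the PETSCII codes for these particular characters.

```python
import struct

LOAD_ADDRESS = 0x0801  # start of BASIC program memory (2049)
PRINT_TOKEN = 0x99     # token for PRINT (153)


def build_line(address, line_number, body):
    """Return the stored bytes of one BASIC line located at `address`."""
    # Link pointer (2 bytes) + line number (2 bytes) + body + zero byte.
    next_address = address + 2 + 2 + len(body) + 1
    return struct.pack("<HH", next_address, line_number) + body + b"\x00"


def walk_lines(image, load_address=LOAD_ADDRESS):
    """Yield (line_number, body) pairs by following the link pointers."""
    offset = 0
    while True:
        (next_address,) = struct.unpack_from("<H", image, offset)
        if next_address == 0:  # two zero bytes: end of program
            return
        (line_number,) = struct.unpack_from("<H", image, offset + 2)
        end = next_address - load_address - 1  # index of the zero byte
        yield line_number, image[offset + 4:end]
        offset = next_address - load_address


# PRINT token, SPACE, then the quoted text (ASCII == PETSCII here).
body = bytes([PRINT_TOKEN, 0x20]) + b'"HELLO WORLD"'
image = build_line(LOAD_ADDRESS, 10, body) + b"\x00\x00"

print(" ".join(f"{b:02X}" for b in image))
for number, stored in walk_lines(image):
    print(number, len(stored))
```

The first four bytes of `image` are `15 08 0A 00`: the pointer to $0815 and the line number 10, both in low-byte/high-byte order, matching the first two table rows.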
This token system saves space in memory and slightly speeds up execution. Because a tokenized program is shorter, it also takes less time to save to and load from tape or disk.
Token-to-keyword conversion table
The following table lists all the reserved keywords and symbols, according to their associated token code:
Notice that the tokens break down into four "groups", which mirror four of the "classes" of keywords/symbols:
- Commands are in the range 128–162/$80–$A2
- Various "bywords" that form part of the syntax of the keywords above fall in the range 163–169/$A3–$A9
- Arithmetic and logic operators have token codes 170–179/$AA–$B3
- Functions are in the range 180–202/$B4–$CA
GO (token code 203/$CB) breaks this "rule": it is actually handled as a command, despite its location "behind" the functions in the token system.
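The grouping can be illustrated with a Python sketch of a token-to-keyword lookup. The keyword list below is the commonly documented BASIC V2 token order, written out here as an assumption because this section does not spell it out; the names `keyword_for`, `token_group` and `detokenize` are hypothetical. The detokenizer is deliberately simplified: it treats bytes inside quotes as literal characters and maps every non-token byte with `chr()`, which is only correct for characters whose PETSCII and ASCII codes coincide.

```python
FIRST_TOKEN = 0x80

# Assumed BASIC V2 keyword order, index 0 corresponds to token $80.
KEYWORDS = [
    # Commands, $80-$A2
    "END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM", "READ",
    "LET", "GOTO", "RUN", "IF", "RESTORE", "GOSUB", "RETURN", "REM",
    "STOP", "ON", "WAIT", "LOAD", "SAVE", "VERIFY", "DEF", "POKE",
    "PRINT#", "PRINT", "CONT", "LIST", "CLR", "CMD", "SYS", "OPEN",
    "CLOSE", "GET", "NEW",
    # "Bywords", $A3-$A9
    "TAB(", "TO", "FN", "SPC(", "THEN", "NOT", "STEP",
    # Operators, $AA-$B3
    "+", "-", "*", "/", "^", "AND", "OR", ">", "=", "<",
    # Functions, $B4-$CA
    "SGN", "INT", "ABS", "USR", "FRE", "POS", "SQR", "RND",
    "LOG", "EXP", "COS", "SIN", "TAN", "ATN", "PEEK", "LEN",
    "STR$", "VAL", "ASC", "CHR$", "LEFT$", "RIGHT$", "MID$",
    # The exception, $CB
    "GO",
]
LAST_TOKEN = FIRST_TOKEN + len(KEYWORDS) - 1


def keyword_for(token):
    """Return the keyword for a token code in the range $80-$CB."""
    return KEYWORDS[token - FIRST_TOKEN]


def token_group(token):
    """Classify a token code into one of the four groups."""
    if token == 0xCB or 0x80 <= token <= 0xA2:
        return "command"  # GO is handled as a command
    if 0xA3 <= token <= 0xA9:
        return "byword"
    if 0xAA <= token <= 0xB3:
        return "operator"
    if 0xB4 <= token <= 0xCA:
        return "function"
    raise ValueError(f"not a BASIC V2 token: {token}")


def detokenize(body):
    """Expand tokens outside quotes; a simplified stand-in for LIST."""
    out = []
    in_quotes = False
    for byte in body:
        if byte == 0x22:  # a quote character toggles quote mode
            in_quotes = not in_quotes
        if not in_quotes and FIRST_TOKEN <= byte <= LAST_TOKEN:
            out.append(keyword_for(byte))
        else:
            out.append(chr(byte))
    return "".join(out)


print(detokenize(bytes([0x99, 0x20]) + b'"HELLO WORLD"'))
```

Applied to the stored body of the example line, `detokenize` returns `PRINT "HELLO WORLD"`.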