BASIC token

From C64-Wiki

A BASIC token is a single-byte representation of a BASIC keyword. Whenever the user creates or edits a BASIC line, any keywords are replaced by their respective tokens; conversely, when the user LISTs the BASIC program, the tokens are displayed as the keywords they represent, in "plain text". Thus, the BASIC programmer never needs to "worry" about these codes.

BASIC tokens are not to be confused with BASIC keyword abbreviations.

Tokens are distinguished from "plain" PETSCII characters by the fact that token codes are always greater than or equal to 128/$80; i.e. the most significant bit of a byte representing a token is always set. Commodore BASIC V2 has 68 keywords and 8 operators, which are assigned token codes in the range from 128/$80 through 203/$CB. Since the code 255/$FF is reserved for the "pi" character, this leaves 51 unused token codes in the range 204–254/$CC–$FE. Because the BASIC system makes ample use of vectors, third-party BASIC expansions may use these codes for additional BASIC commands.
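The ranges above can be sketched in a few lines of Python. This is only an illustration: the function name classify_byte and the category labels are made up for this example, and the classification applies only to bytes outside quoted strings, where codes of 128 and above are ordinary PETSCII characters.

```python
FIRST_TOKEN = 0x80  # 128, the first assigned token code
LAST_TOKEN = 0xCB   # 203, the last assigned token code
PI_CODE = 0xFF      # 255, reserved for the "pi" character


def classify_byte(value: int) -> str:
    """Classify a byte of tokenized BASIC text (hypothetical helper)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value does not fit in one byte")
    if value < FIRST_TOKEN:
        return "petscii"  # most significant bit clear
    if value <= LAST_TOKEN:
        return "token"    # assigned BASIC V2 token
    if value == PI_CODE:
        return "pi"
    return "unused"       # 204-254, available to BASIC expansions


assigned = [v for v in range(256) if classify_byte(v) == "token"]
unused = [v for v in range(256) if classify_byte(v) == "unused"]
print(len(assigned), len(unused))  # 76 51
```

The two counts match the figures in the text: 68 keywords plus 8 operators give 76 assigned codes, and 51 codes remain unused.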

Example

If you type NEW, and then enter the BASIC program line

10 PRINT "HELLO WORLD"

the BASIC program memory (starting from 2049/$0801) will contain:

Address                    Contents        Meaning
2049/$0801                 21/$15 8/$08    Pointer to the beginning of the "next" BASIC line, in low-byte/high-byte order
2051/$0803                 10/$0A 0/$00    BASIC line number "10", in low-byte/high-byte order
2053/$0805                 153/$99         The token for the PRINT keyword
2054/$0806                 32/$20 34/$22   SPACE and quote characters following PRINT
2056/$0808 ... 2066/$0812  (11 bytes)      PETSCII codes for the "HELLO WORLD" text
2067/$0813                 34/$22          Quote at the end of the PRINTed text
2068/$0814                 0/$00           Zero-byte marking the end of the BASIC line
2069/$0815                 0/$00 0/$00     Two zero-bytes in place of the pointer to the next BASIC line indicate the end of the program
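The layout above can be reproduced and walked with a short Python sketch. The helper names build_line and walk_lines are hypothetical, and the sketch assumes the text is typed in the default upper-case mode, where the PETSCII codes of the letters coincide with ASCII.

```python
import struct

LOAD_ADDRESS = 0x0801  # 2049, start of BASIC program memory
PRINT_TOKEN = 0x99     # 153, token for PRINT


def build_line(address: int, line_number: int, body: bytes) -> bytes:
    """Return the in-memory bytes of one BASIC line stored at `address`."""
    # 2-byte link + 2-byte line number + body + terminating zero-byte
    next_address = address + 4 + len(body) + 1
    return struct.pack("<HH", next_address, line_number) + body + b"\x00"


def walk_lines(image: bytes, load_address: int = LOAD_ADDRESS):
    """Yield (line_number, body) by following the "next line" pointers."""
    offset = 0
    while True:
        (link,) = struct.unpack_from("<H", image, offset)
        if link == 0:  # two zero-bytes: end of program
            return
        (number,) = struct.unpack_from("<H", image, offset + 2)
        end = link - load_address - 1  # index of the line's zero-byte
        yield number, image[offset + 4:end]
        offset = link - load_address


body = bytes([PRINT_TOKEN, 0x20]) + b'"HELLO WORLD"'
program = build_line(LOAD_ADDRESS, 10, body) + b"\x00\x00"
print(program.hex(" "))
```

The first two bytes of the result are $15 and $08, i.e. the pointer to 2069/$0815, where the two end-of-program zero-bytes are stored.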

This token system saves memory and slightly speeds up execution. Because the program is also stored in tokenized form on tape or disk, it takes up less space there and is saved and loaded slightly faster.

Token-to-keyword conversion table

The following table lists all the reserved keywords and symbols, according to their associated token code:

128/$80 END
129/$81 FOR
130/$82 NEXT
131/$83 DATA
132/$84 INPUT#
133/$85 INPUT
134/$86 DIM
135/$87 READ
136/$88 LET
137/$89 GOTO
138/$8A RUN
139/$8B IF
140/$8C RESTORE
141/$8D GOSUB
142/$8E RETURN
143/$8F REM
144/$90 STOP
145/$91 ON
146/$92 WAIT
147/$93 LOAD
148/$94 SAVE
149/$95 VERIFY
150/$96 DEF
151/$97 POKE
152/$98 PRINT#
153/$99 PRINT
154/$9A CONT
155/$9B LIST
156/$9C CLR
157/$9D CMD
158/$9E SYS
159/$9F OPEN
160/$A0 CLOSE
161/$A1 GET
162/$A2 NEW
163/$A3 TAB(
164/$A4 TO
165/$A5 FN
166/$A6 SPC(
167/$A7 THEN
168/$A8 NOT
169/$A9 STEP
170/$AA + (Addition)
171/$AB − (Subtraction)
172/$AC * (Multiplication)
173/$AD / (Division)
174/$AE ^ (Power)
175/$AF AND
176/$B0 OR
177/$B1 > (greater-than operator)
178/$B2 = (equals operator)
179/$B3 < (less-than operator)
180/$B4 SGN
181/$B5 INT
182/$B6 ABS
183/$B7 USR
184/$B8 FRE
185/$B9 POS
186/$BA SQR
187/$BB RND
188/$BC LOG
189/$BD EXP
190/$BE COS
191/$BF SIN
192/$C0 TAN
193/$C1 ATN
194/$C2 PEEK
195/$C3 LEN
196/$C4 STR$
197/$C5 VAL
198/$C6 ASC
199/$C7 CHR$
200/$C8 LEFT$
201/$C9 RIGHT$
202/$CA MID$
203/$CB GO
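As a rough illustration of how a LIST-style routine could use this table, the Python sketch below expands tokens back into keywords. It is not the ROM routine: the name detokenize is hypothetical, bytes inside quotes are left untouched, and all other bytes are decoded as if PETSCII were ASCII, which only holds for upper-case letters, digits and common punctuation.

```python
# Keywords in token order; index 0 corresponds to token code 128/$80.
KEYWORDS = (
    "END FOR NEXT DATA INPUT# INPUT DIM READ LET GOTO RUN IF RESTORE GOSUB "
    "RETURN REM STOP ON WAIT LOAD SAVE VERIFY DEF POKE PRINT# PRINT CONT LIST "
    "CLR CMD SYS OPEN CLOSE GET NEW TAB( TO FN SPC( THEN NOT STEP + - * / ^ "
    "AND OR > = < SGN INT ABS USR FRE POS SQR RND LOG EXP COS SIN TAN ATN "
    "PEEK LEN STR$ VAL ASC CHR$ LEFT$ RIGHT$ MID$ GO"
).split()

QUOTE = 0x22


def detokenize(body: bytes) -> str:
    """Expand the tokens in one line body (hypothetical helper)."""
    out = []
    in_quotes = False
    for value in body:
        if value == QUOTE:
            in_quotes = not in_quotes
        if not in_quotes and 0x80 <= value <= 0xCB:
            out.append(KEYWORDS[value - 0x80])
        else:
            out.append(chr(value))  # simplification: PETSCII treated as ASCII
    return "".join(out)


print(detokenize(bytes([0x99, 0x20]) + b'"HELLO WORLD"'))  # PRINT "HELLO WORLD"
```

Storing the keywords in token order means that the lookup is a simple subtraction of 128 from the token code.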

Notice that the tokens break down into four "groups", which mirror four of the "classes" of keywords/symbols:

  • Commands are in the range 128–162/$80–$A2
  • Various "bywords" that form part of the syntax of the keywords above fall in the range 163–169/$A3–$A9
  • Arithmetic and logic operators have token codes in the range 170–179/$AA–$B3
  • Functions are in the range 180–202/$B4–$CA

The system variables TIME, TIME$, and STATUS have no token codes of their own; in the BASIC ROM they are handled as exceptions in the routines for "normal" variables.

GO (token code 203/$CB) breaks the grouping "rule": it is actually handled as a command, despite its location "behind" the functions in the token table.
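The grouping, including the GO exception, can be summarized in a small Python sketch. The function name token_group and the group labels are chosen for this example only.

```python
def token_group(code: int) -> str:
    """Return the group a BASIC V2 token code belongs to (hypothetical helper)."""
    if 0x80 <= code <= 0xA2 or code == 0xCB:  # GO is treated as a command
        return "command"
    if 0xA3 <= code <= 0xA9:
        return "byword"
    if 0xAA <= code <= 0xB3:
        return "operator"
    if 0xB4 <= code <= 0xCA:
        return "function"
    raise ValueError(f"{code} is not an assigned BASIC V2 token code")


for code in (0x99, 0xA7, 0xAA, 0xC2, 0xCB):
    print(f"{code}/${code:02X}: {token_group(code)}")
```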