BASIC token

From C64-Wiki

A BASIC token is a single-byte representation of a BASIC keyword. Whenever the user creates or edits a BASIC line, each keyword is replaced by its token; conversely, when the user LISTs the BASIC program, the tokens are displayed as the keywords they represent, in "plain text". Tokens exist only in the program storage area in memory and are never shown to the user directly. Because each token is a single byte, the tokenized code requires less storage than the original text.

BASIC tokens are not to be confused with BASIC keyword abbreviations.

Tokens are distinguished from "plain" PETSCII characters by the fact that token codes are always greater than or equal to 128/$80; i.e. the most significant bit of a byte representing a token is always set. This allows the interpreter to easily distinguish between tokens and other text at runtime. Note that only keywords are tokenized in MS BASICs; variable names, numeric and string constants, line numbers that appear inside a line (for example after GOTO) and other items remain in their original text format. Tokenization takes place when a line is entered or modified: the user's text is first placed in a temporary buffer in memory, and a routine known as the "cruncher" copies it into the program storage area, replacing keywords with tokens as it goes.

Commodore BASIC V2 has 68 keywords and 8 operators, which are assigned the 76 token codes from 128/$80 through 203/$CB. Since the code 255/$FF is reserved for the "pi" character, this leaves 51 unused token codes in the range 204–254/$CC–$FE. Because the BASIC system makes ample use of vectors, third-party BASIC expansions can use these codes for additional BASIC commands.
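The counts in the paragraph above can be checked with a little arithmetic. The variable names below are made up for this sketch; the numbers come from the text.

```python
FIRST_TOKEN = 0x80            # 128, lowest token code
KEYWORDS, OPERATORS = 68, 8   # counts given for BASIC V2
PI_CODE = 0xFF                # 255, reserved for the "pi" character

# Tokens are assigned consecutively, so the last one is first + count - 1.
last_token = FIRST_TOKEN + KEYWORDS + OPERATORS - 1

# Codes after the last token and before the pi code are free.
unused = range(last_token + 1, PI_CODE)
```

With these values `last_token` is 203/$CB and `unused` covers the 51 codes from 204/$CC to 254/$FE.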

Example

If you type NEW, and then enter the BASIC program line

10 PRINT "HELLO WORLD"

the BASIC program memory (starting from 2049/$0801) will contain:

Location Data Description
2049/$0801 21/$15 8/$08 Pointer to beginning of "next" BASIC line, in low-byte/high-byte order
2051/$0803 10/$0A 0/$00 BASIC line number "10", in low-byte/high-byte order
2053/$0805 153/$99 The token for the PRINT keyword
2054/$0806 32/$20 34/$22 SPACE and quote characters following PRINT
2056/$0808 ... 2066/$0812 PETSCII codes for the "hello world" text
2067/$0813 34/$22 Quote at end of PRINTed text.
2068/$0814 0/$00 Zero-byte marking the end of the BASIC line.
2069/$0815 0/$00 0/$00 Two zero-bytes in place of the pointer to the next BASIC line indicate the end of the program.

This token system saves memory and slightly speeds up both program execution and saving the program to, or loading it from, tape or disk.
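The memory layout in the example above can be reproduced with a short Python sketch. The helper name `line_image` is invented for this illustration, and ASCII is again assumed to match PETSCII for the characters involved.

```python
import struct

BASIC_START = 0x0801  # 2049, start of BASIC program memory per the example


def line_image(address: int, line_number: int, body: bytes) -> bytes:
    """Return the bytes of one stored BASIC line located at `address`."""
    # Layout: next-line pointer (2 bytes), line number (2 bytes),
    # tokenized body, and a zero byte that ends the line.
    next_address = address + 2 + 2 + len(body) + 1
    # "<HH" packs two 16-bit values in low-byte/high-byte order.
    return struct.pack("<HH", next_address, line_number) + body + b"\x00"


body = bytes([0x99]) + b' "HELLO WORLD"'    # PRINT token, then plain text
program = line_image(BASIC_START, 10, body) + b"\x00\x00"  # end of program

# Print each byte with its address, in the article's decimal/hex notation.
for offset, value in enumerate(program):
    address = BASIC_START + offset
    print(f"{address}/${address:04X}  {value}/${value:02X}")
```

The first two bytes come out as 21/$15 and 8/$08, the pointer to 2069/$0815, where the two closing zero bytes begin.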

Token-to-keyword conversion table

BASIC 2.0

The following table lists all the reserved keywords and symbols, according to their associated token code:

128/$80 END
129/$81 FOR
130/$82 NEXT
131/$83 DATA
132/$84 INPUT#
133/$85 INPUT
134/$86 DIM
135/$87 READ
136/$88 LET
137/$89 GOTO
138/$8A RUN
139/$8B IF
140/$8C RESTORE
141/$8D GOSUB
142/$8E RETURN
143/$8F REM
144/$90 STOP
145/$91 ON
146/$92 WAIT
147/$93 LOAD
148/$94 SAVE
149/$95 VERIFY
150/$96 DEF
151/$97 POKE
152/$98 PRINT#
153/$99 PRINT
154/$9A CONT
155/$9B LIST
156/$9C CLR
157/$9D CMD
158/$9E SYS
159/$9F OPEN
160/$A0 CLOSE
161/$A1 GET
162/$A2 NEW
163/$A3 TAB(
164/$A4 TO
165/$A5 FN
166/$A6 SPC(
167/$A7 THEN
168/$A8 NOT
169/$A9 STEP
170/$AA + (Addition)
171/$AB − (Subtraction)
172/$AC * (Multiplication)
173/$AD / (Division)
174/$AE ↑ (Power)
175/$AF AND
176/$B0 OR
177/$B1 > (greater-than operator)
178/$B2 = (equals operator)
179/$B3 < (less-than operator)
180/$B4 SGN
181/$B5 INT
182/$B6 ABS
183/$B7 USR
184/$B8 FRE
185/$B9 POS
186/$BA SQR
187/$BB RND
188/$BC LOG
189/$BD EXP
190/$BE COS
191/$BF SIN
192/$C0 TAN
193/$C1 ATN
194/$C2 PEEK
195/$C3 LEN
196/$C4 STR$
197/$C5 VAL
198/$C6 ASC
199/$C7 CHR$
200/$C8 LEFT$
201/$C9 RIGHT$
202/$CA MID$
203/$CB GO
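Because the codes in the table above are consecutive, the reverse conversion that LIST performs can be sketched with a simple index lookup. This is a simplified illustration, not the ROM's LIST routine: `KEYWORDS` and `detokenize` are names chosen here, `^` stands in for the up-arrow, `-` for the minus sign, and bytes are mapped to characters as if PETSCII were ASCII.

```python
# All 76 BASIC 2.0 keywords in token order; index 0 is token 128/$80.
KEYWORDS = (
    "END FOR NEXT DATA INPUT# INPUT DIM READ LET GOTO RUN IF RESTORE GOSUB "
    "RETURN REM STOP ON WAIT LOAD SAVE VERIFY DEF POKE PRINT# PRINT CONT "
    "LIST CLR CMD SYS OPEN CLOSE GET NEW TAB( TO FN SPC( THEN NOT STEP "
    "+ - * / ^ AND OR > = < SGN INT ABS USR FRE POS SQR RND LOG EXP COS "
    "SIN TAN ATN PEEK LEN STR$ VAL ASC CHR$ LEFT$ RIGHT$ MID$ GO"
).split()


def detokenize(body: bytes) -> str:
    """Expand token bytes outside quotes back into keyword text."""
    out = []
    in_quotes = False
    for b in body:
        if b == 0x22:                      # quote character toggles the mode
            in_quotes = not in_quotes
        if not in_quotes and 0x80 <= b < 0x80 + len(KEYWORDS):
            out.append(KEYWORDS[b - 0x80])
        else:
            out.append(chr(b))             # assumption: treat as plain ASCII
    return "".join(out)


listing = detokenize(bytes([0x99]) + b' "HELLO WORLD"')
```

For the line body from the example, `listing` is the original text `PRINT "HELLO WORLD"`.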

Notice that the tokens break down into four "groups", which mirror four of the "classes" of keywords/symbols:

  • Commands are in the range 128–162/$80–$A2
  • Various "bywords" that form part of the syntax of the keywords above fall in the range 163–169/$A3–$A9
  • Arithmetic and logic operators have token codes 170–179/$AA–$B3
  • Functions are in the range 180–202/$B4–$CA

The system variables TIME, TIME$ and STATUS have no tokens; the BASIC ROM handles them as exceptions in the routines for "normal" variables.

GO (token code 203/$CB) breaks the grouping "rule" above: it is handled as a command, despite its location "behind" the functions in the token system.
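The four ranges and the GO exception can be expressed as a small lookup. The function name `token_class` and the class labels are chosen for this sketch only.

```python
from typing import Optional


def token_class(code: int) -> Optional[str]:
    """Return the class of a BASIC 2.0 token code, or None if it is not one."""
    if code == 0xCB:                  # GO sits after the functions...
        return "command"              # ...but is handled as a command
    if 0x80 <= code <= 0xA2:
        return "command"
    if 0xA3 <= code <= 0xA9:
        return "syntax word"          # TAB(, TO, FN, SPC(, THEN, NOT, STEP
    if 0xAA <= code <= 0xB3:
        return "operator"
    if 0xB4 <= code <= 0xCA:
        return "function"
    return None                       # plain text, unused code or pi


classes = [token_class(c) for c in (0x99, 0xA7, 0xAA, 0xC2, 0xCB, 0x41)]
```

The sample codes are PRINT, THEN, +, PEEK, GO and the letter A, so `classes` ends with `"command"` for GO and `None` for the plain character.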

Other Commodore BASICs

Other Commodore BASICs define additional tokens. BASIC 3.5, found in the Plus/4 and Commodore 16, and BASIC 7.0, found in the Commodore 128, fill most of the available space, and the two recognize many of the same keywords. BASIC 7.0 repurposes token 206/$CE: instead of standing for RLUM, it serves as a shift code that introduces several two-byte tokens. BASIC 7.0 also uses code 254/$FE, unused in BASIC 3.5, as a second shift code for more two-byte tokens.

BASIC 4.0, found in some PETs, provides some additional keywords, some of which occur also in BASIC 3.5 and 7.0.

Token code BASIC 3.5 BASIC 7.0 BASIC 4.0
204/$CC RGR RGR CONCAT
205/$CD RCLR RCLR DOPEN
206/$CE RLUM shift DCLOSE
206/$CE 2/$02 none POT none
206/$CE 3/$03 none BUMP none
206/$CE 4/$04 none PEN none
206/$CE 5/$05 none RSPPOS none
206/$CE 6/$06 none RSPRITE none
206/$CE 7/$07 none RSPCOLOR none
206/$CE 8/$08 none XOR none
206/$CE 9/$09 none RWINDOW none
206/$CE 10/$0A none POINTER none
207/$CF JOY JOY RECORD
208/$D0 RDOT RDOT HEADER
209/$D1 DEC DEC COLLECT
210/$D2 HEX$ HEX$ BACKUP
211/$D3 ERR$ ERR$ COPY
212/$D4 INSTR INSTR APPEND
213/$D5 ELSE ELSE DSAVE
214/$D6 RESUME RESUME DLOAD
215/$D7 TRAP TRAP CATALOG
216/$D8 TRON TRON RENAME
217/$D9 TROFF TROFF SCRATCH
218/$DA SOUND SOUND DIRECTORY
219/$DB VOL VOL none
220/$DC AUTO AUTO none
221/$DD PUDEF PUDEF none
222/$DE GRAPHIC GRAPHIC none
223/$DF PAINT PAINT none
224/$E0 CHAR CHAR none
225/$E1 BOX BOX none
226/$E2 CIRCLE CIRCLE none
227/$E3 GSHAPE GSHAPE none
228/$E4 SSHAPE SSHAPE none
229/$E5 DRAW DRAW none
230/$E6 LOCATE LOCATE none
231/$E7 COLOR COLOR none
232/$E8 SCNCLR SCNCLR none
233/$E9 SCALE SCALE none
234/$EA HELP HELP none
235/$EB DO DO none
236/$EC LOOP LOOP none
237/$ED EXIT EXIT none
238/$EE DIRECTORY DIRECTORY none
239/$EF DSAVE DSAVE none
240/$F0 DLOAD DLOAD none
241/$F1 HEADER HEADER none
242/$F2 SCRATCH SCRATCH none
243/$F3 COLLECT COLLECT none
244/$F4 COPY COPY none
245/$F5 RENAME RENAME none
246/$F6 BACKUP BACKUP none
247/$F7 DELETE DELETE none
248/$F8 RENUMBER RENUMBER none
249/$F9 KEY KEY none
250/$FA MONITOR MONITOR none
251/$FB USING USING none
252/$FC UNTIL UNTIL none
253/$FD WHILE WHILE none
254/$FE none shift none
254/$FE 2/$02 none BANK none
254/$FE 3/$03 none FILTER none
254/$FE 4/$04 none PLAY none
254/$FE 5/$05 none TEMPO none
254/$FE 6/$06 none MOVSPR none
254/$FE 7/$07 none SPRITE none
254/$FE 8/$08 none SPRCOLOR none
254/$FE 9/$09 none RREG none
254/$FE 10/$0A none ENVELOPE none
254/$FE 11/$0B none SLEEP none
254/$FE 12/$0C none CATALOG none
254/$FE 13/$0D none DOPEN none
254/$FE 14/$0E none APPEND none
254/$FE 15/$0F none DCLOSE none
254/$FE 16/$10 none BSAVE none
254/$FE 17/$11 none BLOAD none
254/$FE 18/$12 none RECORD none
254/$FE 19/$13 none CONCAT none
254/$FE 20/$14 none DVERIFY none
254/$FE 21/$15 none DCLEAR none
254/$FE 22/$16 none SPRSAV none
254/$FE 23/$17 none COLLISION none
254/$FE 24/$18 none BEGIN none
254/$FE 25/$19 none BEND none
254/$FE 26/$1A none WINDOW none
254/$FE 27/$1B none BOOT none
254/$FE 28/$1C none WIDTH none
254/$FE 29/$1D none SPRDEF none
254/$FE 30/$1E none QUIT none
254/$FE 31/$1F none STASH none
254/$FE 32/$20 none none none
254/$FE 33/$21 none FETCH none
254/$FE 34/$22 none none none
254/$FE 35/$23 none SWAP none
254/$FE 36/$24 none OFF none
254/$FE 37/$25 none FAST none
254/$FE 38/$26 none SLOW none
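Reading a BASIC 7.0 token therefore means checking for the two shift codes first. The sketch below shows one way to do that; the function name `read_token` and the three small dictionaries are invented for this example and hold only a few entries copied from the table above.

```python
# A few BASIC 7.0 entries from the table; the real tables are much longer.
SINGLE = {0xCC: "RGR", 0xD5: "ELSE", 0xEB: "DO"}
AFTER_CE = {0x02: "POT", 0x03: "BUMP", 0x08: "XOR"}
AFTER_FE = {0x02: "BANK", 0x0B: "SLEEP", 0x26: "SLOW"}


def read_token(data: bytes, pos: int):
    """Return (keyword or None, bytes consumed) for the token at data[pos]."""
    first = data[pos]
    if first == 0xCE:                         # shift code: second byte selects
        return AFTER_CE.get(data[pos + 1]), 2
    if first == 0xFE:                         # second shift code
        return AFTER_FE.get(data[pos + 1]), 2
    return SINGLE.get(first), 1               # ordinary one-byte token


stream = bytes([0xEB, 0xFE, 0x0B, 0xCE, 0x08])
decoded = []
pos = 0
while pos < len(stream):
    keyword, size = read_token(stream, pos)
    decoded.append(keyword)
    pos += size
```

The five bytes in `stream` decode to three keywords: DO, SLEEP and XOR.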

All of the additional keywords in BASIC 4.0 also occur in BASIC 7.0, but with different token codes. Some also occur in BASIC 3.5.

Keyword BASIC 4.0 BASIC 3.5 BASIC 7.0
CONCAT 204/$CC none 254/$FE 19/$13
DOPEN 205/$CD none 254/$FE 13/$0D
DCLOSE 206/$CE none 254/$FE 15/$0F
RECORD 207/$CF none 254/$FE 18/$12
HEADER 208/$D0 241/$F1 241/$F1
COLLECT 209/$D1 243/$F3 243/$F3
BACKUP 210/$D2 246/$F6 246/$F6
COPY 211/$D3 244/$F4 244/$F4
APPEND 212/$D4 none 254/$FE 14/$0E
DSAVE 213/$D5 239/$EF 239/$EF
DLOAD 214/$D6 240/$F0 240/$F0
CATALOG 215/$D7 none 254/$FE 12/$0C
RENAME 216/$D8 245/$F5 245/$F5
SCRATCH 217/$D9 242/$F2 242/$F2
DIRECTORY 218/$DA 238/$EE 238/$EE
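Since the same keywords carry different codes, a tokenized BASIC 4.0 line would need its disk-command tokens rewritten before BASIC 7.0 could interpret it. The following is a hypothetical sketch of such a rewrite, using the codes from the table above; the names `B40_TO_B70` and `translate_line` are made up here, and the sketch assumes that all other bytes, including the tokens 128–203, can be copied unchanged.

```python
# BASIC 4.0 token -> equivalent BASIC 7.0 token bytes.
B40_TO_B70 = {
    0xCC: b"\xfe\x13",  # CONCAT
    0xCD: b"\xfe\x0d",  # DOPEN
    0xCE: b"\xfe\x0f",  # DCLOSE
    0xCF: b"\xfe\x12",  # RECORD
    0xD0: b"\xf1",      # HEADER
    0xD1: b"\xf3",      # COLLECT
    0xD2: b"\xf6",      # BACKUP
    0xD3: b"\xf4",      # COPY
    0xD4: b"\xfe\x0e",  # APPEND
    0xD5: b"\xef",      # DSAVE
    0xD6: b"\xf0",      # DLOAD
    0xD7: b"\xfe\x0c",  # CATALOG
    0xD8: b"\xf5",      # RENAME
    0xD9: b"\xf2",      # SCRATCH
    0xDA: b"\xee",      # DIRECTORY
}


def translate_line(body: bytes) -> bytes:
    """Rewrite BASIC 4.0 disk-command tokens in one line body for BASIC 7.0."""
    out = bytearray()
    in_quotes = False
    for b in body:
        if b == 0x22:                 # quote character toggles literal mode
            in_quotes = not in_quotes
        if in_quotes:
            out.append(b)             # bytes inside strings are not tokens
        else:
            out += B40_TO_B70.get(b, bytes([b]))
    return bytes(out)


converted = translate_line(bytes([0xD6]) + b'"GAME",D0')
```

In `converted` the leading DLOAD token changes from 214/$D6 to 240/$F0, while the quoted file name and the text `,D0` are copied as they are.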