Datassette Encoding

From C64-Wiki

Bit Encoding

Encoding of the bit values and markers (NTSC)

Many home computers of the era around 1980 used FSK (frequency-shift keying) to encode "1" and "0" on tape, with a different frequency for each, e.g. the Kansas City standard.[1] The Commodore Datassette instead uses pulse-length encoding, which "The PET Revealed" calls superior to FSK.

Three kinds of pulses are used:

  • a short 176 µs pulse (2840 Hz)
  • a medium 256 µs pulse (1953 Hz)
  • a long 336 µs pulse (1488 Hz)

Each pulse is recorded as one full period of the respective frequency, so a short pulse is 176 µs of HIGH and 176 µs of LOW level, and so on.

The bit values and the required markers are encoded with combinations of two of these pulse periods.

  • The bit value "0" is encoded as a short pulse period followed by a medium pulse period
  • The bit value "1" is encoded as a medium pulse period followed by a short pulse period
  • The byte marker is encoded as a long pulse period followed by a medium pulse period
  • The end-of-data marker is encoded as a long pulse period followed by a short pulse period
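The pairing rules above can be written down as a small lookup table. The following is a minimal sketch using the NTSC timings; the names `PULSE_US`, `SYMBOLS`, `symbol_levels` and `symbol_duration_us` are made up for illustration, and starting each period with the HIGH half is an assumption based on the description of the pulse shape.

```python
# NTSC half-period lengths in microseconds, taken from the list above.
PULSE_US = {"S": 176, "M": 256, "L": 336}

# Each symbol consists of two full pulse periods (first, second).
SYMBOLS = {
    "0": ("S", "M"),
    "1": ("M", "S"),
    "byte_marker": ("L", "M"),
    "end_of_data": ("L", "S"),
}

def symbol_levels(symbol):
    """Return a list of (level, duration_us) segments for one symbol."""
    segments = []
    for pulse in SYMBOLS[symbol]:
        half = PULSE_US[pulse]
        segments.append((1, half))  # HIGH half of the period (assumed first)
        segments.append((0, half))  # LOW half of the period
    return segments

def symbol_duration_us(symbol):
    """Total recorded length of one symbol in microseconds."""
    return sum(duration for _, duration in symbol_levels(symbol))
```

With these timings both bit values take the same time on tape (864 µs), so the length of a recorded byte does not depend on its content.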

The pulse durations differ between NTSC and PAL machines. Since the NTSC clock is approximately 3.8 % faster (1.0227273 MHz for NTSC vs. 0.9852486 MHz for PAL), the PAL pulses are 3.8 % longer. Exchanging cassettes between NTSC and PAL C64s is nevertheless no problem: the speed of the tape motor varies as well, so synchronization algorithms are built into the data decoding.

The corresponding PAL pulse lengths and frequencies are:

  • a short 182.7 µs pulse (2737 Hz)
  • a medium 265.7 µs pulse (1882 Hz)
  • a long 348.8 µs pulse (1434 Hz)

The calculated duration of a byte marker (one long plus one medium pulse period) is 1184 µs on an NTSC machine and 1229 µs on a PAL system.
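The PAL figures follow from the NTSC figures and the ratio of the two system clocks. A short sketch of that arithmetic, with the constant and function names chosen here purely for illustration:

```python
# System clock frequencies quoted in the text above.
NTSC_CLOCK_HZ = 1_022_727.3
PAL_CLOCK_HZ = 985_248.6

# NTSC half-period lengths in microseconds.
NTSC_PULSE_US = {"short": 176.0, "medium": 256.0, "long": 336.0}

def pal_pulse_us(ntsc_us):
    """Scale an NTSC duration by the clock ratio to get the PAL duration."""
    return ntsc_us * NTSC_CLOCK_HZ / PAL_CLOCK_HZ

def frequency_hz(half_period_us):
    """Frequency of a full period made of two equal half-periods."""
    return 1_000_000 / (2 * half_period_us)

def byte_marker_us(pulses):
    """A byte marker is one long period followed by one medium period."""
    return 2 * (pulses["long"] + pulses["medium"])

PAL_PULSE_US = {name: pal_pulse_us(us) for name, us in NTSC_PULSE_US.items()}
```

Rounding `PAL_PULSE_US` to one decimal place gives the PAL pulse lengths listed above, and `byte_marker_us` gives the two byte marker durations.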

The read signal is inverted relative to the write signal. While the pulse width is measured between the rising edges for the write signal, it is measured between the falling edges for the read signal.

Byte Encoding

Datassette byte encoding

The byte marker indicates the start of a byte. The bits are recorded with the least significant bit (LSB) first and a parity bit (odd parity) follows the most significant bit (MSB).

"Odd parity" means that the number of "1s" in the data bits plus the parity bit is odd. E.g., for 00100010 (even number of 1s) the parity bit would be 1, for 00000111 (odd number of 1s) the parity bit would be 0. This is generated by sequentially XORing a "1" and all 8 payload bits.

Each byte has a recorded duration of 8.96 ms on an NTSC machine: a 1184 µs byte marker plus nine bits (eight data bits and the parity bit) of 864 µs each.

Data Block Encoding
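The byte framing and the parity rule can be sketched as follows. This is an illustrative sketch with NTSC timings only; the function names and the `"marker"` label are assumptions, not identifiers from the Kernal.

```python
# NTSC durations in microseconds (two full pulse periods per symbol).
BIT_US = 2 * (176 + 256)     # short + medium, in either order
MARKER_US = 2 * (336 + 256)  # long + medium

def odd_parity_bit(byte):
    """XOR a 1 with all eight data bits, as described above."""
    parity = 1
    for i in range(8):
        parity ^= (byte >> i) & 1
    return parity

def byte_symbols(byte):
    """Byte marker, eight data bits LSB first, then the parity bit."""
    bits = [(byte >> i) & 1 for i in range(8)]
    return ["marker"] + [str(b) for b in bits] + [str(odd_parity_bit(byte))]

def byte_duration_us():
    """Recorded length of one byte: marker plus nine bit symbols."""
    return MARKER_US + 9 * BIT_US
```

For the two examples in the text, `odd_parity_bit(0b00100010)` yields 1 and `odd_parity_bit(0b00000111)` yields 0.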

Structure of a file and the data blocks

The 192 bytes of payload are stored twice in one block. Besides the checksum, this redundancy helps to detect data integrity problems caused by audio dropouts.

Each data block starts with a synchronization leader of short pulses (2840 Hz for NTSC), lasting 10 seconds for the first block and 2 seconds for every other block. The leader gives the tape motor time to reach the correct speed. During this time the Kernal also calculates a speed correction factor, since tape speed may vary between motors. This is why, despite the different PAL and NTSC clock frequencies, tapes can be swapped between systems without issue.
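To get a feeling for the size of the leader, the number of short pulse periods can be estimated from its length. This is only a rough sketch: it treats the 10 and 2 second values as exact and uses the nominal NTSC short pulse; `leader_periods` is a name made up for this example.

```python
SHORT_PERIOD_US = 2 * 176  # one full short pulse period on NTSC

def leader_periods(seconds):
    """Approximate number of whole short pulse periods in a leader."""
    return (seconds * 1_000_000) // SHORT_PERIOD_US
```

Under these assumptions the 10 second leader holds roughly 28,400 short pulse periods and the 2 second leader roughly 5,700.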

Each block of payload data is preceded by a countdown byte sequence. For the first copy of the payload data the countdown has the MSB set and counts from $89 down to $81; for the second copy the MSB is cleared and the countdown runs from $09 down to $01.
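The two countdown sequences differ only in the top bit, which a short sketch makes explicit (`countdown_bytes` is a hypothetical helper name):

```python
def countdown_bytes(first_copy):
    """Countdown 9..1, with the MSB set for the first copy of the payload."""
    flag = 0x80 if first_copy else 0x00
    return bytes(flag | n for n in range(9, 0, -1))
```

For example, `countdown_bytes(True)` starts with $89 and ends with $81, while `countdown_bytes(False)` runs from $09 to $01.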

Each data block is followed by a one-byte checksum. It is calculated by sequentially XORing $00 and all payload bytes.
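The checksum rule translates directly into a running XOR. A minimal sketch, with `checksum` and `verify` as illustrative names:

```python
def checksum(payload):
    """XOR $00 with every payload byte in sequence."""
    result = 0x00
    for value in payload:
        result ^= value
    return result

def verify(payload, recorded_checksum):
    """True if the payload matches the checksum byte read from tape."""
    return checksum(payload) == recorded_checksum
```

Because XOR is its own inverse, XORing the payload together with its recorded checksum gives $00 for an undamaged block.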

The inter-record gaps start with a long pulse period, followed by 60 short pulse periods (2840 Hz).

The end-of-data marker is an optional symbol, marking the last data block.

Header Block

The header block is the first data block in a file and is exactly 192 bytes in length. The header payload consists of the file type, the start and end address (used for certain header types) and the file name.

Byte       Length   Content
1          1        Header Type
2          1        Start address (low byte)
3          1        Start address (high byte)
4          1        End address (low byte)
5          1        End address (high byte)
6 - 21     16       Filename, displayed in the FOUND message
22 - 192   171      Filename, not displayed in the FOUND message

If the file name is shorter than 16 characters, it is padded with spaces (ASCII: $20). Bytes 22 - 192 are usually filled with $20 as well. If a file name longer than 16 bytes is used, only the first 16 characters are displayed in the FOUND message; the remaining characters are still valid and can be accessed with the PEEK instruction.
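The header layout can be illustrated by assembling the 192 bytes in code. This is a sketch under assumptions: `build_header` and `displayed_name` are made-up names, the file name is passed as raw bytes already in the machine's character encoding, and Python's 0-based indices are one lower than the 1-based byte numbers in the table.

```python
HEADER_SIZE = 192
NAME_OFFSET = 5       # table bytes 6..192 hold the file name
DISPLAYED_CHARS = 16  # table bytes 6..21 appear in the FOUND message

def build_header(header_type, start, end, name):
    """Assemble a header payload: type, addresses (low/high), padded name."""
    if not (0 <= start <= 0xFFFF and 0 <= end <= 0xFFFF):
        raise ValueError("addresses must fit into 16 bits")
    if len(name) > HEADER_SIZE - NAME_OFFSET:
        raise ValueError("file name too long for the header block")
    header = bytearray([header_type,
                        start & 0xFF, start >> 8,
                        end & 0xFF, end >> 8])
    header += name.ljust(HEADER_SIZE - NAME_OFFSET, b"\x20")  # pad with $20
    return bytes(header)

def displayed_name(header):
    """The part of the file name shown in the FOUND message."""
    return header[NAME_OFFSET:NAME_OFFSET + DISPLAYED_CHARS]
```

A call such as `build_header(0x01, 0x0801, 0x0900, b"HELLO")` produces a 192-byte payload whose displayed name is `HELLO` followed by eleven spaces; the addresses here are example values only.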

Header Type

Value Header Type
$01 relocatable (BASIC) program
$02 data block for ASCII/sequential file
$03 non-relocatable program (usually machine language)
$04 ASCII file header
$05 End-of-tape marker (EOT)

Header Type $01

This header type denotes relocatable programs. In general, these are BASIC programs, which do not need to be located at specific addresses. These programs are loaded at the start of BASIC RAM.

Header Type $02

This type denotes a data block of a sequential (ASCII) file. Bytes 2-192 (191 bytes) contain the payload data. This block does not make use of the start and end address.

Header Type $03

This type denotes non-relocatable programs, usually machine language programs or programs with a machine language section. They must be loaded to a specific start address.

Header Type $04

This type denotes the header of an ASCII file. Besides the header type, the payload of this block contains the file name.

Header Type $05

This type denotes an End-of-tape block. If the EOT is reached before a header with the desired file name has been found, a ?DEVICE NOT PRESENT ERROR is reported.

Header Type and Secondary Address

The secondary address used when a file is saved or opened for writing influences the header type that is recorded on tape.

Programs

Loading

LOAD"Name",1     Loads a program of header type $01 at the start address of BASIC memory. Type $03 will be loaded to the recorded start address.
LOAD"Name",1,1   Always loads a program to the recorded start address.

Saving

SAVE"Name",1     Saves a (BASIC) program with header type $01
SAVE"Name",1,1   Saves a (machine language) program with header type $03
SAVE"Name",1,2   Saves a (BASIC) program with header type $01 with an additional EOT block
SAVE"Name",1,3   Saves a (machine language) program with header type $03 with an additional EOT block
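The four SAVE variants can be read as two independent flags in the secondary address. The sketch below encodes that reading of the table; treating bit 0 and bit 1 as flags is an interpretation of the four listed cases and is not confirmed for other values, and the function names are invented for illustration.

```python
def save_header_type(secondary_address):
    """Header type recorded by SAVE: $03 if bit 0 is set, otherwise $01."""
    return 0x03 if secondary_address & 1 else 0x01

def writes_eot_block(secondary_address):
    """Whether SAVE appends an EOT block: bit 1 of the secondary address."""
    return bool(secondary_address & 2)
```

For instance, secondary address 3 gives header type $03 together with an additional EOT block, matching the last row of the table.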

Opening an ASCII file

OPEN1,1,0,"Name" Opens a file for reading
OPEN1,1,1,"Name" Opens a file for writing
OPEN1,1,2,"Name" Opens a file for writing with an additional EOT block

References

  1. Nick Hampshire: "The PET Revealed", 1980, pp. 135-142