Linked lists

From C64-Wiki
Jump to: navigation, search

Records[edit]

In order to understand the concept of linked lists it is necessary to understand the concept of a data structure or record as it is known in Pascal.

A record is a special user-defined data type that contains other types of data. For example, a record of a person might contain strings to store his name and phone number and an integer or real number to store his age. If a variable called "stu" was of this data type then then elements of this record could be accessed by using "stu.name", "stu.age" and "stu.phone". By defining an array of such records, one could build up a database of people.

C64 BASIC has no provision for defining arrays of records like this. However, a similar effect can be achieved by setting up "parallel arrays". For example:

10 DIM NM$(100):DIM PH$(100):DIM AG(100)

A single index number could be used to access all three particulars of an individual.

Definition[edit]

A linked list is a subset of of a type of data structure known as a tree which is basically a set of interconnected nodes.

A node is a data structure or record as defined above which has an additional element: a link or pointer to another node. This is a variable that contains the memory address of another node. If there is no other node to point to then this variable usually contains a zero to indicate a null node. This article deals only with singly-linked lists. The nodes in these lists have only one link pointer. However the principles are the same as those for nodes that have more than one link pointer (multiply-linked lists and trees etc).

Advantages and Disadvantages[edit]

An advantage that linked lists have over arrays is that in a linked list, nodes can be added or removed from the list by simply changing the link pointers. There is no need to shift the other nodes to make space for the new node or fill in the gap left by the removed node. For this reason, C64 BASIC uses a linked list to store a program in memory. Each line of BASIC code includes two bytes that contain the address of the next line to be read and executed.

The disadvantage with linked lists is that unlike arrays, there is no way to directly access a node in the middle (or at the end) of the list. The program needs to start at the first node (also known as the "root" node) and follow the links until the desired node is reached. This is also a problem with C64 BASIC. Any time the program encounters a GOTO or GOSUB statement, it needs to start at the beginning of the program and search through the links until it finds the desired line number. One way to speed up a program in C64 BASIC is to ensure that the destination line number of every GOTO or GOSUB statement is as low as possible.

Implementation in BASIC[edit]

Since C64 BASIC has no provision for records or variable addresses, the way to implement linked lists in BASIC is to set up a parallel array of integers to store the index number of the next record.

Linked lists generally require a "memory manager" that will allocate memory for a new node and free the memory when the node is no longer required. The analogous setup in BASIC is an "index manager" which keeps track of which indexes are in use and which are free.

A set up for a linked list of 1000 free names is shown below. After setting up the array, it chains each element in the array to the next element via the NX% pointer array. The index number of the first free element in the list is stored in FP. (The purpose of line 1120 is discussed later in this article).

10 MN=1000:GOSUB 1100:GOTO 10000
1097 REM
1098 REM SET UP DATA STRUCTURES
1099 REM
1100 DIM NM$(MN):DIM NX%(MN)
1110 NX%(2)=0:FP = MN
1120 NM$(1)="SENTINEL":NX%(1)=0
1130 FOR I=3 TO MN:NX%(I)=I-1:NEXT
1140 RETURN

It might be noticed in this example and the ones that follow that each subroutine is preceded by several REM statements. These are included only to make it easier to recognize the purpose of the subroutine. It does not add to the running time of the program because none of the REM statements are ever executed. However, they take up memory so the programmer may wish to delete the REM statements once the program is running properly.

Allocating and freeing an index[edit]

Allocating an index means simply returning the value stored in FP and advancing the FP to the next free element in the array (assuming that FP is not 0).

1196 REM
1197 REM ALLOCATE FREE SPOT IN SLIST ARRAY
1198 REM RETURN VALUE IN P
1199 REM
1200 P = FP:IF P THEN FP = NX%(P)
1210 RETURN

Freeing an index is the reverse of the above procedure. It will be noticed that the "free" routine below presumes that once the node is released that the NAME in that position will no longer be accessed. In that case, it is essential to make that string an empty string so that the memory previously used by the string will be made available to the BASIC interpreter again.

1296 REM
1297 REM RELEASE INDEX IN SLIST ARRAY
1298 REM INDEX POINTED TO BY P
1299 REM
1300 IF P=1 THEN RETURN:REM DON'T FREE SENTINEL NODE
1310 NM$(P)="":NX%(P)=FP:FP=P:RETURN

Pointing to the first node[edit]

Usually a variable is required to point to the first node in the list. If the list is empty then this variable contains a zero (NULL). In this case a slightly different approach is used. The first node in the parallel array is called a "sentinal" node. Its data is irrelevant but the link points to the first node in the list (or NULL if no nodes have been added).

Since the "sentinal" node is an actual node rather than just a pointer, it can be used as a parameter in many of the operations that follow. If a "sentinal" node is not used then every subroutine would have to test for the "empty list" case and provide alternative coding for this case.

Adding and deleting nodes at the head of the list[edit]

The examples below show how to insert or delete a node at the head of the list. In each case, the node in question is the first node pointed to by the sentinel node.

1396 REM
1397 REM INSERT NODE POINTED TO BY P
1398 REM TO HEAD OF SLIST
1399
1400 NX%(P)=NX%(1):NX%(1)=P:RETURN

1496 REM
1497 REM PLACE ENTRY E$ INTO A NODE
1498 REM AND INSERT AT HEAD OF SLIST
1499 REM
1500 GOSUB 1200:NM$(P)=E$:GOTO 1400:REM INSERT THE NODE

1797 REM 
1798 REM DELETE NODE AT HEAD OF SLIST
1799 REM
1800 P = NX%(1):IF P=0 THEN RETURN
1810 NX%(1)=NX%(P):GOTO 1300:REM FREE THE NODE INDEX

Dealing with nodes other than the first node[edit]

Adding or deleting nodes elswhere in the list is just as easy as dealing with the first node. The process consists of searching for the node then dealing with it as required.

It will be noticed that the "search" subroutine returns the previous node as well as the node desired. This will be necessary if inserting a node before the current node or deleting the current node. This also shows the value of using a sentinal node since the current node may turn out to be the first node added to the list.

1696 REM
1697 REM SEARCH SLIST FOR E$
1698 REM P=NODE OR 0, M=PREVIOUS NODE
1699 REM
1700 M=1:P=NX%(1)
1710 IF P=0 THEN RETURN
1720 IF NM$(P)=E$ THEN RETURN
1730 M=P:P=NX%(M):GOTO 1710

1897 REM
1888 REM REMOVE NODE CONTAINING E$ FROM SLIST
1899 REM
1900 GOSUB 1700:IF P=0 THEN RETURN
1910 NX%(M)=NX%(P):GOTO 1300

Other routines[edit]

Routines to add nodes or to do less than or equal searches are similar to the code snippets shown above so the necessary code is not shown here.

The main program is presumed to start at line 10000. Management of the list including how the data for the list is obtained is also left to the reader to decide.