Recursion

From C64-Wiki
Jump to navigationJump to search

C64 BASIC is not naturally suited to recursion because all variables are "global" in scope and there is no provision for the temporary creation of local variables.

This does not mean that recursive subroutines can't be written in C64 BASIC. It is not even difficult to do so. There is only one golden rule that needs to be followed: With the exception of the variable(s) in which the answer is returned, all variable used by a recursive subroutine must have their values restored to what they were just before the subroutine was called before the subroutine returns .

Consider the example of a factorial function which is called recursively:

200 IF N<0 THEN F=-1:RETURN
210 IF N<2 THEN F=1:RETURN
220 N=N-1:GOSUB 200
230 N=N+1:F=F*N
240 RETURN

Lines 200 and 210 are the terminating cases (line 200 is an optional check for illegal values of N and can be omitted if desired). Line 220 recursively computes the factorial of N-1 and returns that value in in F. Line 230 restores the value of N to what it was and then calculates the factorial.

Another more interesting (and just as frequently used) example is the Tower of Hanoi problem. The basic idea is to move all of the discs one at a time from the left post to the right post using the middle post as an intermediate position. The complication is that a larger disk may not be placed on top of a smaller one.

If there was just one disk in the puzzle then the solution is trivially simple. Just move the disk from the left post to the right post. For n disks, the solution can be stated as

Move n-1 disks from the left post to the middle post (using the right post as an intermediate step).
Move 1 disk from the left post to the right post.
Move n-1 disks from the middle post to the right post (using the left post as an intermediate step).

This naturally defines a recursive solution to the problem and the BASIC code is shown below:

10 L$="LEFT":M$="MIDDLE":R$="RIGHT"
20 INPUT"HOW MANY DISKS";N
30 GOSUB 100
40 END
100 IF N>1 THEN 120
110 PRINT"MOVE DISK FROM "L$" POST TO "R$" POST":RETURN
120 N=N-1:T$=M$:M$=R$:R$=T$:GOSUB 100
130 PRINT "MOVE DISK FROM "L$" POST TO "M$" POST"
140 T$=R$:R$=M$:M$=L$:L$=T$:GOSUB 100
150 T$=L$:L$=M$:M$=T$:N=N+1:RETURN

Note that in line 130 when the subroutine has returned from its first recursive call, the M$ and R$ are still exchanged which is why the print instruction says "M$" instead of "R$". Line 150 ensures that all the important variables are restored to their original values before the subrouting returns.

This code actually breaks the "golden rule" by not restoring T$ before returning. In this case it is not necessary since the subroutine doesn't rely on the value of T$ being unchanged across recursive calls and no other part of the program uses T$. If in any doubt about whether a variable needs to be restored, restore it.

In cases where the former value of a variable can't be computed, the program needs to set up a stack to temporarily store its value before the subroutine is recursively called. This only adds a small amount of complexity to the coding required.

In the final example below, the program reads a file of 500 names and stores the names into a string array. It then calls a quicksort subroutine to sort the names then prints the sorted names to the screen. Quicksort is a recursive subroutine that sorts part of an array (from element LO to element HI) by partitioning it into 2 sub-arrays and recursively calling itself to sort the 2 sub-arrays.

10 GOTO 1000
196 REM
197 REM QUICKSORT - SORT SECTION OF ARRAY A$
198 REM BETWEEN LO AND HI
199 REM
200 IF LO>=HI THEN RETURN:REM DON'T SORT SINGLE ELEMENT ARRAYS
210 PI=INT(RND(1)*(HI-LO+1))+LO
220 PV=I%(PI):I%(PI)=I%(HI):I%(HI)=PV:I=LO-1:J=HI
230 I=I+1:IF A$(I%(I))<A$(PV) then 230
240 J=J-1:IF A$(I%(J))>A$(PV) AND J>LO THEN 240
250 IF I>=J THEN 280
260 T=I%(I):I%(I)=I%(J):I%(J)=T
270 GOTO 230
280 I%(HI)=I%(I):I%(I)=PV
290 RP=RP+1:RS%(RP)=HI
300 HI=I-1:GOSUB 200
310 I=HI+2:HI=RS%(RP):RP%(RP)=LO
320 LO=I:GOSUB 200
330 LO=RS%(RP):RP=RP-1:RETURN
1000 MX=500:DIM A$(MX):DIM I%(MX)
1010 DIM RS%(50):RP=0:LO=1:HI=MX:I=RND(0)
1020 OPEN 8,8,8,"0:NAMES,S,R"
1030 FOR I= 1 TO MX
1040 INPUT#8,A$(I):I%(I)=I
1050 NEXT I:CLOSE 8
1060 GOSUB 200
1070 FOR I=1 TO MX:PRINT A$(I%(I))" ";:NEXT I
1080 PRINT


It might be noticed that this program doesn't exchange strings but only exchanges string indexes. The reason for this is that C64 BASIC doesn't actually exchange strings that are in the string space. Instead it makes new copies of the strings and leaves the original ones as "garbage" in the string space.

So a line like T$=A$(I):A$(I)=A$(J):A$(J)=T$ would create two new copies of A$(I) and one new copy of A$(J) thus leaving three strings as garbage in the string space. When sorting a large array of strings like this, the string space would quickly fill up and cause BASIC's internal garbage collector to be run. This would significantly slow down the program because of the large number of strings involved. By using an array of string indexes, this problem is averted.

This problem does not occur in the 'Tower of Hanoi program above because the strings are in program space and not string space. It would not matter anyway because with only 4 strings in the program, garbage collection would be virtually instantaneous.

Limitations of recursion[edit | edit source]

Recursion is often a "natural" way to solve a programming problem (the solution to Tower of Hanoi problem would be difficult to find without it) and frequently leads to compact code. However, it is not the most efficient way to solve the problem. While running, a recursive routine may take up a lot of memory saving intermediate results and return addresses.

The Commodore 64 processor only has 256 bytes allocated for a return stack so recursive programming can fill it up rapidly. In fact, GOSUBs in C64 BASIC can only be nested 23 deep (assuming there are no FOR/NEXT loops). So in the Factorial program listed above, attempting to find the factorial of 24 would cause a ?OUT OF MEMORY ERROR.

Another problem is that the GOSUB/RETURN pair take time to execute. If a recursive subroutine only calls itself once (such as in the factorial program), then the running time increases linearly with N. However, if it calls itself 2 or more times before exiting then the running time increases exponentially with N.

Consider the Fibonacci sequence. It can be defined recursively as below:

F(1) = 1
F(2) = 1
F(N) = F(N-1) + F(N-2)

A suitable program for this is shown below (note the need for a stack to temporarily store the results of the first subroutine call);

10 DIM S(25):INPUT "N";N
20 T=TI
30 GOSUB 100
40 PRINT"FIBONACCI"N"="F
50 PRINT"CALCULATION TOOK"TI-T"JIFFY TICKS"
60 STOP
100 IF N<=2 THEN F=1:RETURN
110 N=N-1:GOSUB 100:S=S+1:S(S)=F
120 N=N-1:GOSUB 100:F=F+S(S)
130 S=S-1:N=N+2:RETURN

The problem is that the Fibonacci routine calls itself recursively twice for every value of N from 3 to N. If C(N) is the count of the number of times that the routine is called then a recursive relation can be defined: C(N) = 1 + C(N-1) + C(N-2) which for large values of N has a solution of the form C(N) = 2^N calls.

When this code was tested, it found that for N=5, the time took 14 jiffy ticks and when N=10, the time took 115 jiffy ticks. The running time of the program for N=23 (the maximum number that the Fibonacci can be calculated recursively) was 58337 jiffy ticks (16 minutes and 12 seconds)!

For these reasons, recursive subroutines should instead be coded using FOR/NEXT loops whenever possible. A suitable alternative for the Fibonacci program is shown below:

10 INPUT "N";N
20 T=TI
30 GOSUB 100
40 PRINT"FIBONACCI"N"="F
50 PRINT"CALCULATION TOOK"TI-T"JIFFY TICKS"
60 STOP
100 IF N<=2 THEN F=1:RETURN
110 F1=1:F2=1
120 FOR I=2 TO N
130 F=F1+F2:F1=F2:F2=F
140 NEXT:RETURN

The running time for N=23 was a much more respectable 14 jiffy ticks. In fact, the running time for N=184 was only 90 jiffy ticks (the Fibonacci of 185 or more is too large to fit into the C64 real number format and results in a ?OVERFLOW ERROR message).