No polynomial can exactly match a transcendental function, but machine precision is limited anyway: a polynomial can be chosen that follows the target function, within the limits of machine precision, over a restricted part of its domain. The independent variable can be preprocessed to limit the domain over which the polynomial must be accurate.
These are useful routines for authors of BASIC extensions, as they do much of the work in computing a floating point function.
Parameters and result
POLY1 can be called at $E043 on the C64. POLY2 can be called at $E059.
Both POLY1 and POLY2 accept a polynomial table with the address in Y and A, where Y has the high byte and A the low, and the independent variable in the floating point accumulator (FAC). The table is formatted as follows:
- The first byte is an integer constant, giving the degree of the polynomial. This is one less than the number of coefficients.
- The second and subsequent bytes carry the coefficients of the polynomial in C64 floating point format. The first coefficient applies to the highest-degree term, and the last to the lowest (X^0 for POLY2, or X^1 for POLY1).
POLY2 uses Horner's method to compute the value of an arbitrary polynomial, and leaves the result in the FAC.
POLY1 computes POLY2(table, X^2) * X, a faster way to evaluate polynomials that have only odd-powered terms. It also leaves the result in the FAC.
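In outline, the two routines compute the following (a Python sketch rather than the actual 6502 code, with the coefficient table as a plain list instead of the degree-byte-plus-5-byte-floats ROM format):

```python
def poly2(coeffs, x):
    """Horner's method; coeffs[0] is the highest-degree coefficient."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def poly1(coeffs, x):
    """Odd-powered polynomial: evaluate the table in x**2, then multiply by x."""
    return poly2(coeffs, x * x) * x
```

For example, poly1([a, b], x) evaluates a*x**3 + b*x, reaching degree 3 with only a two-entry table.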
The polynomial table used in the LOG function is located at $B9C1:
logcn2: .byte 3                    ; Four constants follow
        .byte $7F,$5E,$56,$CB,$79  ; 0x0.DE56CB79P-1 or 0.43425594189
        .byte $80,$13,$9B,$0B,$64  ; 0x0.939B0B64P+0 or 0.57658454124
        .byte $80,$76,$38,$93,$16  ; 0x0.F6389316P+0 or 0.96180075919
        .byte $82,$38,$AA,$3B,$20  ; 0x0.B8AA3B20P+2 or 2.8853900731
LOG passes this table to POLY1; thus the constants apply to the X^7, X^5, X^3 and X^1 terms. It adds -0.5 to the returned value; this constant can be considered part of its polynomial approximation, a term for X^0.
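The decimal values in the comments can be reproduced from the bytes. A sketch of the decoding, assuming the standard CBM packed-float layout (exponent byte biased by 128, four mantissa bytes read as a binary fraction in [0.5, 1), and the sign carried in bit 7 of the first mantissa byte in place of the implied leading 1):

```python
def decode_cbm_float(b):
    """Decode a 5-byte CBM packed float: [exponent, m1, m2, m3, m4]."""
    exp, m1, m2, m3, m4 = b
    if exp == 0:
        return 0.0                      # exponent byte 0 means zero
    sign = -1.0 if m1 & 0x80 else 1.0   # bit 7 of m1 is the sign
    # Restore the implied leading 1 and read the mantissa as a fraction
    mantissa = ((m1 | 0x80) << 24 | m2 << 16 | m3 << 8 | m4) / 2.0**32
    return sign * mantissa * 2.0**(exp - 128)

logcn2 = [
    [0x7F, 0x5E, 0x56, 0xCB, 0x79],  # expected ~0.43425594189
    [0x80, 0x13, 0x9B, 0x0B, 0x64],  # expected ~0.57658454124
    [0x80, 0x76, 0x38, 0x93, 0x16],  # expected ~0.96180075919
    [0x82, 0x38, 0xAA, 0x3B, 0x20],  # expected ~2.8853900731
]
```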
These are the polynomials that the functions use:
- EXP: 2.1498763701*10^-5 * X^7 + 1.4352314037*10^-4 * X^6 + 1.3422634825*10^-3 * X^5 + 9.6140170135*10^-3 * X^4 + 5.5505126860*10^-2 * X^3 + 0.24022638460 * X^2 + 0.69314718618 * X + 1.0
- SIN: -14.381390672 * X^11 + 42.007797122 * X^9 - 76.704170257 * X^7 + 81.605223686 * X^5 - 41.341702104 * X^3 + 6.2831853069 * X
- LOG: 0.43425594189 * X^7 + 0.57658454124 * X^5 + 0.96180075919 * X^3 + 2.8853900731 * X - 0.5
  This approximates log2(sqrt(2)/(1 - X) - sqrt(0.5)), or equivalently atanh(X)*2/log(2) - 0.5, over the domain [-(3 - 2*sqrt(2)), +(3 - 2*sqrt(2))), i.e. [-0.17157287525, +0.17157287525).
- ATN: -6.8479391189*10^-4 * X^23 + 4.8509421558*10^-3 * X^21 - 1.6111701843*10^-2 * X^19 + 3.4209638048*10^-2 * X^17 - 5.4279132761*10^-2 * X^15 + 7.2457196540*10^-2 * X^13 - 8.9802395378*10^-2 * X^11 + 0.11093241343 * X^9
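The LOG entry above can be spot-checked numerically. A sketch, using the decoded logcn2 constants and a POLY1-style evaluation (Horner in X^2, multiplied by X), plus the -0.5 term LOG adds afterwards:

```python
import math

logcn2 = [0.43425594189, 0.57658454124, 0.96180075919, 2.8853900731]

def log_poly(x):
    # POLY1-style: Horner's method in x**2, then multiply by x,
    # then add the -0.5 X^0 term supplied by LOG itself
    acc = 0.0
    for c in logcn2:
        acc = acc * (x * x) + c
    return acc * x - 0.5

# Compare against atanh(X)*2/log(2) - 0.5 across the stated domain
edge = 3 - 2 * math.sqrt(2)          # ~0.17157287525
for t in range(-10, 11):
    x = edge * t / 10
    exact = math.atanh(x) * 2 / math.log(2) - 0.5
    assert abs(log_poly(x) - exact) < 1e-7
```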
Notes on derivation
The Taylor polynomials for the above functions are similar to the ones coded in ROM:
- EXP: 1.52527338041*10^-5 * X^7 + 1.54035303934*10^-4 * X^6 + 1.33335581464*10^-3 * X^5 + 9.61812910763*10^-3 * X^4 + 5.55041086648*10^-2 * X^3 + 0.240226506959 * X^2 + 0.693147180560 * X + 1.0
- SIN: -15.0946425768 * X^11 + 42.0586939449 * X^9 - 76.7058597531 * X^7 + 81.6052492761 * X^5 - 41.3417022404 * X^3 + 6.28318530718 * X
- LOG: 0.412198583111 * X^7 + 0.577078016356 * X^5 + 0.961796693926 * X^3 + 2.88539008178 * X - 0.5
- ATN: -4.34782608696*10^-2 * X^23 + 4.76190476190*10^-2 * X^21 - 5.26315789474*10^-2 * X^19 + 5.88235294118*10^-2 * X^17 - 6.66666666667*10^-2 * X^15 + 7.69230769231*10^-2 * X^13 - 9.09090909091*10^-2 * X^11 + 0.111111111111 * X^9
The differences appear to be intentional and not a result of roundoff error or other faults in the derivation. Within the required domain, the polynomials coded in ROM are more accurate than the Taylor polynomials of like degree. Outside the domain, the Taylor polynomials continue to follow the functions for a short way, while the coded polynomials diverge rapidly. The Taylor polynomials can be made good enough by adding more terms, but doing so slows down the function, and floating point arithmetic on a 6502 is quite slow enough.
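The accuracy claim can be spot-checked for SIN. A sketch, assuming the polynomial approximates sin(2*pi*X) for X in [-0.25, +0.25] (X measured in full turns), using the ROM and Taylor coefficients listed above:

```python
# Coefficients for the X^11..X^1 odd terms, highest degree first
rom    = [-14.381390672, 42.007797122, -76.704170257,
          81.605223686, -41.341702104, 6.2831853069]
taylor = [-15.0946425768, 42.0586939449, -76.7058597531,
          81.6052492761, -41.3417022404, 6.28318530718]

def odd_poly(coeffs, x):
    # POLY1-style evaluation: Horner in x**2, then multiply by x
    acc = 0.0
    for c in coeffs:
        acc = acc * (x * x) + c
    return acc * x

# At the edge of the reduced domain (a quarter turn, where sin = 1),
# the ROM coefficients beat the like-degree truncated Taylor series
rom_err    = abs(odd_poly(rom, 0.25) - 1.0)
taylor_err = abs(odd_poly(taylor, 0.25) - 1.0)
assert rom_err < taylor_err
```

The truncated Taylor series misses by roughly its first omitted term, (pi/2)^13/13!, a few times 10^-8, while the ROM polynomial stays near machine precision.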
The polynomials might be derived from some kind of interpolation. Interpolating SIN(X*(2*π)) at the Chebyshev nodes, scaled to [-0.25, +0.25], yields this polynomial:
  -14.393966295 * X^11 + 42.009779304 * X^9 - 76.704279900 * X^7 + 81.605226189 * X^5 - 41.341702133 * X^3 + 6.2831853069 * X
which resembles the ROM version more closely than the Taylor polynomial does, and is "good enough" in that it matches SIN(X*(2*π)) within machine precision. It is not identical to the ROM polynomial, however, so the actual derivation must have been somewhat different.
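The interpolation experiment is easy to reproduce without recovering the coefficients at all: a sketch that interpolates sin(2*pi*x) at 12 Chebyshev nodes on [-0.25, +0.25] (Lagrange form) and confirms the degree-11 interpolant tracks the function to around machine precision:

```python
import math

def cheb_nodes(n, a, b):
    """n Chebyshev nodes (first kind) mapped from [-1, 1] to [a, b]."""
    mid, half = (a + b) / 2, (b - a) / 2
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n))
            for k in range(n)]

def lagrange(xs, ys, x):
    """Plain Lagrange interpolation; adequate for 12 nodes."""
    total = 0.0
    for i, yi in enumerate(ys):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xs[i] - xj)
        total += term
    return total

f = lambda x: math.sin(2 * math.pi * x)
xs = cheb_nodes(12, -0.25, 0.25)     # 12 nodes -> degree-11 interpolant
ys = [f(x) for x in xs]

# Worst-case error on a 101-point grid across the domain
worst = max(abs(lagrange(xs, ys, -0.25 + 0.005 * t) - f(-0.25 + 0.005 * t))
            for t in range(101))
assert worst < 1e-9
```

The standard interpolation error bound, f^(12)(ξ)/12! times the Chebyshev node polynomial, comes out around 2*10^-10 here, below what 32 mantissa bits can represent.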