Escape from Color BASIC:
BASIC09 for Color BASIC Users
James Jones
In memory of Larry Crane
sine quem non
Microware’s BASIC09 manual is thorough and gives excellent advice on programming style and program design. Dale Puckett’s The Official BASIC09 Tour Guide is a fine text. Why another BASIC09 book?
We’ll use Consolas for source code, names of programs, output of programs, and descriptions of syntax. BASIC09 is a programming language; basic09 is a program that lets you create, edit, run, and debug BASIC09 code.
We’ll use an informal EBNF (Extended Backus-Naur Form) to describe the syntax of a BASIC09 statement or expression. For example, the IF statement has the syntax
IF BOOLEAN-expression THEN |
or in prose, “An IF statement is IF, followed by an expression of type BOOLEAN, followed by THEN, followed by one or more statements, optionally followed by ELSE and one or more statements, and ending with ENDIF”. Superscript asterisk means “zero or more”, and superscript plus indicates “one or more”. Things in brackets are optional, and parentheses are used for grouping. I will use italics for those and for nonterminal symbols[2], and non-italics for terminal symbols so you can tell them apart.
Don’t let the buzzwords scare you; you know more BNF than you think. The “BASIC SUMMARY” in Getting Started with Extended Color BASIC describes IFs with IF test THEN action1 ELSE action2. Look familiar?
If you’ve used Windows, you’ll know that names of files are case-independent. OS-9’s RBF, the file manager for disks, behaves that way, too. If there’s a file named BLAH in the current data directory that you want to list, list blah will do the job, or even list BlAh. The module directory is the same—you could have typed LIST blah and it would work just as well. OS-9 will remember the case used when the file or module was created, but will do case-insensitive searches. Even shell works that way; feel free to type CHD /DD/CMDS.
One way RBF is not like Windows or Disk BASIC: there is no forced file name extension. Ending a file name with a period followed by some letters or numbers is purely a convention, just as it’s purely a convention to use upper case for names of directories. I use .b09 as an “extension” for files containing BASIC09 source code, but that’s just me.
BASIC09 is similarly case insensitive, though like RBF, when you name a variable it remembers the case you used. (There’s one rather irritating exception when you edit code.)
The differences start from the beginning.
The BASIC09 manual says it best:
“BASIC09 was conceived in 1978 as a high-performance programming language to demonstrate the capabilities of the 6809 microprocessor to efficiently run high-level languages. BASIC09 was developed at the same time as the 6809 under the auspices of the architects of the 6809. The project covered almost two years, and incorporated the results of research in such areas as interactive compilation, fast floating point arithmetic algorithms, storage management, high-level symbolic debugging, and structured language design. These innovations give BASIC09 its speed, power, and unique flavor.”
On the other hand…
Microsoft, then Micro-Soft, developed a BASIC interpreter for the Altair 8800 microcomputer. The Altair 8800 kit came with a 1K RAM board populated with only 256 bytes, but the first BASIC interpreter needed at least one populated 4K RAM board—still tiny compared with the 64K the Altair’s 8080 could access. The next version needed 8K, and added strings; an “extended” version that had IF THEN ELSE, integer and double precision needed 12K.
As microcomputers with different processors came out, Microsoft ported its interpreter to them. Eventually there was BASIC-69, the 6809 version that became Color BASIC.
Alas, Microsoft kept the design decisions forced on them with only 4K. That makes Color BASIC programs run far slower than they would were those decisions changed.
The differences between Color BASIC and BASIC09 are evident from the start.
Typically microcomputers of the day came with BASIC on ROM, and that BASIC was almost always one of the aforementioned Microsoft BASIC variants[3]. Turn it on, and you’re in BASIC. OTOH, a typical OS-9 installation sets you up running a “shell”, a command-line interface, and you have to type basic09 and hit ENTER to tell the shell to run it[4].
In Color BASIC, you’re either running a program or you’re not.
basic09 has more possible “modes” it can be in:
Actually, one of the best ways to show the relationships between the modes, or as they’re usually called, the states of a system is with a state transition diagram. The BASIC09 manual has one, about as well-made as possible in the days of daisy-wheels. (TODO: add an SVG state transition diagram here.)
You type basic09 at the shell, and it displays a banner, and then gives you the B: prompt of system mode. The first time around there’s no source code to load, so the only way to create a program is to edit it into existence. You type edit, or if you’re the impatient sort, just e, hit ENTER, and find yourself in edit mode, with the editor waiting for you to do something… but what’s this on the preceding line? It’s the next major difference between BASIC09 and Color BASIC:
PROCEDURE Program |
A Color BASIC program is a monolith, a “big ball of mud”. All variables are global, and anything can be changed anywhere, whether it should be or not. A BASIC09 program, on the other hand, is a set of procedures[5]. One may suffice for a small task like “hello world”. Another program might be best done recursively and thus still fit in a single procedure. Variables are local to a procedure, existing only while it runs (or waits for some procedure it’s running to end). Skillful decomposition of a program into procedures can make it use less RAM and be far clearer than a Color BASIC version.
The edit (e for short) command has an optional parameter, a name. If you’d typed e foo, you’d have seen “PROCEDURE foo”. basic09 remembers the last-edited procedure if there is one and opens it by default. The first time around, it uses Program.
NOTE: If you save a procedure to a file, the file will start with a line reading “PROCEDURE foo” (or whatever you named it), but in edit mode, you can’t get to that “line”. To rename a procedure, get into system mode and use the rename command.
Let’s keep going. We type a space (to insert the rest of the line at the current position) followed by that cliched line:
print “hello, world”
and hit ENTER. That’s all there is to the canonical first program, so we type q and hit ENTER to get back to system mode...wait a minute! Where’s the line number?
Even in BASIC09 you may have to use line numbers. It has remnants of the bad old days of BASIC:
I suspect they’re meant to make migration to BASIC09 easier, just as defaults for undeclared variables more or less match other BASICs. That said, only targets of the above statements need line numbers. In BASIC09, only use line numbers when absolutely necessary. That won’t be often.
From the BASIC09 manual: “[Line numbers] make programs harder to understand, they use additional memory space, and they increase compile time considerably.” Line numbers destroy structure: in Color BASIC, answering the question “How can I get to this line?”, which should be trivial, requires examining every single line of code.
Not only that, line numbers slow your code down. The June 1980 issue of Interface Age magazine included a mind-numbingly inefficient BASIC program to print out primes between 1 and 1000. A BASIC09 version that used REAL just as street BASIC would and kept the IF … GOTOs ran in 2:28… but that was with all unused line numbers removed. With superfluous line numbers, it took five seconds longer.
We’re back in system mode. If we type list and hit ENTER we see
PROCEDURE Program |
followed by Ready and then the B: prompt of system mode, but what’s with that 0000, and how’d print get capitalized?
The listing doesn’t quite look like your source code because it isn’t your source code.
basic09 tries to convert each line of source you enter or tell it to load to I-code. I-code differs from what it’s often likened to, e.g. Pascal P-code or Java bytecode. Pascal and Java are compiled. basic09, though, is a text-based interactive development environment (IDE) for BASIC09[6]. It must run your code efficiently and let you see and edit it in the form you expect, so I-code has instructions corresponding to each line of the various (mostly structured) control flow statements instead of simple conditional branches.
I-code doesn’t refer to variables by their names or statements by their numbers, so basic09 doesn’t have to do linear searches for lines or names of variables over and over again at run time. Constants in I-code are stored in internal form, so basic09 doesn’t have to convert them over and over again at run time. Expressions in I-code are kept in RPN[7] form, so basic09 doesn’t have to parse them over and over—I hope you sense a pattern developing here.
To let you list, edit, and debug your code, basic09 keeps information around for each procedure that is in BASIC09 and hasn’t been packed:
That information is not used in execution mode. Every name is only there once, and basic09’s symbol table search is case insensitive, so if you first type camelCase as a variable name, later on if you type CAmeLCasE or camelcase, it will appear as camelCase in the listing and be the same variable.
How does Color BASIC represent a program? Thanks to the author of the “Unravelled Series”, a set of books with a discussion and disassembly of the various Color BASIC ROMs, we know a line of Color BASIC source is represented by
How about variables? They’re in a symbol table, an array of seven-byte entries made up of
There’s a separate table for arrays.
Keeping Color BASIC source nearly unchanged means the interpreter has to repeatedly parse expressions[8], reconvert constants (getting the same value every time—who’da thunk it?), and look up variables[9] with linear searches through the symbol table. Each GOTO and GOSUB requires a linear search for the destination. That gives rise to all the articles that urge
to somewhat reduce interpreter overhead. There’s even a program that can further mangle your code, shortening it further but rendering it unlistable and uneditable.
In brief: Color BASIC rewards illegible, unmaintainable code. It’s easy to write trivial programs in Color BASIC, but it becomes your deadly enemy for programs of any size or significance. The Jargon File, which refers to BASIC as “the leading cause of brain damage in proto-hackers”, says: “A novice can write short BASIC programs (on the order of 10-20 lines) very easily; writing anything longer (a) is very painful, and (b) encourages bad habits that will make it harder to use more powerful languages well.”
Leo Rosten’s The Joys of Yiddish includes a politically incorrect joke appropriate to the way these articles suggest you mangle your code. Alas, I haven’t seen the book since around 1970, but here’s the gist of it:
A man goes to Moskowitz the tailor with cloth to have a suit made. Moskowitz takes his measurements and tells him to come back in a week, which he does… but when he tries the suit on, one arm proves to be longer than the other. “So hold up your shoulder, no one will notice.” The pant legs suffer a similar problem. “Nu, bend that knee some.” The jacket is shorter on one side. “Lean that way, and it will look fine.”
The customer leaves the tailor shop and does his best to walk with one shoulder up, one knee bent, and leaning to one side. A passer-by stops him: “Who is your tailor? I must see him!”
“You must see him?!”
“Anyone who can fit a kalikeh [cripple] like you must be a genius!”
The CoCo comes with Moskowitz the BASIC interpreter.
In contrast, basic09 does not keep the source code you enter; it regenerates it[10]. The legibility of the listing—indentation, capitalizing keywords, and so forth—imposes no perceptible cost aside from the size of the prettyprinter code, because it’s never done in execution mode—only when you, you slow human, are the limiting factor. Rather than rewarding mangled code, BASIC09 makes your code more readable at no cost in execution time, and uses a consistent layout[11].
Remember the entries in Color BASIC’s symbol table? Color BASIC accepts whatever you type for variable names, but it only uses their first two letters, so for example, MASS and MAX are the same variable. Have fun finding that!
OTOH, basic09 only keeps one copy of each variable name in a procedure, and you can safely use long variable names. How long? Well, this prints False, as it should[12]:
PROCEDURE Program |
Long names don’t matter; meaningful names do… but you can’t fit meaningful names into two characters.
Color BASIC only cares about the line it’s trying to execute. Start a multi-hour session entering an illegible listing from The RAINBOW with the line 10 A=B+*/C, and you’ll only see ?SN ERROR IN 10 when you run the code, well after you’ve forgotten any context. You get to fix them one at a time.
basic09, on the other hand, converts to I-code as soon as it can. It will immediately chastise you:
a:=b+*/c
^
Error #012 -- Illegal Statement Construction
and you can correct it right away. Some errors, e.g. unclosed control structures, can’t be found until you exit edit mode. basic09 will list them with the associated offset into the I-code, so you can tell where they happen.
basic09 listings show the position of the I-code for each line relative to the beginning of the procedure. The first one is thus always zero.
Color BASIC is like Bob’s Country Bunker in The Blues Brothers when it comes to data. “We have both kinds here—strings and floatin’ point!” BASIC09, on the other hand, has a variety of “atomic types''.
A BOOLEAN value is either TRUE or FALSE. If you PEEK under the lid[13]…
PROCEDURE Program |
...you’ll see
B:run
underneath, true = 255 false = 0
Note that BASIC09 generalizes DIM to be a way to declare variables, not just to dimension arrays. (I typed q:=NOT p, and basic09 added the parentheses. It considers NOT a function that takes a BOOLEAN and returns a BOOLEAN.)
BOOLEAN values are one byte long. A Sieve of Eratosthenes benchmark insisted on an array of 8000+ Boolean values. In BASIC09, it was no problem. In Color BASIC, an array of that many five-byte floating point numbers wouldn’t fit, and Color BASIC strings are at most 255 characters long. You could use an array of strings, but you’d have to do both subscripting and string indexing; not a pretty sight, and it would involve division, a slow operation.[14]
Some programmers seem to be in a weird Watergate-style purgatory when they use BOOLEAN. John Dean said “at this point in time” instead of “now” and “at that point in time” instead of “then”; these programmers write p=TRUE instead of p and p=FALSE instead of NOT(p). This is redundant and needlessly wastes space and time. Not even an epistemologist would say “If the truth value of the proposition ‘it doesn’t rain Tuesday after work’ is equal to true…”. Don’t do it in your code.
A BYTE is a one-byte unsigned integer and can hold integer values from 0 to 255. BASIC09 widens BYTE values to INTEGER for calculation, so they don’t have a speed advantage, but they do have a size advantage, and may be useful for matching an externally-defined data layout.
An INTEGER is a two-byte signed two’s complement integer, so it can hold integer values from -32768 to 32767. Integer calculations are far faster than floating point, and integers are less than half the size of floating point. Color BASIC can only use floating point for subscripting and FOR loops, but BASIC09 lets you choose. The payoff? A Color BASIC program with an empty 32,768 iteration FOR loop runs in sixty seconds[15]. A BASIC09 program with an empty 32,768 iteration integer FOR loop runs under OS-9 Level 2 in three seconds. OS-9 Level 2 runs at 1.79 MHz on the CoCo 3, so that’s really “only” ten times as fast.
We all know that floating point numbers are not the real numbers[16]… but that’s what BASIC09 calls its floating point type. It’s the same size as Color BASIC floating point numbers: five bytes, with a one-byte exponent and a four-byte mantissa. Unlike Color BASIC, where the exponent is biased by 128, BASIC09 REALs have an unbiased exponent. Unlike IEEE 754, BASIC09 REALs don’t omit the most significant mantissa bit in non-zero numbers to sneak a little more precision.
Strings in Color BASIC can be at most 255 bytes long. String variables in BASIC09 can be longer, limited only by the RAM basic09 has free. The catch: you must specify their maximum length, e.g. DIM name:STRING[16], or let it default to a maximum length of 32. This does have advantages: no string descriptors needed; also, only string expressions might require anything resembling a heap—once they’re assigned to a variable or the PROCEDURE they’re passed to finishes execution, they can go away (so they can be set up on the stack).
How are strings terminated? If their length is less than the declared maximum, or if they’re string constants, there is a marker. Let’s see what it is.
PROCEDURE Program |
The output is 255. That does mean you can’t have a string containing a byte with 255; it’s taken as the end. That is one disadvantage of BASIC09 strings versus Color BASIC strings… but check out the BASIC09 Seminar At CoCoFest 2022 video. Watch it all, of course, but note the bit from about 59:55 on. BASIC09 variables have a sort of descriptor, too, and back in the day, Chris Dekker noticed that for STRINGs, there’s enough free space there to store the current length. Keep it up to date and you can seriously speed up some operations, e.g. concatenation and LEN() (not to mention center alignment for PRINT USING). That’s one of several optimizations that may make it into basic09 and runb… and it would allow CHR$(255).
BASIC09 and Color BASIC allow arrays of up to three dimensions, but you may not know that in Color BASIC, subscripts start with zero, so there DIM X(3,4,5) creates an array with 120 elements (4 ✕ 5 ✕ 6), twice as many as you may think. In BASIC09, DIM x(3,4,5):REAL creates an array with 60 elements, but subscripts can either start at 1 (by default, or with an explicit BASE 1) or zero (with an explicit BASE 0). BASE statements apply to the procedures they are in, and to all arrays in the procedures they are in. Another difference: BASIC09 insists on constants for array bounds. That means that any “stack frame” (the data that corresponds to one invocation of a particular procedure) has a size known at “compile time”, but not being able to get and use the bounds of array parameters is among the Pascal issues that Kernighan rightly complained about[17].
By the way… in BASIC09, you can have arrays of atomic types and arrays of TYPEs. What’s a TYPE? I’m so glad you asked.
Here things get interesting. Color BASIC, like most languages of the early 1960s when BASIC began, had no notion of what in Algol 68, C, and Go are structs, or if you want to use the formal term, product types. In BASIC09, they’re TYPEs, e.g.
TYPE PLAYER=name:STRING[32];pot:REAL;hand:CARDS |
A PLAYER has a name, which is a string of up to 32 characters, a pot, which is a floating point number, and a hand, which is a, uh… wait. There is no atomic type called CARDS… but if there’s an existing TYPE that defines CARDS, CARDS can in turn appear in another TYPE. How to represent a player’s hand in a card game depends on the game. For a poker game, I used an array of sets of suits[18] indexed by card rank. name, pot, and hand are the fields of a value of type PLAYER.
Once you define a PLAYER type, you can then do something like
DIM computer,human:PLAYER |
or even
DIM players(4):PLAYER |
TYPEs and PROCEDUREs let you write in terms of abstract data types. It’s not perfect. Just as C++ headers let you see all the details of the classes declared therein, a PROCEDURE can only use TYPEs whose definitions it has seen and thus contains. Ignore the internals; just use the PROCEDUREs that define its behavior. More about this later.
Back in the mid-70s, FORTRAN 66 was in common use[19], but the structured programming war was being won. Ratfor and Mortran turned code with reasonable control structures into FORTRAN[20]… but what about data? FORTRAN only had numeric and “logical” (Boolean) types, and arrays thereof; sounds familiar, eh?
Faking structs involved one array per struct member, so a struct was all the relevant array elements with the same subscript. You can do it in Color BASIC… but isn’t player.hand more intelligible than HA(2)? If a hand is itself a structure, things get even more perverse.[21]
“You said you were going to skip things we Color BASIC programmers already know!” Yes, I did, and indeed, all non-trivial Color BASIC programs have assignment statements… but BASIC09’s richer type system makes assignments more interesting.
As mentioned earlier, you can use either := or = for assignment. The choice doesn’t affect code size or speed, so I urge you to use := for clarity. You can write p=q=r to assign a BOOLEAN p either true or false depending on whether q=r, but p:=q=r is a lot clearer.
As a Color BASIC programmer, you know what assignment does. You can assign a floating-point value to a floating-point variable or array element, or a string value to a string variable or array element.
In BASIC09, you can assign a value of type t1 to a variable, parameter, array element, or field (we’ll borrow a C buzzword and call it an lvalue) of numeric type t2 where t1 and t2 are compatible. “Great,” you say. “What types are compatible?”
That, alas, is an oversight in the BASIC09 manual. Find a searchable copy online and look for every instance of “compatible”; you’ll see it in the context of assignment and of comparison, but nothing saying just which types are compatible. Surely identical types are compatible, but… experimentation shows:
Then things really get interesting. You can assign an array, or a parameter or variable of some TYPE type, to another array or parameter or variable of some TYPE type (or field in a parameter or variable of some TYPE type that’s an array or a TYPE type, or...). In this case, BASIC09 requires only that the destination be at least as big as the source.
The second kind of assignment may evade strong typing, but can have some advantages:
BASIC09 lacks one sort of assignment that I suspect Color BASIC borrowed from PL/I: MID$(STRING_variable, start, length) = STRING_expression. Sometimes it’s very convenient.
One wouldn’t think there could be much difference in plain old PRINT… but one would be wrong.
Color BASIC only has one numeric data type, five-byte floating point. If there’s documentation of how they’re printed, I’ve yet to find it. I’ll take a stab at reading Color BASIC Unravelled. For now, experimentation shows:
A semicolon separating items in a PRINT statement in Color BASIC has different results, depending on the types of the expressions preceding and following it:
BASIC09 never outputs a space between expressions separated by a semicolon.
PRINT USING has the same syntax in BASIC09 as in Color BASIC, i.e.
PRINT [#path,] USING STRING-expression, expression (, expression)*
but the permissible format string values (the STRING-expression above) are very different.
The differences justify more information and some examples.
There are two flavors of format specifications: those that print one of the expressions following the format string, and those that don’t.
There are three that don’t, called “control specifications”:
Format specifications that correspond to expressions to be printed start with letter width and end with [justification] (except that the B and S specifications will let you omit width with the same effect as if you’d specified a width of one). justification, if present, is either < (left justified), ^ (centered), or > (right justified). If omitted, it’s taken to be < for B and S, > for numeric specifications. The possible letter values are R (REAL decimal), E (REAL e-format), I (INTEGER), H (hexadecimal), S (STRING), and B (BOOLEAN). (Lower case will work, too.)
A “repeat group” in a format string has the form number ( specification [, specification]*). It can save space in a format string, having the same effect as repeating the listed specifications number times. The BASIC09 manual says repeat groups can be nested, but “2(2(i4))” gave me the aforementioned error 63 at run time.
We’ve alluded to control flow earlier and pointed out that some BASIC09 control flow constructs require line numbers. Color BASIC users already know them, so I’ll just say don’t use them, save that
Aside from those, all BASIC09 statements that affect control flow have matching beginning and ending statements: FOR/NEXT, IF/ENDIF, REPEAT/UNTIL, WHILE/ENDWHILE, LOOP/ENDLOOP, and EXITIF/ENDEXIT. basic09 will insist on exactly one ending statement corresponding to each beginning statement, and they must nest properly—for example, you can’t have a FOR loop containing an IF statement whose matching ENDIF is past the FOR’s NEXT statement.
“But FOR and NEXT have to match in Color BASIC,” you say? Take a look at this, which ran and did what I expected in Color BASIC on MAME:
100 FOR I=1 TO 10 |
This is just another case of the Color BASIC parser being, shall we say, “flexible” about what it is willing to accept. That said, on to the new stuff!
Color BASIC does have IF THEN ELSE, but it must all be on one line. I don’t know whether Color BASIC has a “dangling ELSE” problem[22].
BASIC09 has no dangling ELSE problem. A multi-line IF must end with ENDIF; for example,
PROCEDURE Quadratic |
(SQ(b) is a faster alternative to b^2.) The official syntax, as you’ll recall, is
IF BOOLEAN-expression THEN |
Sadly, BASIC09 doesn’t have an ELIF like Algol 68 or Python, so IF/ELSE IF chains creep to the right hand side of the screen or page. Avoid deeply nested IF statements.
BASIC09 and Color BASIC have one sort of loop in common… sort of.
Everyone knows FOR loops, right? As the Reduced Shakespeare Company would say to Ophelia, “Maybe… maybe not.” What does this Color BASIC FOR loop do?
FOR I=2000 TO 1: PRINT I: NEXT |
It will print a line reading 2000, of course! Color BASIC doesn’t check the control variable’s first value. BASIC09, on the other hand, does check it (as specified, by the way, in an early BASIC manual from Dartmouth). If you want to port Color BASIC code that, heaven forbid, depends on that behavior, convert the unfortunate FOR loops to use REPEAT.
Other differences:
There are more kinds of loops than are dreamt of in your philosophy, Color BASIC. You can kludge them with GOTOs, but a language supporting them lets you write clearer, more maintainable code. (Faster, too. Remember those linear searches GOTOs cause in Color BASIC?)
A WHILE loop tests at the top. Its syntax is
WHILE BOOLEAN-expression DO |
Here’s the Color BASIC equivalent, assuming all the statements will fit on a line. Use whatever line number makes sense in your code. We picked 100 just to have a specific value.
100 IF expression THEN statement*:GOTO 100 |
REPEAT tests at the bottom. Its syntax is
REPEAT |
The Color BASIC equivalent, with the same caveats[23]:
100 statement*: IF NOT expression THEN 100 |
Who wants to loop forever? A server or a cron program, perhaps, and LOOP lets you write it clearly. Here’s a BASIC09 version of the Unix/Linux yes command:
PROCEDURE yes |
That’s not the whole story, though. You can escape a LOOP, or any BASIC09 loop, with...
EXITIF can get you out of any loop. It has the form
EXITIF BOOLEAN-expression THEN |
If the Boolean expression is true, basic09 will execute the statements and then get out of the deepest loop that contains the EXITIF. If you just want out, one line suffices for
EXITIF BOOLEAN-expression THEN ENDEXIT |
Why would you want EXITIF?
Let’s give an example. Remember that sieve of Eratosthenes? Here’s a BASIC09 version based on the BYTE magazine benchmark:
PROCEDURE sieve |
On a CoCo 3, this runs in about 14.5 seconds… but it’s doing more work than it should. Eventually there will be a prime such that the first value of k is bigger than 8190. Any primes bigger than that will have the same property, so once you hit that point, all the composites in this bounded sieve have already been marked out. From there you can just count how many true flags are left. The simplest way to do that is with EXITIF:
PROCEDURE sieve EXITIF k>8190 THEN |
Alas, it only shortens the run time by about a second.
Here we go again. What can be different about arithmetic operators? More than you may realize.
Like Color BASIC, BASIC09 has built-in functions. As a Color BASIC programmer, you also know that such functions only work on arguments of particular types; SQR(“Belgian waffle”) will fail.
If a Color BASIC or BASIC09 program executes the statement GOSUB 1000, it
If a BASIC09 program executes the statement RUN foo, it
There’s one major difference, though.
That restriction is important and useful. In security and computer science it’s called the principle of least privilege: limit everything to only what it needs for its legitimate purpose. If anything can change anything, just knowing what is corrupted tells you nothing about the problem… and if you’ve had to debug a Color BASIC program of any size, you know this all too well.
With a subroutine, you have to set up some convention of assigning values to certain variables for the subroutine to work on and dedicate other variables to the subroutine’s use and returning results. A PROCEDURE invocation (RUN) lets you pass variables or expressions which are associated with the called PROCEDURE’s parameters for the duration of the invocation without needing explicit assignment. Parameters are specified with PARAM statements. They’re like DIM statements, but PARAM indicates that they come from the caller.
Each RUN (also known as a “call” or an “invocation”) of a procedure has a stack frame with
BASIC09 thus uses call by reference; a called procedure can modify passed parameters. Without true functions, that’s the only way to return a result.
If you pass an expression, it’s evaluated, saved, and its address passed. You can pass a copy of a variable you don’t want modified; RUN foo(i + 0, j, name + “”) will keep foo from changing i or name[26]. Variables of user-defined TYPEs need an explicit copy. Arrays? You’d have to put them in a TYPE.
With source code or good documentation, you can see which parameters get changed. Any that don’t, you needn’t copy… but unlike, say, Ada or Algol W, BASIC09 has no enforceable way to specify that. Kevin Darling gave modified parameters all upper case names and unmodified parameters all lower case names in his BASIC09 code, but BASIC09 capitalizes keywords and names of atomic types, so I suggest this variation:
So far, we’ve simply given the name of the procedure to run. Seems simple, makes perfect sense. What more could you want? A lot, actually.
One very powerful notion in programming is higher-order functions, functions that either take functions as parameters or return functions as a result. Sadly, BASIC09 doesn’t provide operations of the sort needed to actually generate a function, and you’d be right if you pointed out that BASIC09 doesn’t have a procedure type, so you can’t pass a procedure as a parameter.
If you look closely at the BASIC09 manual, though, you’ll notice that RUN has two flavors:
You can pass a STRING as a parameter, so you can get the same effect. With this we can (I hope!) write the adaptive quadrature procedure described later without having to wire in a particular procedure for the function to be integrated. Things to note:
We just mentioned one place where you can use a local variable but can’t use a parameter. There are others:
Back in the late 1960s, people were worried about what they called the “software crisis”. How can you get useful, efficient, correct, maintainable software done on time? Believe it or not, NATO held two conferences about how to solve it. They hoped to turn software into an engineering discipline, and did their best to push it in that direction. One of the important notions was breaking a program down into smaller, manageable pieces…
...but some decompositions are better than others. The classic paper on the subject is D.L. Parnas’s “On the Criteria To Be Used in Decomposing Systems into Modules” (Communications of the ACM, December 1972). Parnas sketches two ways to break down a KWIC (keyword in context) index generator into modules. The first one is typical for its time, and does work… but the second asks the question “what data structures would I need to make this easy to do?” and comes up with a decomposition that not only works, but is easier to change.
BASIC09 TYPEs and PROCEDUREs encourage the second way, as you’ll see when we solve the eight queens problem.
recursion, n. See recursion. —a very old computer science joke.
Recursion is an important technique; you should learn it (and how to avoid it if need be).
A PROCEDURE p is recursive if there’s a sequence of PROCEDUREs {p1, p2, …, pn} such that
Usually you’ll see “a procedure is recursive if it calls itself” (n=2), but it can be indirectly recursive (n > 2). Stack frames make recursion possible in BASIC09 (and other languages dating back to Algol 60).
To avoid a code version of the joke definition, a recursive procedure must break a problem down into at least two cases:
This may remind you of proofs by mathematical induction, and that’s typically how you show a recursive procedure works: by induction on some number that represents the size of the problem: how many elements are there in the array, which term of the sequence are you evaluating, or the depth of a tree.
There are some standard recursion examples, things like factorials or the Fibonacci sequence. You can write them recursively, but you can also write them with a simple loop. Alas, BASIC09 doesn’t do tail call optimization, so use a loop if you can, to avoid stack overflow. Here’s the recursive BASIC09 factorial, straight from the mathematical definition:
PROCEDURE factorial |
Here’s the iterative version:
PROCEDURE factorial |
In BASIC09, there’s no reason to write factorial recursively save for practice. Let’s look at examples that benefit from recursion.
The eight queens problem is another standard example of recursion. You have a standard eight-by-eight chessboard. How many ways can you put eight queens on the board such that none of them attack any of the others? (One can argue that a solution that’s a reflection or rotation of another solution isn’t really different, but we’ll ignore that here.)
A solution can only have one queen per file (column) and per rank (row), and no two can be on the same diagonal. So we can attack the problem this way:
What haven’t I mentioned? Right! I haven’t mentioned how the board is represented. In accordance with Parnas, all we’ve said is what you can do with a board:
Alas, PROCEDUREs can’t refer to some common file where shared TYPEs can just be written once. If they could, we could write the top level code in blissful ignorance of what a board looks like. So… we’ll work bottom-up, starting with the board. How to represent it?
You’re a Color BASIC programmer. You know about arrays, and you know about chess boards. A chess board is an eight-by-eight array, but an array of what? This is BASIC09, so we could use
The first one is needlessly general for the eight queens problem, so we’ll choose the second, giving
TYPE CHESSBOARD=occupied(8,8):BOOLEAN |
and with that choice made, we can write the top-level logic.
PROCEDURE eightQueens |
Now we need only write the board procedures.
clear is pretty obvious:
PROCEDURE clear |
Putting a queen on and taking it off, even more so[27]:
PROCEDURE putQueen |
This is a simple board printer (PRINT is a keyword, hence the name). You may wish to make it fancier.
PROCEDURE display IF col<8 THEN PRINT “|”; ENDIF IF row<8 THEN PRINT “-+-+-+-+-+-+-+-” ENDIF |
Here’s where things get complicated: finding out whether a position on the board is safe. We’re taking advantage of the problem and the top-level logic:
PROCEDURE trySquare |
This will run and generate the solutions, but it’s slow. trySquare has to go through a loop and check a number of squares that increases with the column number, with explicit checks to avoid subscript range errors. This version of the board isn’t suited to the eight queens problem.
The eight-by-eight array is too general. Even taking some advantage of what we know about the problem, trySquare is slow. The real question: what “board” has only the information we need for the problem, namely:
What do I mean by “up” and “down” for diagonals? It’s an arbitrary choice, but we’ll take which way they go as the column number increases to distinguish “up” from “down”. How many of each kind are there? Fifteen. (Sixteen would count the diagonal from one corner of the board to the other twice.) So that will give us
TYPE CHESSBOARD=rowCol(8),up(15),down(15):BOOLEAN |
...but if we still want to print the solutions, that’s a little too parsimonious. We’ll keep, for each row, either the column of that row a queen is in or zero if there is none. Oh… for the diagonals, does TRUE mean there’s a queen or not? Let’s say “not”, which gives us:
PROCEDURE clear Board.rowCol(row):=0 NEXT row Board.down(diag):=TRUE |
PROCEDURE putQueen
|
PROCEDURE trySquare
TYPE CHESSBOARD=rowCol(8):BYTE;up(15),down(15):BOOLEAN
PARAM board:CHESSBOARD;row,col:INTEGER;Safe:BOOLEAN
Safe:=board.rowCol(row)=0 AND board.up(row+col-1) AND board.down(8+row-col)
By the way… look at clear. Did you shudder when you saw it didn’t use the same indexing variable for both FOR loops? If you did, breathe… relax. This is BASIC09.
Given that, doesn’t it make sense to use two, each with a name suggesting what it represents?
This version of the board makes the board printing procedure much easier. It’s a little bit fancier now.
PROCEDURE display |
One of the meanings of “quadrature” is calculating an integral. Adaptive quadrature tries to calculate integrals and avoid unnecessary calculation.
If you’re like me, you first learned about the Riemann integral. Chop up the interval over which you’re integrating into smaller and smaller pieces, and if your function is sufficiently nice[28], the Riemann sums will converge to the actual integral. Actual numerical code won’t necessarily calculate the Riemann sums, but may try to approximate the function as if it were a line (trapezoidal rule) or a parabola (Simpson’s rule). The question, then, is, how much chopping do you need to do?
Of course, not all functions are of that sort. Let’s take a look at a function that Tom Rugg and Phil Feldman’s TRS-80 Color Computer Programs integrates with Simpson’s rule, f(x) = 4/(1+x^2) from 0 to 1. Here’s a graph of the function (ignore that bit on the left of the y-axis; obviously it’s symmetric about the y-axis):
It’s certainly not a straight line, but from about 0.25 to 1, it’s pretty close. That suggests that we can take advantage of those regions by not always dividing the interval up equally.
How to decide? If you’re integrating over [a, b], calculate the approximation using your rule of choice over [a, b]. Then pick c in (a, b) and calculate the approximation for [a, c] and [c, b], again using your rule of choice. If those two add up to a value reasonably close to the approximation for [a, b], call it good. Otherwise recur on each of [a, c] and [c, b].
This doesn’t always work. For example, f(x) = 40*exp(-x)*cos(40*x) from 0 to 2*pi, but you can take some countermeasures:
So, let’s write an adaptive quadrature procedure, using the trapezoidal rule. Why the trapezoidal rule? Because if the function is close to being a line on an interval [a, b], then drawing a line from (a, f(a)) to (b, f(b)) is close to plotting f over [a, b], and graphics routines typically support drawing lines. The only difference between an adaptive quadrature procedure and an adaptive function plotter is that the former adds values to a sum representing the integral and the latter draws lines, or rather line segments.
You already know one way–a tail recursive function like factorial, sum, or product (more generally, what functional programmers call folds) can be turned into a loop. In the general case, you roll your own call stack with an array of structures, each element holding what would be the local variables in a recursive version.
Color BASIC has statements specific to graphics and to making sounds. BASIC09 doesn’t:
OS-9 modules and the BASIC09 procedures they permit avoid the Scylla of rewriting the language system for every new target and the Charybdis of masses of unintelligible PEEKs and POKEs.
In OS-9, a graphics window is a device.
Programmers are fallible. Programs have bugs. BASIC09 has a debug mode and various statements designed to help you locate problems in your code.
We mentioned debug mode briefly when we listed the various modes that basic09 can be in. How do you enter debug mode?
You can tell when you’re in debug mode thanks to the D: prompt. What’s important about debug mode? In debug mode, one or more procedures are in the middle of running; they have stack frames corresponding to those procedures with all the information that implies, like local variables and where you are in the procedure. That gives you a chance to figure out what’s happening.
We mentioned the PAUSE statement as a way into debug mode. Here’s another debugging-related pair of statements: TRON and TROFF. (They fight for the programmers.) Color BASIC users will recognize them, but they’re a little different in BASIC09. Color BASIC in trace mode displays the line numbers of lines as they are executed. basic09 generates a listing of each line as it’s executed—there may not be a line number—along with the results of each expression in the statement. The trace mode setting is specific to each procedure invocation, like BASE or DEG and RAD.
As in system mode, entering $ followed by a command causes the command to be passed to a shell to run. The shell is a child of the basic09 process, so anything like chd or chx that only affects the shell process won’t affect basic09.
In debug mode you can type BREAK procedure_name where the name is that of a procedure that has a stack frame at the time you type the command, i.e. a run of the procedure is underway. Color BASIC has nothing like it that I’m aware of, but there are debuggers like gdb. If you’re familiar with them, BASIC09’s BREAK differs from their breakpoints in two respects:
How, you might ask, do you know which procedures have stack frames at the moment so you can BREAK in them?
STATE gives a “stack backtrace”, a listing of names of procedures with active stack frames from the deepest to the outermost. (Recursive procedures may appear more than once.)
As you’ve seen through the book, BASIC09 error messages have a numeric code and an accompanying short message, often along with the line where the problem appears and a caret pointing out where the error was detected. If the error is only seen when you leave edit mode, you’ll just see an offset rather than the line.
The messages, while short, explain the problem…usually…but sometimes it’s not clear how the message applies to the source code. Here you’ll see some counterintuitive examples of code and why we suspect that message appears. This is a work in progress.
[1] whom I should name here: Terry Ritter, then at Motorola, and Ken Kaplan, Larry Crane, and Robert Doggett, then at Microware.
[2] “Terminal symbols” actually end up in the source code. “Nonterminal symbols” get replaced by something else; could be the empty string (so they just go away), could be a sequence of terminal or nonterminal symbols. In a formal language, you start with a special nonterminal called the start symbol and keep replacing nonterminal symbols with something the grammar lets them be replaced with until all that’s left is terminal symbols (top down parsing), or you work your way back from a sequence of terminal symbols to the start symbol (bottom up parsing).
[3] It’s the retro version of the Wintel monopoly, just minus the “tel” and with BASIC instead of Windows. Microcomputers came with a “Microsoft tax” and, like Windows pre-installed on later computers, Microsoft BASIC put alternatives at a serious disadvantage, augmented by the then-high price of floppy drives and controllers.
[4] It doesn’t have to be that way. You can make OS-9 or NitrOS-9 immediately start basic09 rather than a shell, or if you use tsmon or mtsmon, a user can be sent immediately into basic09 after logging in.
[5] A procedure may be one that you have edited or loaded and haven’t packed, a packed BASIC09 procedure in a module, or even a module with code written in C or assembly. If it honors BASIC09 calling conventions and does nothing malicious, all is well… and a named assembly language procedure with explicit parameter passing is far more intelligible than assorted POKEs and EXEC magic_number.
[6] To be sure, it’s nothing like today’s IDEs: it won’t fill in code for you, suggest what you should have typed, or highlight all uses of a variable… but it’s about as good as it gets in a 64K address space.
[7] RPN=Reverse Polish Notation; 2 * (3 + 4) in RPN is 2 3 4 + *. “Polish” honors the nationality of Jan Łukasiewicz, its inventor, and is easier for non-Poles to spell; “reverse” because operators come after their operands rather than before. It’s trivial to evaluate with a stack: push values, and for each operation, pop the number of operands it needs, perform the operation, and push the result. For more information: Expression Evaluation.
[8] That’s a somewhat hairy task involving parentheses, operator precedence, and associativity.
[9] There’s an exception for FOR loop control variables, but only for the loop increment and test.
[10] Numeric constants, especially floating point ones, may not quite match what you typed, and infix expressions are regenerated from the RPN, so they won’t have superfluous parentheses you may have typed.
[11] Some programming languages now impose or strongly encourage a standard layout (e.g. Go, three decades after BASIC09) for consistency, avoiding the many gratuitous, pointlessly-different styles of, for example, C.
[12] Backslash is BASIC09’s colon; using it or not makes no difference in size or speed, just in layout.
[13] What’s :=? It’s a symbol for assignment from Algol 60, later used in Algol 68 and Pascal. BASIC09 lets one use either := or = for assignment. There’s no difference in code size or speed, but := makes the intent clear.
[14] That’s not quite true. See BASIC09 vs. Color BASIC: Sieve of Eratosthenes for details.
[15] These times are measured using MAME emulating a CoCo 3 with a 6309. In my experience, MAME does a good job of running at the speed of the original hardware. David Ahl’s Creative Computing BASIC benchmark in Color BASIC took as long under MAME as a real CoCo took in the original article on the benchmark.
[16] You do, right? If not, read What Every Programmer Should Know About Floating-Point Arithmetic.
[17] Kernighan, Brian W., “Why Pascal is not My Favorite Programming Language”.
[18] BASIC09 doesn’t have a set type. There are just four suits in a card deck, so a BYTE did the trick.
[19] “I don’t know what the language of the year 2000 will look like, but it will be called Fortran.” —C.A.R. Hoare
[20] ...and had exactly the problem that current Color BASIC preprocessors have today: they leave you reading error messages for, and later debugging, the not-necessarily-legible preprocessor output.
[21] Honesty compels me to say that these days you sometimes see this done (see AoS and SoA) when the common operations on a data structure iterate over all instances of a particular field, to minimize cache misses.
[22] Back in the day, my motto was “The only Color BASIC I use is DOS”. I’d have been delighted to avoid even that.
[23] Actually, you can torture a Color BASIC FOR loop into a REPEAT loop. Good luck reminding yourself of the trick six months later.
[24] We’re using sloppy language here. INT() doesn’t return an INTEGER; it returns a REAL whose value, if the argument passed is in range, corresponds to an INTEGER. In math, there’s a subset of the reals that is isomorphic to the integers–if you add, subtract, or multiply the reals corresponding to the integers m and n, you get the real corresponding to m+n, m-n, and mn respectively.
[25] Why not make AND, OR, and NOT polymorphic too? (<> is XOR for BOOLEANs.) I don’t know. Maybe BASIC09’s designers thought expressions mixing BOOLEAN and bitwise operations would be confusing if they did.
[26] BASIC09 saves copies of constants automatically. This avoids an infamous problem in old FORTRAN, which used call by reference and also kept a “constant pool”. If you passed a constant, say 2, to a FORTRAN subroutine that changed the corresponding parameter, suddenly 2 could be any number.
[27] For a serious application, one would check for errors, e.g. row or column out of range, but we’re just trying to demonstrate recursion and how to break a problem down.
[28] The Dirichlet function, which maps rational numbers to one and irrationals to zero, is not sufficiently nice for Riemann integration. It needs Lebesgue integration.