Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, Swift, BASIC, and whatever else I happen to feel like hacking at.

Convert PCBASIC code to TRS-80 Extended Color BASIC

Jerry Stratton, June 10, 2020

Tim Hartnell’s Reversi

One of the nice things about the older strategy games is that it’s easier to win…

I recently found Tim Hartnell’s really nice Giant Book of Computer Games. Hartnell strikes exactly the tone I was looking for with 42 Astounding Scripts. Breezy, informative, and filled with useful code.

While he writes his book for “Microsoft BASIC on an IBM PC” he specifically avoids anything that might make it unusable on other computers, “no PEEKs and POKEs, no use of graphic character sets, and no use of such commands as SOUND or PLAY. I’ve assumed you have access to READ and DATA, and that your screen is around 32 to 40 characters wide”.

This makes it very suitable for customizing for any of the old 8-bit computers such as the TRS-80 Color Computer line. The changes that need to be made to get these programs to run on the CoCo are somewhat rote, and mostly easily automated. So of course I wrote a script for that (Zip file, 6.1 KB).

  1. On the IBM PC, PRINT TAB started from 1, and printed at that character. In Extended Color BASIC, PRINT TAB starts from 0, and moves that many characters over. So the script reduces PRINT TABs by one. This is especially important for board games like reversi. You can also use --shift-tabs xx to further shift any PRINT TAB statement to the left. For example, Hartnell’s chess program prints at tab 9, which pushes the chess board just barely over the edge of the 32-character CoCo screen.
  2. The IBM PC uses a bare RND function to generate a random number between 0 and 1. Extended Color BASIC uses RND(0) for the same. Most of the time, Hartnell uses RND to generate an integer, by multiplying the real number by a whole number and taking the INT of that. The format and variations are complex enough that I chose not to try to convert that to ECB’s cleaner RND(x) function. Hartnell’s random numbers usually start at zero, whereas the RND(x) function starts at 1.
  3. The IBM PC has a RANDOMIZE statement to set the random number generator’s seed. Extended Color BASIC uses the RND function to do the same thing, by using negative numbers. Where the number Hartnell sends to RANDOMIZE is easy to discern, the script uses the same number in RND(x), but negative. Where the number is not easy to discern, the script discards it and uses RND(-TIMER).
  4. The IBM PC can RESTORE to a specific DATA line, using, for example, RESTORE 3270 so that the next READ starts with the data in line 3270. The script emulates this in Extended Color BASIC by first doing a RESTORE and then reading up to but not including the data in the specified line. This is a lot slower, but it does work.
  5. The IBM PC can use BASIC function names at the start of a variable name. Extended Color BASIC cannot. The script attempts to detect bad variable names and replace them. Fortunately, using only a function name as a variable name is illegal in PC BASIC as well.
  6. The Color Computer sees everything following a DATA statement as data, even if there’s a colon on the line. This mainly seems to be a problem with REM statements, an easily solvable problem, since they can be safely removed.
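
The first three conversions in the list above can be sketched in a few lines of Python. This is only an illustration, not the script's actual code: the function name `convert_statement`, the `shift_tabs` parameter name, and the throwaway variable `ZZ` that receives the seeded RND result are all assumptions, and the sketch expects a single statement with any quoted text already removed.

```python
import re

SEED_VARIABLE = "ZZ"  # hypothetical throwaway variable to receive the RND result

def convert_statement(statement, shift_tabs=0):
    """Rewrite one PC BASIC statement (quoted text already removed) for ECB."""
    # RANDOMIZE n becomes RND(-n); anything harder to read becomes RND(-TIMER).
    match = re.match(r"\s*RANDOMIZE\b(.*)$", statement)
    if match:
        seed = match.group(1).strip()
        if not seed.isdigit():
            seed = "TIMER"
        return "%s=RND(-%s)" % (SEED_VARIABLE, seed)

    # PRINT TAB counts from 1 on the PC and from 0 in ECB, so shift left by
    # one, plus any extra shift requested. Only literal numbers are handled.
    def shift(tab):
        return "TAB(%d)" % max(int(tab.group(1)) - 1 - shift_tabs, 0)
    statement = re.sub(r"\bTAB\(\s*(\d+)\s*\)", shift, statement)

    # A bare RND becomes RND(0); an RND that already has an argument is kept.
    return re.sub(r"\bRND\b(?!\s*\()", "RND(0)", statement)

print(convert_statement("Q=INT(RND*7)"))
print(convert_statement("PRINT TAB(9);", shift_tabs=2))
```

A TAB whose argument is a variable or an expression passes through unchanged in this sketch.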
Tim Hartnell’s Mistress of Xenophobia

Plunder the planet Xenophobia as much as you can in your 20-year reign. The only way to plunder it is to convince the farmers to produce more crops. Your input is merely the acreage of farmland and the food to feed your groveling peasants.

Since the Color Computer’s screen is 32 characters wide, I chose to make an attempt to reformat PRINT statements to 32 characters maximum. It works surprisingly (to me, at least) well. The script handles these in two different ways. For PRINT statements that are each on their own line, it collects the statements and then recreates them in a series of single-digit incremented lines. If they’re followed by an INPUT statement, the string from the INPUT statement is also merged into the PRINT statements.

For PRINT statements that are repeated on the same line, the script combines the strings from each of the PRINT statements into a single PRINT. If the resulting string is more than 32 characters the script takes the string and generates multiple PRINT statements in its place.
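
The rewrapping can be sketched in Python with the standard `textwrap` module. The function name `rewrap_prints` is hypothetical, and two behaviors are assumptions read off the sample output rather than confirmed details of the script: the text is upper-cased, and a line that exactly fills the 32-character row gets a trailing semicolon so the cursor does not skip a row.

```python
import textwrap

SCREEN_WIDTH = 32  # Color Computer text screen width

def rewrap_prints(first_line_number, strings):
    """Merge the strings of consecutive PRINTs and re-split them at 32 columns."""
    text = " ".join(s.strip() for s in strings).upper()
    lines = []
    for offset, chunk in enumerate(textwrap.wrap(text, width=SCREEN_WIDTH)):
        # A full-width line already wraps the cursor, so suppress the newline.
        suffix = ";" if len(chunk) == SCREEN_WIDTH else ""
        lines.append('%d PRINT "%s"%s' % (first_line_number + offset, chunk, suffix))
    return lines

for line in rewrap_prints(70, [
    "Mistress of Xenophobia, a report for",
    "you from the Office of Information",
    "regarding the state of your planet",
]):
    print(line)
```

The sketch numbers the new lines in increments of one from the first original line. It does not check whether those numbers run into the next existing line, which a real conversion would have to do.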

Further, since the Color Computer has far fewer vertical lines than the PC did, I have the script discard repeated empty PRINT statements. They serve little purpose on the CoCo except to scroll important information off the screen.

And finally, if more than 14 sequential PRINT statements remain, the script adds a pause subroutine at the end of the program and inserts a call to it before the number of PRINT statements exceeds 14. It also warns you that it has done so.

As you can see from the Mistress of Xenophobia screenshot, it’s far from perfect. The script cannot predict the printed width of any PRINT line that contains variables, so in Mistress the “grovelling peasants” line is separate from the “some xx acres” line. In real life, of course, I’d go in and manually merge those two lines. In real real life I liked this game so much I rewrote it in superBASIC. The screenshot here is from the auto-converted version, not the superBASIC version.

One change requires a warning. I was somewhat surprised to find that Extended Color BASIC has no DEFINT statement to mark certain variables as integers instead of floating-point numbers. So the script removes that statement entirely and issues a warning that you may need to verify that all of the listed variables are used as integers. It could be a problem if the code has a line like A=B/2 and the program is expecting that A will always be an integer.

Original code (The Duke of Dragonfear):

  • 10 REM THE DUKE OF DRAGONFEAR
  • 20 GOSUB 1280
  • 30 CLS:PRINT:PRINT:PRINT
  • 40 GOSUB 1200
  • 50 Q=INT(RND*7)
  • 60 IF Q=0 AND E<>55 THEN GOSUB 1200
  • 70 CLS:PRINT:PRINT:PRINT "DUKE "A$", YOU ARE IN CAVE"E
  • 80 IF G>0 THEN PRINT "YOU ARE CARRYING $"G"WORTH OF GOLD"
  • 90 GOSUB 760
  • 100 PRINT:PRINT "YOU HAVE"25-H"UNITS OF CHARISMA LEFT"
  • 110 PRINT:PRINT "what do you want to do?"
  • 120 PRINT "N - move north, S - move south"
  • 130 PRINT "E - move east, W - move west"
  • 140 PRINT "F - fight a dragon, Q - quit":PRINT

CoCo translation (The Duke of Dragonfear):

  • 10 REM THE DUKE OF DRAGONFEAR
  • 20 GOSUB 1280
  • 30 CLS
  • 40 GOSUB 1200
  • 50 Q=INT(RND(0)*7)
  • 60 IF Q=0 AND E<>55 THEN GOSUB 1200
  • 70 CLS:PRINT "DUKE "A$", YOU ARE IN CAVE"E
  • 80 IF G>0 THEN PRINT "YOU ARE CARRYING $"G"WORTH OF GOLD"
  • 90 GOSUB 760
  • 100 PRINT:PRINT "YOU HAVE"25-H"UNITS OF CHARISMA LEFT"
  • 110 PRINT "WHAT DO YOU WANT TO DO? N - MOVE";
  • 111 PRINT "NORTH, S - MOVE SOUTH E - MOVE"
  • 112 PRINT "EAST, W - MOVE WEST"
  • 140 PRINT "F - FIGHT A DRAGON, Q - QUIT":PRINT

Original code (Mistress of Xenophobia):

  • 10 REM MISTRESS OF XENOPHOBIA
  • 20 GOSUB 700:REM INITIALISE
  • 30 REM ****************************
  • 40 FOR Y=1 TO 20
  • 50 CLS
  • 60 PRINT:PRINT:PRINT
  • 70 PRINT "Mistress of Xenophobia, a report for"
  • 80 PRINT "you from the Office of Information"
  • 90 PRINT "regarding the state of your planet"
  • 100 PRINT "in this year of Grace,"1994 + Y
  • 110 PRINT:PRINT
  • 120 PRINT "The planet's population is"INT(P)
  • 130 GOSUB 880

CoCo translation (Mistress of Xenophobia):

  • 10 REM MISTRESS OF XENOPHOBIA
  • 20 GOSUB 700:REM INITIALISE
  • 30 REM ****************************
  • 40 FOR Y=1 TO 20
  • 50 CLS
  • 60 REM REMOVED FULL LINE
  • 70 PRINT "MISTRESS OF XENOPHOBIA, A REPORT";
  • 71 PRINT "FOR YOU FROM THE OFFICE OF"
  • 72 PRINT "INFORMATION REGARDING THE STATE"
  • 73 PRINT "OF YOUR PLANET"
  • 100 PRINT "IN THIS YEAR OF GRACE,"1994 + Y
  • 110 PRINT
  • 120 PRINT "THE PLANET'S POPULATION IS"INT(P)
  • 130 GOSUB 880

In The Duke of Dragonfear, the PRINT statements following the CLS are removed; the RND statement is changed to RND(0), and the menu of options is shifted around a bit without doing much. In Mistress of Xenophobia, the three PRINT statements under the CLS are removed, leaving the line empty, and the opening instructions are reformatted to fit on the Color Computer’s screen—otherwise, it would have wrapped in an ugly manner toward the end of the first line. And various repeated empty PRINT statements are collapsed into a single empty PRINT.

One of the more interesting features of this program is the --compress option. This attempts to remove all unnecessary code, lines, and spaces, as well as abbreviate text. Hartnell’s “Chateau Gaillard” from Creating Adventure Games on Your Computer is 29,520 bytes ASCII, and 24,932 bytes tokenized. This, combined with the strings it creates, is too big to run in 32k on the Color Computer.[1] Compressed, however, it drops to 23,489 bytes ASCII and 19,626 bytes tokenized. This is a more than 20% reduction in memory requirements. That’s just barely enough to let it run.

The compress option has three levels. All of the code changes happen at level 1.

  • Remove REM text. If the REM statement is at the end of a line of code, it removes the REM statement as well. If the REM statement is on its own line, it only removes the statement if no THEN, GOTO, or GOSUB references that line.[2]
  • Combine DATA lines, up to 255 characters per line. Also, remove zeroes from numeric DATA lines. An empty data element gets read as zero, so this saves a byte.
  • Combine multiple lines into one unless it detects that (a) this would cause code to never be executed (such as following an IF line) or (b) the line is referenced elsewhere.
  • Remove spaces from around statements and functions. In most cases, spaces are unnecessary. Spaces are necessary if a numeric variable that ends in a letter is followed by a statement; this should only be the case with THEN, as in IF A=I THEN GOTO 500. This can be compressed to IFA=I THENGOTO500, but that one space is necessary to keep it from looking like the variable ITHEN to the BASIC interpreter.
  • Convert THEN GOTO and THEN GOSUB to just GOTO and GOSUB.
  • At level two, it will abbreviate text, by combining NOT and HAVE into contractions, removing multiple spaces, getting rid of spaces around dashes, converting ellipses into two periods, and changing AND to the ampersand.
  • At level three, it gets a bit more creative with contracting and abbreviating text. Any word followed by ARE is contracted to the word plus ’RE. The same is done with words followed by HAS or IS. Some longer words are converted to shorter words with similar colloquial meanings; currently, that’s limited to ALTHOUGH being changed to THOUGH and CONGRATULATIONS to CONGRATS. This is mainly because I have so far only used this feature with Chateau Gaillard. It’s the final and longest program in Creating Adventure Games.

Here’s how the various levels help compress CHATEAU.BAS:

Level   Bytes    Tokenized   Reduction
0       29,520   24,932
1       23,867   19,976      19%
2       23,602   19,735      20%
3       23,489   19,626      21%
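
As a quick check of the arithmetic, the Reduction column appears to be the drop in tokenized size relative to level 0, truncated to a whole percent. That reading is an inference from the numbers, not something stated in the article. The ASCII sizes give 19%, 20%, and 20% by the same calculation, so they do not match the level 3 figure. A small Python sketch:

```python
# Tokenized sizes in bytes for CHATEAU.BAS at each compression level
TOKENIZED = {0: 24932, 1: 19976, 2: 19735, 3: 19626}

def reduction_percent(original, compressed):
    """Whole-percent reduction, truncated rather than rounded."""
    return (original - compressed) * 100 // original

for level in (1, 2, 3):
    print(level, "%d%%" % reduction_percent(TOKENIZED[0], TOKENIZED[level]))
```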

It may not seem like much of a difference between level 1 and 3, but it was enough to get Chateau Gaillard to successfully run. By default, the script uses compression level 1 when you specify compression on the command line. If you don’t specify compression, it performs none.[3]

With what I learned from implementing compression here, I’ll probably add a similar option to superBASIC when it becomes necessary. Or more likely, write a separate script that superBASIC’s output can be piped to or that can be given any text file as an option.

Remember that you can also gain 4,608 bytes by typing PCLEAR 1 before loading a program.[4]
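
The 4,608 figure is consistent with Extended Color BASIC reserving four graphics pages of 1,536 bytes each by default, so that PCLEAR 1 releases three of them. The page size and default page count below are stated from memory of the CoCo's behavior, not taken from the article, so treat them as assumptions. A short Python check of the arithmetic:

```python
PAGE_BYTES = 1536      # assumed size of one graphics page
DEFAULT_PAGES = 4      # assumed number of pages reserved at power-up

def bytes_gained(pages_kept):
    """Memory released by PCLEAR pages_kept, relative to the default."""
    return (DEFAULT_PAGES - pages_kept) * PAGE_BYTES

print(bytes_gained(1))
print(bytes_gained(2))
```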

The original reason I wrote this script was to do some simple error checking after typing in Hartnell’s BASIC programs. It performs a lot of useful checks to catch common and computer-recognizable typos. If you want that feature and not the CoCo conversion, add --nococo to the command line.

The very first error check I added was checking that line numbers were in increments of 10. All of Hartnell’s code in this book uses increments of 10, exactly, throughout each program. If you want to disable that check, add the --uneven-lines switch. I often do this after I’ve typed in the code and started adding my own modifications, such as in The Bannochburn Legacy, where Hartnell makes the very strange decision to check for the starting letters of “fight” and “flee” but requires entering single letters for going north, south, etc.[5]

Other checks are:

  1. Is the line too long? This mostly occurs after forgetting to hit return at the end of a line, since PC BASIC has the same 255-character limitation as ECB.
  2. Has the same line number already occurred? Most likely a typo.
  3. Is there an odd number of quotes? While PC BASIC supports leaving the last quote off of a line, I have seen this so rarely in Tim Hartnell’s code that it looks like a typo, and I fix it.
  4. Are there mismatched parentheses? If the number of open and close parentheses doesn’t match, that’s likely a typo.
  5. Does an assignment follow a comma, semicolon, or quote rather than a colon? That’s almost certainly a typo.
  6. Does an assignment, not at the beginning of a line, not have a colon in front of it? Also almost certainly a typo.
  7. Does the line number referenced in a GOSUB, GOTO, or THEN exist? This is either a typo here, or in the referenced line.
  8. Does a statement following a quote not have a colon in front of it? For some reason, I often leave the colon out after a close quote and before the next statement.
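
Several of these checks are simple enough to sketch in Python. The function name `check_listing` and the warning wording are hypothetical. The sketch covers only the line-number increment, duplicate line, line length, odd quote, parenthesis, and missing-target checks, and it does not handle lists of targets such as ON X GOTO 100,200.

```python
import re

def check_listing(lines, step=10):
    """Return warnings for common typing errors in a BASIC listing."""
    warnings, seen, parsed, previous = [], set(), [], None
    for text in lines:
        match = re.match(r"\s*(\d+)\s*(.*)$", text)
        if not match:
            warnings.append("no line number: %s" % text)
            continue
        number, code = int(match.group(1)), match.group(2)
        if number in seen:
            warnings.append("%d: duplicate line number" % number)
        elif previous is not None and number - previous != step:
            warnings.append("%d: expected line %d" % (number, previous + step))
        seen.add(number)
        previous = number
        if len(text) > 255:
            warnings.append("%d: line too long" % number)
        if code.count('"') % 2:
            warnings.append("%d: odd number of quotes" % number)
        # Blank out quoted text so its contents are not checked as code.
        unquoted = re.sub(r'"[^"]*"?', '""', code)
        if unquoted.count("(") != unquoted.count(")"):
            warnings.append("%d: mismatched parentheses" % number)
        parsed.append((number, unquoted))
    # Targets can only be verified once every line number is known.
    for number, code in parsed:
        for target in re.findall(r"\b(?:GOTO|GOSUB|THEN)\s*(\d+)", code):
            if int(target) not in seen:
                warnings.append("%d: line %s does not exist" % (number, target))
    return warnings

for warning in check_listing(['10 PRINT "HELLO', '20 A=INT(RND*7', '40 GOTO 50']):
    print(warning)
```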

I find it easiest to test the code in Rob Hagemans’s PC-BASIC before converting it to Extended Color BASIC. After converting it to ECB, I test it in XRoar before transferring it to the Color Computer. That’s also where I get the screenshots.

COST is a Forbidden Word

Oh, crap. That is not OK at all.

This script highlights the benefits of just-in-time programming. If I’d followed the advice of, say, Paul Nagin in BASIC with Style, and attempted to first formulate “a complete understanding of the problem and a complete general plan of attack”, I would probably never have written this script. The problems would have seemed insurmountable. As it was, my initial reaction on discovering that PC BASIC has no reserved words[6], so that variable names that are legal there cause syntax errors in ECB, was to simply give up with a fatal error on detecting them. But since I had already solved the problem of separating quoted text from BASIC code, I didn’t have to worry about avoiding replacing, say, the word COST when it was in quoted text. There are almost certainly still killer issues to be dealt with; but in the meantime I’ve been able to have a lot of fun playing games I would otherwise have had to painstakingly convert by hand.
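
Detecting names like COST can be sketched in Python along these lines. The function name and the short `FUNCTIONS` list are illustrative assumptions, not the script's real table of ECB keywords, and the sketch expects upper-case code with quoted text already removed and keywords separated from variable names.

```python
import re

# A few ECB function names, for illustration only; the real list is longer.
FUNCTIONS = ("COS", "SIN", "TAN", "LEN", "VAL", "INT")

def bad_variable_names(statement):
    """List names that start with a function name but are longer than it."""
    bad = []
    for name in re.findall(r"[A-Z][A-Z0-9]*", statement):
        for function in FUNCTIONS:
            if name.startswith(function) and name != function and name not in bad:
                bad.append(name)
    return bad

print(bad_variable_names("INTEREST=COST*INT(B)"))
```

How to rename the offending variables safely is a separate problem that this sketch leaves alone.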

There’s no reason why this script couldn’t be used for other books that use PCBASIC, or similar BASIC variants, as their lingua franca. As long as there are no conflicting conversions, the conversions can be added to the statementConversions subroutine. The script sends every statement, without any quoted text, to that subroutine after breaking lines apart on colons.
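
A rough Python sketch of that pipeline follows. Only the name statementConversions comes from the script; the functions here are hypothetical stand-ins. As a simplification, the sketch passes each stretch of code between quoted strings to the conversion callback separately instead of passing the whole statement with its quoted text removed. It also ignores REM and DATA, where a colon is not a statement separator.

```python
import re

def split_statements(code):
    """Split a line's code (line number removed) on colons outside quotes."""
    statements, current, in_quote = [], "", False
    for char in code:
        if char == '"':
            in_quote = not in_quote
        if char == ":" and not in_quote:
            statements.append(current)
            current = ""
        else:
            current += char
    statements.append(current)
    return statements

def convert_line(code, statement_conversions):
    """Apply a conversion callback to the code, but never to quoted text."""
    converted = []
    for statement in split_statements(code):
        # With a capturing group, re.split keeps the quoted pieces at odd indexes.
        parts = re.split(r'("[^"]*"?)', statement)
        parts[::2] = [statement_conversions(part) for part in parts[::2]]
        converted.append("".join(parts))
    return ":".join(converted)

print(convert_line('PRINT "RND:";RND:Q=INT(RND*7)',
                   lambda code: code.replace("RND", "RND(0)")))
```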

In response to TRS-80 Color Computer Programming Tools: The TRS-80 Color Computer was a fascinating implementation of the 6809 computer chip, and was, from the Color Computer 1 through 3, possibly the longest-running of the old-school personal computers.

  1. I have a feeling it’s mainly too big to run in Disk Extended Color BASIC. Also, I suspect that if I’d known about PCLEAR when I typed this program in, I wouldn’t have needed some of the uglier text abbreviations.

  2. Interestingly, to me at least, while an apostrophe in place of REM takes up much less space in ASCII, it takes up one extra byte per use after tokenization. Tokenizing CHATEAU.BAS using decb -b -t created a 19,976-byte file with REMs and a 20,034-byte file with apostrophes. Similarly, PRINT MEM in XRoar returned 2,859 bytes free with REMs and only 2,801 bytes free with apostrophes.

    In both cases, that’s a difference of 58, which is exactly the number of lines that had a REM statement but couldn’t be deleted (at least not without updating all the references, which may be workable) due to being referenced elsewhere.

  3. At the end of the process, I also saved 115 bytes by using RENUM 1,20,1 to renumber the lines in increments of 1, starting at 1. My Color Computer has 22,973 bytes free after a NEW. After loading the fully-compressed “Chateau Gaillard” it had 3,345 bytes free. And after renumbering the lines, it had 3,460 bytes free.

    When working in really tight memory, the amount of memory cleared for string space also matters. CLEAR 100 before running the program, or not clearing any memory, resulted in an OM (Out of memory) error; CLEAR 50 seems to work, and clearing something less than 50 results in an OS (Out of string space) error. Amazing how little string space these programs used.

    I also discovered that putting a CLEAR in a subroutine clears the return stack as well, resulting in an RG (RETURN without GOSUB) error.

  4. For reasons I don’t understand, PCLEAR 1 fails with an FC error sometimes, if executed after loading, which means it can’t be put in as the first line of some programs. For the Chateau, PCLEAR 2 still saved 3,072 bytes, which was more than enough to run after compression level 2.

  5. I see what he’s doing there, sort of. He chose the words “fight” and “flee” for those actions, in order to make the semi-combat actions start with the same letter. But it has the effect of allowing for typing FIGHT and FLEE but not SOUTH or NORTH, or MAGIC or STRENGTH.

  6. Or at least fewer reserved words. Even in PC BASIC, you can’t use a function name for a variable name. The problem is that in ECB you can’t start a variable name with a function name; so in PC BASIC COS is illegal but COST is legal. In ECB, both are illegal.

    This has the benefit that no PC BASIC program should use a function name for a variable name—except that there are ECB reserved names that don’t exist in PC BASIC—in Hartnell’s Chateau Gaillard, he uses AS as a numeric variable. This is a reserved word in ECB (in Disk BASIC, to be precise). If you run into one of those, there’s not much you can do except change them by hand. Automatic replacement would run the risk of changing legitimate uses of the function.
