Limitations of 8-bit Turbo Pascal 3.0, case study

Background

In the previous post, I acknowledged that Pascal emerged in the 70s and that Turbo Pascal already existed in the 80s, which in theory should have improved programmers’ quality of life compared to BASIC (this is an important assumption).
But it’s also important to remember that every language evolves. Even if we praise Pascal for its contributions to programming in general, and acknowledge that today it is a verbose, type-strict, modern object-oriented programming language, it naturally hasn’t always been one.
In this post, you can expect some notes on: what feels weird when switching to Pascal today, and how porting a small program to an actual 8-bit OS went.

I am using Advent Of Code 2023, a programming challenge that’s not as hardcore as some can be (fairly small input, calculation time doesn’t matter), as a good battleground for checking specific versions of programming languages, as well as reminding myself why I’m glad computers have more memory today.

I started with a simple implementation that I could write on a Linux PC and compile with Free Pascal, while sticking to what I believed to be the fundamental Pascal features that should also compile well with an older version – so no classes, no libraries, simple program construction.

I implemented the first 3 days in a subset of BBC BASIC (although tested on a Mac), so the Pascal challenge fell on the Day 4 task (yes, well overdue). I’ve managed to get a simplistic solution (no clever tricks) for it that compiles, runs, and produces a correct result for both parts of the Day 4 assignment: Free Pascal version 04.PAS.

Notes on coding in Pascal in 2024

I’ll stick to the term “Pascal” here, meaning that no features specific to Free Pascal can be assumed.

Once you switch to it as your target language from some of the most mainstream ones (such as Golang, Python, TypeScript, C, C++, Java), you will notice a few painful setbacks.

There’s no full-featured support in your favorite editors. No reasonable plugins for JetBrains IDEs (as of January 2024), while for VS Code you have at best some popular but very much unofficial plugins¹.

There’s Lazarus, but I’m not a fan of the multi-window interface it copied from Delphi 7. It’s something that worked much better if one had one monitor, with 1600×1200 resolution, and a single virtual desktop.

There’s also fp, which is the command-line editor from Free Pascal Compiler, but it isn’t part of the fpc install on all platforms (it does not support Mac). On some platforms, it welcomes you with an ASCII-art logo :-), and the file editing is very reminiscent of what Turbo Pascal 7.0 brought.

FP (Free Pascal) editor intro ASCII art

Another difficulty is discovering how the basic libraries in Turbo are actually limited as well – the compiler wasn’t supposed to bloat our resulting binary with megabytes of code, obviously, because no regular computer had megabytes of memory. So if you want “big integers”, string splitting, or lists, you probably have to code it all yourself.

Last but not least, the error messages were notoriously bad compared to those of modern compilers. Most of them are along the unexpected ; theme, and you have to figure out why it would be unexpected at a given line.

But I did write the solution using fp, and compiled and ran the Mac/PC version with fpc, and was pretty proud of myself. I even added unit tests that execute before the main logic, to test that basic split or list operations work correctly!

Porting it to CP/M Turbo Pascal 3.0

Here’s where the bumpy ride starts. I believed if something compiles and runs on a fairly small dataset (the input file for the challenge is 22kB) in Turbo Pascal compatibility mode, it will run in Turbo Pascal too, right? Of course wrong.

Making it compile again

Here’s how we learn that a lot of what Free Pascal allowed me to do was not allowed by Turbo Pascal 3.0.

No “just string” type.

In the FPC version I’ve used string a lot. You would expect a trim function to take a string argument and return a string as well:

function trim(input: string): string;

This turned out to be invalid for not one, but two reasons. The first one is that string is not a valid type definition in Turbo Pascal. Let’s consult the manual:

String Type Definition

The definition of a string type must specify the maximum number of
characters it can contain, i.e. the maximum length of strings of that
type. The definition consists of the reserved word string followed by the maximum length enclosed in square brackets. The length is specified by an integer constant in the range 1 through 255. Notice that strings do not have a default length; the length must always be specified.

Example:

type
FileName = string[14];
ScreenLine = string[80];

String variables occupy the defined maximum length in memory plus one byte which contains the current length of the variable. The individual
characters within a string are indexed from 1 through the length of the string.

So we always have to specify the maximum length of the string upfront (so that the compiler can reserve exactly that many bytes), and it also cannot be greater than 255. After updating string declarations everywhere, you get:

function trim(input: string[200]): string[200];

And it works now, right? Wrong. We’ve just advanced to another half-friendly compilation error:

Compiling
117 lines

Error 36: Type identifier expected. Press <ESC>

Which, after pressing ESC takes us to the bad line. But I did specify the return type, didn’t I?! Well, no.

Notice that the type of the parameters in the parameter part must be
specified as a previously defined type identifier. Thus, the construct:

procedure Select(Model: array[1..500] of Integer);

is not allowed. Instead, the desired type should be defined in the type
definition of the block, and the type identifier should then be used in the
parameter declaration:

type
Range = array[1..500] of Integer;

procedure Select(Model: Range);

Let’s decide that 200 is more characters than we need for this program and that LongString is an appropriate name for our type. We end up with function trim(input: LongString): LongString;.

Yart! Yet another return type

The same problem affected our simpleSplit function, responsible for splitting a longer string into two substrings at the point where a delimiter is found (so we can break a|b into a and b). We cannot declare:

function simpleSplit(input: String; delimeter: char): array[0..1] of LongString;

We need to first define our own type TwoStrings for an array of exactly two strings. Okay, fine. And then another surprise:

Compiling
96 lines

Error 48: Invalid result type. Press <ESC>

Of course it won’t tell you what’s invalid about it, because such error message would be a waste of memory, and increase compiler size. Let’s consult the manual again:

The result type of a function must be a scalar type (i.e. Integer, Real, Boolean, Char, declared scalar or subrange), a string type, or a pointer type.

And since a pointer result type has to be a previously defined type identifier as well (writing ^TwoStrings inline in the function header is not accepted), we end up with type PTwoStrings = ^TwoStrings; as the return type for our split function.

This restriction makes sense, and feels “close to metal” too. The compiler doesn’t have the resources (time and memory) to figure out how and where to allocate memory for this array. A modern one would reserve space for the return value and pass a pointer to it under the hood – here, we have to do it ourselves. At this point, I question my statement that C felt much closer to metal than Pascal.

Not the same built-in string procedures

Our simpleSplit still doesn’t compile. In the Free Pascal Compiler version we had:

  simpleSplit[0] := copy(input, 0, splitAt-1);
  simpleSplit[1] := copy(input, splitAt+1);

First thing to change is that the result needs to be a pointer, so it would be simpleSplit^[0] on the left side of the assignment, but that wouldn’t parse (the expectation is to assign to the function name, not to a dereferenced name), so we have a local var retVal: PTwoStrings, and we can do retVal^[0] := ... ; retVal^[1] := ... ; simpleSplit := retVal;.

But we still get a “, expected” error on the second line. It turns out that in Free Pascal the built-in (or, as the Turbo Pascal manual calls it, “Standard procedure”) Copy accepts two arguments if we want to copy from position i to the end of the string, but Turbo’s doesn’t. Luckily, if you specify the end position too far, it will just copy until the end of the string, which is what we wanted, so we make it a copy(input, splitAt+1, 255);.

Spoiler alert: I wouldn’t find out about it until much later, but there’s still one more bug. Strings in Pascal are 1-indexed, meaning that the first character is actually str[1] and not str[0] like in most modern languages (except maybe MATLAB). Leaving the 0 as the start index would produce a runtime error.
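To pull the last few fixes together, here is a minimal sketch of what simpleSplit could look like with the named types, the pointer result, the three-argument Copy and the 1-based start index. It is not the original code: finding the delimiter with Pos and returning an empty second half when there is no delimiter are my assumptions, and the caller is assumed to Dispose the result.

```pascal
program SplitSketch;

type
  LongString  = string[200];
  TwoStrings  = array[0..1] of LongString;
  PTwoStrings = ^TwoStrings;          { function results must be named types }

function simpleSplit(input: LongString; delimeter: char): PTwoStrings;
var
  retVal: PTwoStrings;
  splitAt: integer;
begin
  new(retVal);                        { we allocate the result ourselves }
  splitAt := pos(delimeter, input);   { assumption: Pos locates the delimiter }
  if splitAt = 0 then
  begin
    retVal^[0] := input;              { assumption: no delimiter, keep it all }
    retVal^[1] := '';
  end
  else
  begin
    retVal^[0] := copy(input, 1, splitAt - 1);    { strings start at index 1 }
    retVal^[1] := copy(input, splitAt + 1, 255);  { 255 = "to the end" }
  end;
  simpleSplit := retVal;              { assign to the function name last }
end;

var
  parts: PTwoStrings;
begin
  parts := simpleSplit('a|b', '|');
  writeln(parts^[0]);
  writeln(parts^[1]);
  dispose(parts);                     { the caller owns the returned array }
end.
```

Run on a|b, this prints a and b on separate lines.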

Break? There is no time for a break. What? “Break out of a loop?” Breakout is a game.

Our trim implementation for Free Pascal seeks a non-whitespace (specifically non-tab and non-space, to simplify) character from the beginning of the string, and once it’s found, it breaks out from the loop:

  pFrom := 0;
  pTo := Length(input);
  For i := 1 To pTo Do
    Begin
      c := input[i];
      If (ord(c)<>32) And (ord(c) <> 9) Then
        Begin
          pFrom := i;
          break;
        End;
    End;

Turbo Pascal 3.0 yells at us on the line with the break statement, mentioning an invalid identifier. But it’s not a variable/function identifier, it’s a keyword! No mention of break keyword in the manual, though.

We bring out the “popular to hate” tools: labels and goto. Since Pascal is a single-pass compiled language, the compiler has to know that a label we can jump to exists in the code before it’s referenced. We therefore add a label fromFound, toFound; statement just before our function’s begin, so that it knows these identifiers may appear as labels later.
We replace the first break with goto fromFound and the break in the second loop with goto toFound.

Notice the generosity:

Label Declaration Part
Any statement in a program may be prefixed with a label, enabling direct branching to that statement by a goto statement. A label consists of a label name followed by a colon. Before use, the label must be declared in a label declaration part. The reserved word label heads this part, and it is followed by a list of label identifiers separated by commas and terminated by a semi-colon.

Example:
label 10, error, 999, Quit;

Whereas standard Pascal limits labels to numbers of no more than 4 digits, TURBO Pascal allows both numbers and identifiers to be used as labels.

The last paragraph reminds us that standard Pascal defined labels as simple as the first BASIC line numbers or Fortran statement numbers. In Turbo, we can at least use words.
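A minimal sketch of how trim might look once both break statements are replaced with goto. The structure of the second (backward) loop and the handling of an all-whitespace string are assumptions, since only the first loop is shown above. With Free Pascal this needs a mode where goto is enabled, such as the Turbo Pascal compatibility mode (fpc -Mtp).

```pascal
program TrimGoto;

type
  LongString = string[200];

function trim(input: LongString): LongString;
label
  fromFound, toFound;                 { declared before the function's begin }
var
  i, pFrom, pTo: integer;
  c: char;
begin
  pFrom := 0;
  pTo := length(input);
  { forward scan: first character that is neither a space nor a tab }
  for i := 1 to pTo do
  begin
    c := input[i];
    if (ord(c) <> 32) and (ord(c) <> 9) then
    begin
      pFrom := i;
      goto fromFound;                 { stands in for break }
    end;
  end;
fromFound:
  { backward scan (assumed shape): last character that is not whitespace }
  if pFrom > 0 then
    for i := pTo downto pFrom do
    begin
      c := input[i];
      if (ord(c) <> 32) and (ord(c) <> 9) then
      begin
        pTo := i;
        goto toFound;
      end;
    end;
toFound:
  if pFrom = 0 then
    trim := ''                        { empty or whitespace-only input }
  else
    trim := copy(input, pFrom, pTo - pFrom + 1);
end;

begin
  writeln('[', trim('  hello  '), ']');
end.
```

For the input above this prints [hello].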

The ‘new’ keyword

While Turbo Pascal had pointers, the seemingly natural way to allocate memory for a new instance of whatever the pointer points to, value := new(Type);, which worked in Free Pascal and looks familiar from any language with object instantiation, does not work in Turbo Pascal 3.00, or in standard Pascal at all. It’s syntax borrowed from other languages that happens to be supported by FPC.

The Pascal way to call new is different. It should be new(pointerVar); – no assignment needed.
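A small sketch of the procedure-style call, using a list node similar to the NumberList type that appears later in this post. The field name value is hypothetical; only next is confirmed by the code shown further down.

```pascal
program NewSketch;

type
  PNumberList = ^NumberList;
  NumberList = record
    value: integer;                   { hypothetical field name }
    next: PNumberList;
  end;

var
  node: PNumberList;
begin
  new(node);                          { allocates a record, stores its address in node }
  node^.value := 42;
  node^.next := nil;
  writeln(node^.value);
  dispose(node);                      { hand the memory back when done }
end.
```

New fills in the pointer variable passed to it, so there is nothing to assign.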

Making it work

At some point, it finally compiles again! The first thing I saw when it ran was, of course, FAIL – failing a test, and exiting.

At this point, I was really thankful to myself for writing unit tests in the first place. Otherwise, debugging the entire app result would be a much harder task. Which I ended up doing anyway, for the same reason – not all functions were covered by tests. It got me through basic splits and list operations, though.

Wasting memory

When testing the first solution on a PC, the amount of memory was obviously never a concern. Because of this, plus the bad habits I picked up during the last 11+ years of programming in languages that do garbage collection automatically, I ended up in a place where, after Card 35 (of over 130 to process), I got a big, nice runtime error FF:

FF Heap/stack collision.
A call was made to the standard procedure New or to a recursive subprogram, and there is insufficient free memory between the heap pointer (HeapPtr) and the recursion stack pointer (RecurPtr).

At this point, I added a lot of Dispose statements to free allocated pointers, including the recursive disposeList call (fun caveat here: the manual specifies that on the 8080 CPU recursion requires a special compilation flag, so remember about that if you’re targeting CP/M-80):

Procedure disposeList(list: PNumberList);
Begin
  If list^.next <> nil Then Begin
    disposeList(list^.next);
  End;
  Dispose(list);
End;

We’ll get back to this function later: it was disabled soon after being added because it was freezing the program.
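For comparison, here is a sketch of a non-recursive variant. It is not the code used in the solution: the loop, the nil check for an empty list, the push helper and the value field are my assumptions. It sidesteps the recursion flag on CP/M-80 and does not dereference a nil pointer when the list is empty.

```pascal
program DisposeSketch;

type
  PNumberList = ^NumberList;
  NumberList = record
    value: integer;                   { hypothetical field name }
    next: PNumberList;
  end;

{ hypothetical helper: prepend a number and return the new head }
function push(head: PNumberList; v: integer): PNumberList;
var
  node: PNumberList;
begin
  new(node);
  node^.value := v;
  node^.next := head;
  push := node;
end;

procedure disposeList(list: PNumberList);
var
  following: PNumberList;
begin
  while list <> nil do                { an empty list is simply skipped }
  begin
    following := list^.next;          { remember the rest before freeing }
    dispose(list);
    list := following;
  end;
end;

var
  head: PNumberList;
  i: integer;
begin
  head := nil;
  for i := 1 to 35 do
    head := push(head, i);
  disposeList(head);                  { frees every node, the head included }
  head := nil;
  writeln('done');
end.
```

Note that this sketch frees the head node too, so the caller must not call Dispose on the same pointer again afterwards.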

Memory leak + compiler bugs?

Even with the above problem conveniently ignored, I still had a memory leak. I’ve added printing MemAvail on every card, and it clearly decreased by 140 bytes on each iteration. This is exactly how much memory we’d need to store 35 numbers (10 winning and 25 on the card) in our NumberList type: two bytes for each number and two for the pointer to the next number, so 35 × 4 = 140.

In the solution at this time, I intended to release memory after each card was processed:

DisposeList(c^.winning); { first, remove both list tails }
DisposeList(c^.have);
Dispose(c^.winning);     { then pointers to both lists }
Dispose(c^.have);
Dispose(c);              { then card itself }

with DisposeList being unchanged.

Somehow, after fixing a bug where the original disposeList didn’t free any memory, I got stuck with the program freezing completely on the new(retVal) statement, as well as on any attempt to call MemAvail after that Dispose was called (code on the left, result on the right below).

This still worked well when compiled with FPC on macOS, but froze when compiled with Turbo Pascal 3.00 or 3.01 on either real CP/M or CP/M For Mac OS² (my new best friend when I wanted to test rapidly). For now, that was enough to conclude that it’s a bug in the compiler-generated code itself; otherwise I would expect either a compile-time or a runtime error on one of the Dispose, MemAvail or New calls.

From here, our path forks into two pathways:

Digging into how exactly memory is managed with new and dispose, and debugging the issue further. Since the average programmer had to deal with the risk of running out of memory much more often than today, this mechanism is actually pretty well described in the manual as well, so we can at least know what it intends to do.
Switching to Mark and Release – Turbo Pascal offered one more, extremely simplified memory management method. Instead of tracking each allocated pointer individually, this pair allows us to just Mark, remembering where the heap pointer was at some point, and later on just Release it, resetting the pointer to the former value and discarding everything that was allocated since we called Mark.

At the point where built-in memory management was failing me hard (freezing the whole program), I publicly admitted to not doing so well, and got an extra pair of eyes on my problem. Interestingly enough, for the user @psychotimmy, even my version worked correctly when compiled for the RC2014³!

Solving all the obstacles above did get me to where the entire program works!

The switch to Mark/Release did the trick, although it does seem a little hacky. Checking why the regular memory management at the single-pointer level fails here could be material for another article. Luckily, the algorithm of memory allocation and counting free blocks is very well described in the Turbo Pascal manual as well – because the programmer needed to be aware of every byte of memory as a limited resource.
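A sketch of the per-card Mark/Release pattern, written against Turbo Pascal 3.0 and not checked against Free Pascal (MemAvail in particular is not available there). The loop bounds, the value field and the idea of wrapping each card in one Mark/Release pair are assumptions about how the final program is organized.

```pascal
program MarkReleaseSketch;

type
  PNumberList = ^NumberList;
  NumberList = record
    value: integer;                   { hypothetical field name }
    next: PNumberList;
  end;

var
  heapTop: ^integer;                  { any pointer variable can hold the mark }
  head, node: PNumberList;
  card, i: integer;
begin
  for card := 1 to 3 do
  begin
    mark(heapTop);                    { remember the heap pointer before this card }
    head := nil;
    for i := 1 to 35 do               { 10 winning + 25 own numbers }
    begin
      new(node);
      node^.value := i;
      node^.next := head;
      head := node;
    end;
    { ... process the card here ... }
    release(heapTop);                 { discard everything allocated since mark }
    head := nil;                      { the old nodes are gone, drop the pointer }
    writeln('card ', card, ', free: ', memavail);
  end;
end.
```

If the pattern works as intended, the reported free memory should stay the same from card to card. As far as I read the manual, Dispose and Mark/Release are two separate approaches to heap management that should not be mixed in one program, so the Dispose calls have to go once this is in place.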

Integer size

A small note worth mentioning here, as it almost stood in the way of solving the task in question: 8-bit Turbo Pascal (and, for that matter, any TP 3.0) only knows two integer sizes: byte (8-bit) and integer (16-bit). On one hand this makes sense, as CPUs such as the Z80 don’t natively do math on anything wider, but it also limits all built-in integer computation to 32767 as the maximum number.

That’s it. No “Big int”, no “Long”, and the only other numeric type left is a 6-byte Real (popularly known as float in other languages).

In the Advent of Code context, doing the math on a Real type turned out to be good enough, even though the result must be an integer in the order of magnitude of 10 million. However, it’s important to remember that it is never a good idea to do precise math like this, or like payroll, on float types. The errors accumulate more than you’d think, even in modern computing.
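A minimal sketch of counting past the 16-bit limit with a Real accumulator. The numbers are made up for illustration and are not the puzzle data. This stays exact only while the total fits in the mantissa, which for the 6-byte Real is documented as roughly 11 significant digits.

```pascal
program RealCount;

var
  total: real;
  i: integer;
begin
  total := 0;
  for i := 1 to 400 do
    total := total + 30000.0;         { 400 * 30000 = 12000000, far above 32767 }
  writeln(total:0:0);                 { fixed notation, no decimal places }
end.
```

The :0:0 format asks for fixed notation with no decimals, so the total reads like an integer rather than in scientific notation.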

Editing

Maybe not surprising, but something you may forget: when you want to use the actual editor from the tool working on the target platform, navigating even a fairly simple couple of hundred lines of code can be challenging.

Back at the time, the gold standard for text editing was WordStar. If it had been ed, the arrow-keys-impaired vim users of today would rejoice⁴. But it was WordStar. You know, back then things were still very much in motion when it came to text editing: there were a number of standards on how to handle keyboard input, what keyboard shortcuts work best, and even what keyboard layout you may want.

MIT “space-cadet” keyboard, a pre-ISO/IEC 9995 keyboard.

WordStar used a fairly reasonable set of keyboard shortcuts to navigate around the file, with most of the left-hand side of the keyboard controlling the cursor movement, for example (^ stands for Ctrl+):

Up, down, left, right by 1 character/line are, respectively: ^E, ^X, ^S, ^D
Scrolling down and up by one line: ^W, ^Z
Scrolling up and down by a screen: ^R, ^C
Left/right by one word: ^A, ^F.

However, if you’ve trained your muscle memory to do ctrl/alt+left/right, page up, page down (this is not a pair of keys that has always been there), it may need some more getting used to. Since the program was short, I haven’t really gotten past these few, plus the combination for find, which is… Ctrl+Q, F, which then asked for options. I never checked, but guessed correctly that “b” would be backwards, and that’s all I needed.

Cross-compilation definitely sounds more tempting now.

Recap

While Pascal has made significant contributions to programming, and has been one of the most feature-rich languages of the time, using it today once we’re accustomed to more modern languages, and porting to older versions of the language/compiler presents challenges. The limitations in libraries, lack of full-featured support in popular editors, and the need to adapt code for compatibility with older versions of Pascal highlight the complexities of working with this language in the present day. Despite these challenges, the experience of navigating these obstacles can provide valuable insights into the evolution of programming languages and the advancements that have shaped the field of software development.

Footnotes

¹ The most popular one is vscode-language-pascal (VS Code Pascal plugin), while OmniPascal seems to be a little more feature-rich and has fewer dependencies.
² https://github.com/TomHarte/CP-M-for-OS-X
³ https://rc2014.co.uk/ – RC2014 homebrew Z80 computer
⁴ vi, and therefore also its child vim, seem to be heavily inspired by ed; see https://lunduke.locals.com/post/4400197/the-true-history-of-vi-and-vim and https://francopasut.netlify.app/post/golden_line/
