Build pipeline

I have my pipeline in a place where I can start using it. Not from the JetBrains tools yet, like I wanted, but at least in a place where I can get some “real” work done.

Again, I want to call out that this is not only work I’ve done. I’ve borrowed heavily from others to get my personal build pipeline in place.  The old saying of “I can see further, for I stand on the shoulders of giants” applies, but I prefer “I can dig deeper, for I stand in the trenches of hackers.”  Seems more fitting.

Here is a quick demo of the build pipeline:

A few things to note with what I put together (which you will see further down in another example).  Specifying EXE= with make will do a few things by default.  For example, if you specify EXE=foo with the target of “test”, it will:

  1. Compile foo.c into foo.o
  2. Link foo.o into foo
  3. Build a disk image called foo.po with ProDOS, BASIC.SYSTEM and foo.
  4. Start a test machine under Virtual ][ and insert the foo.po disk image
  5. Once it is booted it will BRUN foo

As this will not fit all cases, I added a few extras.  You can specify SRCS= to be a list of files (including the base one); it will compile all of those files and link them into the EXE value.  For example:

# make EXE=foo SRCS="foo.c bar.c fun.c"

Will compile foo.c, bar.c and fun.c into .o files and then link them into foo.  Keep in mind you do need to include foo.c; you don’t get that for free.

As far as targets, there are a couple:

  1. No target or “all” will compile and link
  2. Target of “image” will do #1 plus build the disk image
  3. Target of “test” will do #2 then boot the disk image and BRUN the executable.

I’ll put this all up on GitHub so anyone can use it if they want.

Testing some “real” code

Next, I wanted to test out some of the code I’ve been working on for my game.  Just some preliminary code to do some performance testing vs memory utilization.

I had two ideas for how I could represent a column of blocks for use in the game.  Given the following conditions:

  • A level needs two boards: the initial layout of the board and the solution.
  • A board can have eight columns
  • A column can hold up to eight blocks
  • I wanted to be able to have at least three types of blocks

I came up with the following implementations:

  1. 2 bits per block = 2 bytes per column = 16 bytes per board = 32 bytes per level
  2. 1 byte per block = 8 bytes per column = 64 bytes per board = 128 bytes per level

128 bytes is not that much, but I wanted to consider both options.  The code for option #1 is much trickier than option #2 due to bit masking and shifting, but the space needed for #2 is four times the size.  So, the unanswered question is: which one is faster?  I predicted that #2 would be faster due to all the funky bit math #1 has to do.
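To make the tradeoff concrete, here is a sketch of the two column layouts.  The names and exact layout are my own illustration, not the actual game code: option #1 packs four blocks per byte with shifts and masks, while option #2 is a plain byte array.

```c
/* Sketch of the two column layouts (names are mine, not the game's).
 * A column is 8 slots; a block type fits in 2 bits (empty + 3 types). */
typedef unsigned char u8;

/* Option 1: 2 bits per block -> 2 bytes per column. */
u8 col_get2(const u8 col[2], u8 slot) {
    /* 4 blocks per byte; each block is 2 bits wide */
    return (u8)((col[slot >> 2] >> ((slot & 3) << 1)) & 3);
}

void col_set2(u8 col[2], u8 slot, u8 type) {
    u8 shift = (u8)((slot & 3) << 1);
    /* clear the 2-bit field, then OR in the new type */
    col[slot >> 2] = (u8)((col[slot >> 2] & ~(3 << shift)) | (type << shift));
}

/* Option 2: 1 byte per block -> 8 bytes per column. */
u8 col_get1(const u8 col[8], u8 slot)       { return col[slot]; }
void col_set1(u8 col[8], u8 slot, u8 type)  { col[slot] = type; }
```

Option #1 pays a shift and a mask on every single access, which is exactly the “funky bit math” in question.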

To test this, I coded up both implementations then ran them through their paces.  The code starts with a full column, takes all eight of the blocks out, then puts eight blocks back in.  It then does this 1000 times in a loop.
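The exercise itself is easy to sketch.  This is an illustrative harness with hypothetical push/pop helpers over the byte-per-block layout, not the actual benchmark code:

```c
#define COL_MAX 8

/* Toy version of the benchmark described above: empty a full column,
 * refill it, 1000 times (byte-per-block layout; names are mine). */
typedef struct {
    unsigned char blocks[COL_MAX];
    unsigned char count;
} Column;

unsigned char col_pop(Column *c)              { return c->blocks[--c->count]; }
void col_push(Column *c, unsigned char type)  { c->blocks[c->count++] = type; }

void exercise(Column *c) {
    int iter, i;
    for (iter = 0; iter < 1000; iter++) {
        for (i = 0; i < COL_MAX; i++) (void)col_pop(c);  /* take all 8 out  */
        for (i = 0; i < COL_MAX; i++)                    /* put 8 back in   */
            col_push(c, (unsigned char)(i % 3 + 1));
    }
}
```

Bracket a call like `exercise(&col)` with the timer library and you get the numbers being compared.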

For the timing, I took some code from Bill Buckels that I found in a post on CSA2P and converted it to a “library” with which I can start a timer, get elapsed time and even do a “lap” timer.  It does require a No-Slot clock, but luckily Virtual ][ can emulate one!  I’ll get that “library” up on GitHub, as well.

Here is a quick video showing the pipeline in action with multiple source files and the performance testing of the two implementations:

The winner in performance is the “string” version (the 1 byte per block option) with 1865 vs 2853 for the “bit” version.  FYI, the timer counts “centiseconds”, so that’s about 18.65s vs 28.53s, or almost 10 seconds faster for the “string” version.

Using the “string” version has some other side benefits.  The code is cleaner and much more straightforward, and it allows me to expand the game to have a larger set of blocks, if I want.

Now to move on to implementing the boards, levels and the engine to hook them all together!

Tips welcome and strongly encouraged.

Based on a tip from Quinn Dunki, I also wanted to explore Cadius as an alternative to AppleCommander.

Cadius is BrutalDeluxe’s ProDOS disk imaging utility, so I thought I’d get it installed and take it through its paces.

Manual install?  Umm, no thank you.

As handy as Homebrew is, not everything is in there.  As mentioned in the previous post, there is a set of formulae for Apple II utilities available here, which has a lot of great things in it.  One of the things NOT in there is Cadius.

I could have just installed Cadius using the instructions found on Cadius’ GitHub page, but what fun is there in that?  Time to learn to make a Homebrew formula and make the (Apple II) world a little bit better.

I created the Homebrew formula for it, which was very simple using the instructions found in the Formula Cookbook.  I have a pull request open against the Homebrew Apple II repo to get it officially included.  If you want to use it before then, you can grab it from here.  Once you have it, you can install Cadius with:

Image in ProDOS’ self

Now that we have Cadius installed we need to create a disk image, that’s pretty easy:

Despite it being a valid ProDOS volume, it won’t boot like this. Similar to DOS 3.3 disks, we need to do some extra things to make it bootable.  Namely, we need to put BASIC.SYSTEM and PRODOS on it from another bootable ProDOS disk.

I was able to find the ProDOS System Master on Call-A.P.P.L.E.’s site here.  But, I realized it’s a .DSK in DOS order, which Cadius will not read.  Fear not!  A quick search pointed me to a conversion script by Paul Hagstrom.  I downloaded that, fixed the shebang line to use python and not python3, and converted the disk image (Thanks Paul!):

Now, I want as much free space as possible on my default ProDOS disk, so I just want the BASICs (see what I did there?) needed to get the disk bootable.  Let’s copy the needed files over to the new image that was just built:

And now for the boot:

So far, so good.

Back to the past

In the last post I added “helloworld” to a DOS 3.3 disk image with AppleCommander, booted it and ran the executable.  Let’s try that again, but using Cadius to put it on the new disk image and try the same thing.  This will prove that Cadius can be used in place of AppleCommander in my build pipeline.

And the pudding:



Cadius seems like a great alternative to AppleCommander IF you only want to work with ProDOS disks, and it was much more straightforward (it detected the AppleSingle format that we needed).

I think I’ll use it in my build pipeline going forward.  I guess I should stop playing around and should probably write some code for my game as well!


The first thing I wanted to get set up was at least the basics of my build pipeline.  This is based on a blog post by Quinn Dunki; things have changed a bit since then, so I wanted to verify the steps.

To get what we need, we will need to install the following:

  1. cc65 – To compile the code
  2. AppleCommander – To put the compiled executable onto a floppy disk image
  3. Virtual ][ – To boot and test the image

Installing cc65

Installing cc65 on the Mac is pretty easy as there is a bottle under Homebrew to install it directly.  If you’ve never used Homebrew, it’s a great way to install extra software, similar to MacPorts or Fink. Homebrew is easy to install; just follow the directions on their site.

Then to install cc65, simply do:

brew install cc65

That’s it.  Pretty simple.  But, let’s try a simple test compile:

So far, so good.

Installing AppleCommander

For the build pipeline, we also need AppleCommander.  Luckily, we can also install this with Homebrew by using this Apple II homebrew repository.  First we need to “tap” the repository, then we can install AppleCommander from there.

But, let’s make sure this part is working as well.  Let’s put the “helloworld” executable on a DOS 3.3 bootable disk and try it out.

Again, looks good so far.  Next!


  • The image created by -dos140 will not be bootable and, as you will see below, we will boot the DOS 3.3 System Master disk then run our program off the other disk.  In the final version, I’ll INIT the test disk beforehand and reuse it as needed to boot
  • When putting the executable on the disk, you’ll need to use the -as (AppleSingle) flag with AppleCommander.  This replaces the -cc65 flag mentioned in Quinn’s post.

Installing Virtual ][

So far, we’ve been able to ride the build pipeline for free, but here is where we need to get off that train and pay the piper (See what I did there?).  To me, Virtual ][ is the go-to emulator for the Mac and is worth every penny it costs.  It’s 44 USD for the full license.  Like I said, worth every penny. You can get Virtual ][ here.

Once you install it, you’ll need to get the correct ROM for the machine you want to run.  I run an Apple //e as my physical machine, so I also like to run a //e as my virtual testing environment.  You can find the ROM you need without a lot of digging, so I’ll leave that as an exercise for the reader.

The important part about Virtual ][, which I’m not sure other emulators offer, is that it has AppleScript support, so it can be controlled from scripts.  This is important to the build pipeline, so it can load and boot the disk image as part of the build.

To verify Virtual ][, let’s boot the “Apple_DOS_3.3_Master.dsk” in drive 1 and the image we created above in drive 2.

Everything looks good.

Notes/updates compared to Quinn’s post (i.e. TL;DR)

  • cc65 can be installed via Homebrew.
  • AppleCommander can be installed via Homebrew after adding the Apple II homebrew repository.
    • The AppleCommander command-line executable is “applecommander”, not “ac”
    • The “-cc65” flag has been removed and you need to use “-as” or “-dos” as appropriate. In our case, it’s “-as”


Next, I’ll be looking to generate a CMakefile (which CLion uses) to do similar work and link these together.

In order to help motivate me to get some retro-computing work done, I signed up for RetroChallenge 2018/04.  If you don’t know what RetroChallenge is, well it’s an informal contest (that’s not really a contest) for doing retro-computing related projects.  Basically, it’s a way to help incentivize retro-computing enthusiasts to get off their butts and do something cool.

For my project, I have a main goal and some sub goals.  Mainly so I can feel like I accomplished something even if I don’t completely finish it in April.

Here is what I’m planning on doing for this challenge.   I’m planning on writing a game for the Apple ][ that is similar to an iPad game called CargoBot (iTunes link), a box-sorting game in which you program a crane to sort the boxes in as few commands as possible.

Here are my goals as part of this project.  Mostly in order, but who knows. Feel free to keep score, if you like.

  1. Develop a build pipeline for use with the JetBrains tools (IntelliJ/CLion) similar to, and where I will borrow from, work referenced by Quinn Dunki in this blog post.  As she mentions in the post, she is standing on the shoulders of (bald) giants, so I guess I will be standing on (the shoulders of giants)².  And, yes, I’m (mostly) bald, as well, so there is that.
  2. Build the core engine while comparing the tradeoffs for memory usage and performance by trying compact/verbose level formats.  Because, the geek in me wants to make it as small as possible but the gamer in me wants to actually play it.
  3. Make the rendering of the game modular so I can render it in any of the following modes:
    1. Text – Easiest way to vet out the engine.
    2. LoRes – Because I can.
    3. HiRes – Because I should.
    4. Double-HiRes – Because I shouldn’t, but I’m gonna anyways.

Some things you might see along the way, so don’t be frightened:

  1. Banging my head on my desk in frustration, because impact-maintenance is a real thing.
  2. Me pulling my hair out, which is a challenge in-and-of itself (see reference to being (mostly) bald above).
  3. Goofy graphics issues.  I’ve played with some HiRes stuff earlier and it’s “interesting” (Minnesota slang for “sucky”).  If you want to point and laugh early, you can look back and see the struggles I’m probably going to have to go through again.
  4. Swearing.  Well, only if you are nearby when this is all happening.  I’ll keep the blog family-friendly but I’ll make no such promises for real-life.

I encourage you to follow along.  We can laugh together (or you can laugh at me), we can cry together (as there may be tears) and hopefully we can play together (well, not together, but you know what I mean).

Let the challenge begin! (Well, tomorrow)

I mentioned in an earlier post that I would post about getting timing routines into my PLASMA test code as well as the C code. Needless to say, it took way more banging my head and a late night to get it in and working. But, that was all me. I failed to RTFM and then tried to figure out why it wasn’t working.

But, let’s back up a bit to give credit where credit is due. I stole/borrowed/adapted the clock routines I’m using in my code (both for PLASMA and C). I finally found a post on comp.sys.apple2.programmer from Bill Buckels that had the code in raw opcode format, which was then memcpy()’d into the right location in memory (in this case $0260) and then accessed via a JSR to the right spot from inline assembly.  Brilliant!

But, with my lack of understanding of how PLASMA is laid out, I figured I had better do something a little more portable.  I tried several things to convert it to inline assembly on my own.  I tried taking the assembly spit out by the monitor and converting the raw memory locations to logical offsets, which involved using Virtual ][, printing the ML from the monitor to the virtual printer, saving as PDF and copy/pasting from there.  That was a nightmare, as the output in the PDF is not sequenced how I would have expected:

(Screenshot: monitor listing of the clock routine)

Thanks to a tip from David Schmidt, I took a look at the code in ADTPro for the clock routines.  That’s all in assembly with logical offsets!  Woohoo!  I converted it into the assembly style that PLASMA wants and gave it a shot.  No go.  Time to figure out why.

At this point, I wish I had taken some screenshots of what I was doing as it would be nice to have.  I’ll be better about that in the future.

I had my PLASMA code print out the memory location ($4047, I think it was) of the function that has the inline assembly in it, and went into the monitor and took a look.  If you look at the code in the picture above, you can see that the first STA instruction is $7e after the start of the routine ($260).  You’d expect to see the STA of this new code $7e past $4047, right?  Nope!  It was at $10B2. Well, there’s your problem.  I could get the offsets to be right in that code if I used “--setpc 16401” on the call to the ACME assembler, but then the entry point to my PLASMA code was off and nothing would run.

After hours of digging around and trying various things, I decided I needed to reach out to see if I hit a bug (unlikely) or if I was doing something wrong (very likely).   After posting to comp.sys.apple2.programmer, David Schmenk got me straightened out.

Here is where the RTFM failure part comes in.  Here is a section from the PLASMA readme about Native Assembly Functions:

Lastly, PLASMA modules are re-locatable, but labels inside assembly functions don’t get flagged for fix-ups. The assembly code must use all relative branches and only accessing data/code at a fixed address.

Then I set off on a “damn fool idealistic crusade” to implement the code in C (and then PLASMA) directly.  I tried.  Boy, did I try. But, apparently my reading of the assembly and trying to redo it in something else was failing miserably. I tend to do that: wanting to do things the “right” or “best” way instead of the “working” way.  Sometimes, it’s best to just use the “working” way.  Especially since I only want it for some performance testing.

Back to using the raw code and memcpy()’ing it in.  That was working fine, except my loop from 1 to 10 in my test program ran way more than 10 times.  I realize now this was RTFM failure #2:

Data passed in on the PLASMA evaluation stack is readily accessed with the X register and the zero page address of the ESTK. The X register must be properly saved, incremented, and/or decremented to remain consistent with the rest of PLASMA. Parameters are popped off the evaluation stack with INX, and the return value is pushed with DEX.

David to the rescue again.  Added in the code to save/restore X and DEX, and good to go!
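To see why that stack discipline matters, here is a toy model in C of an X-indexed evaluation stack (purely illustrative; this is not PLASMA’s actual VM).  A native routine that consumes an argument must pop it (INX), and one that produces a value must push it (DEX); forget either and the interpreter is left reading garbage, which is exactly why my 1-to-10 loop ran more than 10 times.

```c
/* Toy model of a descending, index-addressed evaluation stack. */
static unsigned char estk[16];
static unsigned char xreg = 16;   /* stack empty; grows downward like an X index */

void vm_push(unsigned char v) { estk[--xreg] = v; }    /* DEX-style push */
unsigned char vm_pop(void)    { return estk[xreg++]; } /* INX-style pop  */

/* Well-behaved native call: pops its argument, pushes its result. */
void native_ok(void) {
    unsigned char arg = vm_pop();
    vm_push((unsigned char)(arg + 1));
}

/* Buggy native call: never pops its argument, so the
 * stack index drifts out of sync with the interpreter. */
void native_leaky(void) {
}
```

After `native_ok` the index is back where the interpreter expects it; after `native_leaky` it is off by one, and every value the interpreter reads from then on is shifted.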

Here is the code for the timers. It’s basically a simple stopwatch with one lap timer included. Start the timer then you can ask for the elapsed time. You can do a lap reset to get individual times while the main timer is unaffected.


import cmdsys
    predef memcpy
end

const nscdata = $303

byte timer_year, timer_month, timer_date, timer_day, timer_hour, timer_minute, timer_second, timer_hundredth
byte lap_year, lap_month, lap_date, lap_day, lap_hour, lap_minute, lap_second, lap_hundredth
byte tmp_year, tmp_month, tmp_date, tmp_day, tmp_hour, tmp_minute, tmp_second, tmp_hundredth

byte nsccode[] = $a9,$00,$8d,$de,$02,$a9,$03,$09,$c0,$8d,$1f,$03,$8d,$22,$03,$8d,$31,$03,$8d,$3f,$03,$a9,$03,$8d,$df,$02,$d0,$16,$00,$00,$00,$00
byte           = $00,$00,$2f,$00,$00,$2f,$00,$00,$20,$00,$00,$3a,$00,$00,$3a,$00,$00,$8d,$20,$0b,$03,$a2,$07,$bd,$03,$03,$dd,$e0,$02,$90,$0f,$dd
byte           = $e8,$02,$b0,$0a,$ca,$10,$f0,$ce,$df,$02,$d0,$e6,$18,$60,$ee,$de,$02,$ad,$de,$02,$c9,$08,$90,$af,$d0,$1d,$a9,$c0,$a0,$15,$8d,$1b
byte           = $03,$8c,$1a,$03,$a0,$07,$8d,$1f,$03,$8c,$1e,$03,$88,$8d,$6f,$03,$8c,$6e,$03,$a9,$c8,$d0,$95,$a9,$4c,$8d,$16,$03,$38,$60,$00,$00
byte           = $00,$01,$01,$01,$00,$00,$00,$00,$64,$0d,$20,$38,$98,$3c,$3c,$64,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
byte           = $18,$90,$09,$00,$00,$00,$00,$00,$00,$00,$00,$38,$08,$78,$a9,$00,$8d,$04,$03,$8d,$80,$02,$ad,$a3,$03,$ad,$ff,$cf,$48,$8d,$00,$c3
byte           = $ad,$04,$c3,$a2,$08,$bd,$bf,$03,$38,$6a,$48,$a9,$00,$2a,$a8,$b9,$00,$c3,$68,$4a,$d0,$f4,$ca,$d0,$ec,$a2,$08,$a0,$08,$ad,$04,$c3
byte           = $6a,$66,$42,$88,$d0,$f7,$a5,$42,$9d,$7f,$02,$4a,$4a,$4a,$4a,$a8,$a5,$42,$c0,$00,$f0,$08,$29,$0f,$18,$69,$0a,$88,$d0,$fb,$9d,$02
byte           = $03,$ca,$d0,$d7,$ad,$80,$02,$8d,$83,$02,$68,$30,$03,$8d,$ff,$cf,$a0,$11,$a2,$06,$bd,$c7,$03,$99,$80,$02,$bd,$80,$02,$48,$29,$0f
byte           = $09,$30,$88,$99,$80,$02,$68,$4a,$4a,$4a,$4a,$d0,$0c,$e0,$01,$f0,$04,$e0,$04,$d0,$04,$a9,$20,$d0,$02,$09,$30,$88,$99,$80,$02,$88
byte           = $ca,$d0,$d1,$28,$b0,$19,$20,$be,$de,$20,$e3,$df,$20,$6c,$dd,$85,$85,$84,$86,$a9,$80,$a0,$02,$a2,$8d,$20,$e9,$e3,$20,$9a,$da,$60
byte           = $5c,$a3,$3a,$c5,$5c,$a3,$3a,$c5,$2f,$2f,$20,$3a,$3a,$8d

asm _initnsc
        jsr $0260

asm _readnsc
        jsr $030B

export def loadnsccode
    memcpy($0260, @nsccode, $16e)
end

export def initnsc
    _initnsc()
end

export def gettime(timedata)
    _readnsc()
    memcpy(timedata, nscdata, 8)
end

export def timer_start
    gettime(@timer_year)
    memcpy(@lap_year, @timer_year, 8)
end

export def timer_elapsed
    word d, h, m, s, hd
    gettime(@tmp_year)
    d = tmp_date - timer_date
    h = tmp_hour - timer_hour
    m = tmp_minute - timer_minute
    s = tmp_second - timer_second
    hd = tmp_hundredth - timer_hundredth

    return (((d*24+h)*60+m)*60+s)*100+hd
end

export def timer_lap_reset
    gettime(@lap_year)
end

export def timer_lap_elapsed
    word d, h, m, s, hd
    gettime(@tmp_year)
    d = tmp_date - lap_date
    h = tmp_hour - lap_hour
    m = tmp_minute - lap_minute
    s = tmp_second - lap_second
    hd = tmp_hundredth - lap_hundredth

    return (((d*24+h)*60+m)*60+s)*100+hd
end


C Code

Adapted from a post by Bill Buckels

#include <stdio.h>
#include <string.h>
#include <conio.h>
#include "realtime.h"

#define READ_TIME_ADDR 0x260
#define READ_TIME_LEN  366

/* The READ.TIME program Version 1.4 (C) Copyright Craig Peterson 1991 */
char _read_time[READ_TIME_LEN] = {
    /* ... raw opcode bytes elided ... */
};

struct nsctm timer, lap, tmp;

#pragma optimize (push,off)
void initnsc(void)
{
    char *brunptr = (char *)READ_TIME_ADDR;

    /* bload read.clock to $260 */
    memcpy(brunptr, _read_time, READ_TIME_LEN);

	asm("JSR $260"); /* call init clock */
}
#pragma optimize (pop)

/* read the current date and time from the NSC */
#pragma optimize (push,off)
void gettime(struct nsctm *output)
{
	asm("JSR $30B"); /* call read clock */

    memcpy(output, (char *)0x303, 8);
}
#pragma optimize (pop)

void timer_start()
{
    gettime(&timer);
    memcpy(&lap, &timer, 8);
}

int timer_elapsed()
{
    int d, h, m, s, hd;
    gettime(&tmp);
    d = tmp.date - timer.date; h = tmp.hour - timer.hour; m = tmp.minute - timer.minute; s = tmp.second - timer.second; hd = tmp.hundredth - timer.hundredth;

    return (((d*24+h)*60+m)*60+s)*100+hd;
}

void timer_lap_reset()
{
    gettime(&lap);
}

int timer_lap_elapsed()
{
    int d, h, m, s, hd;
    gettime(&tmp);
    d = tmp.date - lap.date; h = tmp.hour - lap.hour; m = tmp.minute - lap.minute; s = tmp.second - lap.second; hd = tmp.hundredth - lap.hundredth;

    return (((d*24+h)*60+m)*60+s)*100+hd;
}

Again, strikingly similar, but that is what I was after. Comparing apples to apples (pun intended!)

Next I’m going to take a look at some more comparisons. Things I want to look at (some based on suggestions) are timings for different routines, cycles for different operations and size comparisons.

I wanted to get some timings for PLASMA vs C for a few operations. I’m sticking with my “moving monster” theme and tracked the time for doing two different operations.

  1. Drawing a frame of the monster (100 times), which involves
    • Flipping HGR pages
    • Getting the page address, getting the Y address (lookup), the byte for X (lookup) and adding them together
    • Getting the frame for X (lookup) and calculating the offset to get to the correct frame
    • Memcpy()’ing the data to memory
  2. Doing a simple no-op for loop from 1 to 500 (100 times)

I fully admit that this is not an exhaustive test, but I just wanted to get an idea of how they compare. Again, this is not a “C is faster/better” post. PLASMA is impressive tech regardless of the times. It’s just out of my pure curiosity.

I added pretty much identical timing routines on the PLASMA and C side (after much time spent banging my head), but I’ll post on that later.


Note: Times are in cs (centiseconds, i.e. 100ths)

100 Frames

C: 147 cs
PLASMA: 228 cs (155%)

Loop 500

C: 530 cs
PLASMA: 1368 cs (258%)


Because, I like videos.



While playing around with PLASMA and working on some timing routines (more on that later), I found I needed to expand my build chain to be able to include multiple PLASMA modules on one disk when booting.

I also didn’t like having to specify an environment variable to set the source file for simple builds. For building a single .pla file and running it, I wanted something easier. This new makefile satisfies both of those requirements.

I did end up moving away from generating the #-style files that I think CiderPress wants, mainly because I’m using AppleCommander to build my disk images. I decided to use .mod (PLASMA “module” was the inspiration) as the intermediate file extension.



DSKTYPE?=rel
ADDR?=1000

.PRECIOUS: %.dsk

%.run: %.dsk
	osascript plasma_run.scpt `pwd` $*

%.dsk: %.mod $(patsubst %,%.mod,$(EXTRA))
	cp template.dsk $@
	java -jar AppleCommander.jar -d $@ $*
	java -jar AppleCommander.jar -p $@ $* $(DSKTYPE) 0x$(ADDR) < $*.mod
	-if [ ! -z "$(EXTRA)" ]; then \
		for o in "$(EXTRA)"; \
		do \
			java -jar AppleCommander.jar -d $@ $$o ;\
			java -jar AppleCommander.jar -p $@ $$o $(DSKTYPE) 0x$(ADDR) < $$o.mod ;\
		done ;\
	fi

%.mod: %.a
	acme --setpc 4094 -o $@ $?

%.a: %.pla
	plasm -AM < $? > $@

clean:
	-rm -f *.a *.mod *.dsk


This can be used in a few different ways. The simplest is to run make passing the name of your .pla file with “.pla” replaced by “.dsk” to build the disk image, or “.run” to build the disk image and boot it in Virtual ][.

You can technically even run make with “.a” and get the .a file out of PLASMA. It’s all generic, so any of the intermediates will work. Use “.mod” to get the compiled binary file; you can then use that with whatever tool you want to put it on a disk.

To have it build and include additional PLASMA modules, set the EXTRA environment variable to the list of files to include, without the .pla extension.

Note: Besides the .dsk (which is marked as .PRECIOUS), all intermediates are removed.


Make .dsk
% ls -l hello.pla
-rw-r--r--  1 mfinger  staff  65 Apr  8 21:42 hello.pla
% make hello.dsk
plasm -AM < hello.pla > hello.a
acme --setpc 4094 -o hello.mod hello.a
cp template.dsk hello.dsk
java -jar AppleCommander.jar -d hello.dsk hello
hello: No match.
java -jar AppleCommander.jar -p hello.dsk hello rel 0x1000 < hello.mod
if [ ! -z "" ]; then \
		for o in ""; \
		do \
			java -jar AppleCommander.jar -d hello.dsk $o ;\
			java -jar AppleCommander.jar -p hello.dsk $o rel 0x1000 < $o.mod ;\
		done ;\
fi
rm hello.mod hello.a
% java -jar AppleCommander.jar -ll hello.dsk

  PRODOS  Destroy Read Rename Write SYS  035 09/19/2007 05/06/1993 17,128 $0000 0002 0008 Sapling Changed 0 4
  CMD  Destroy Read Rename Write SYS  010 04/01/2016 04/01/2016 4,141 A=$2000 0002 0029 Sapling Changed 0 0
  HELLO  Destroy Read Rename Write REL  001 04/08/2016 04/08/2016 55 $2000 0002 0037 Seedling Changed 0 0
  PLASMA.SYSTEM  Destroy Read Rename Write SYS  007 04/01/2016 04/01/2016 2,901 A=$2000 0002 002F Sapling Changed 0 0
ProDOS format; 112,640 bytes free; 30,720 bytes used.
Make .run
% make
plasm -AM < hello.pla > hello.a
acme --setpc 4094 -o hello.mod hello.a
cp template.dsk hello.dsk
java -jar AppleCommander.jar -d hello.dsk hello
hello: No match.
java -jar AppleCommander.jar -p hello.dsk hello rel 0x1000 < hello.mod
if [ ! -z "" ]; then \
		for o in ""; \
		do \
			java -jar AppleCommander.jar -d hello.dsk $o ;\
			java -jar AppleCommander.jar -p hello.dsk $o rel 0x1000 < $o.mod ;\
		done ;\
fi
osascript plasma_run.scpt `pwd` hello
rm hello.mod hello.a
Including EXTRA
% ls -l timer.pla test.pla
-rw-r--r--  1 mfinger  staff   598 Apr  8 21:32 test.pla
-rw-r--r--  1 mfinger  staff  3710 Apr  8 13:26 timer.pla
% EXTRA=timer make test.dsk
plasm -AM < test.pla > test.a
acme --setpc 4094 -o test.mod test.a
plasm -AM < timer.pla > timer.a
acme --setpc 4094 -o timer.mod timer.a
cp template.dsk test.dsk
java -jar AppleCommander.jar -d test.dsk test
test: No match.
java -jar AppleCommander.jar -p test.dsk test rel 0x1000 < test.mod
if [ ! -z "timer" ]; then \
		for o in "timer"; \
		do \
			java -jar AppleCommander.jar -d test.dsk $o ;\
			java -jar AppleCommander.jar -p test.dsk $o rel 0x1000 < $o.mod ;\
		done ;\
fi
timer: No match.
rm test.mod test.a timer.mod timer.a
% java -jar AppleCommander.jar -ll test.dsk

  PRODOS  Destroy Read Rename Write SYS  035 09/19/2007 05/06/1993 17,128 $0000 0002 0008 Sapling Changed 0 4
  CMD  Destroy Read Rename Write SYS  010 04/01/2016 04/01/2016 4,141 A=$2000 0002 0029 Sapling Changed 0 0
  TEST  Destroy Read Rename Write REL  001 04/08/2016 04/08/2016 423 $2000 0002 0037 Seedling Changed 0 0
  TIMER  Destroy Read Rename Write REL  003 04/08/2016 04/08/2016 927 $2000 0002 0039 Sapling Changed 0 0
  PLASMA.SYSTEM  Destroy Read Rename Write SYS  007 04/01/2016 04/01/2016 2,901 A=$2000 0002 002F Sapling Changed 0 0
ProDOS format; 111,104 bytes free; 32,256 bytes used.


Here is a video showing it using the “.dsk” and “.run” versions:

I wanted to compare PLASMA with cc65 on several different points. At this point, with my limited experience with PLASMA, I’ll just start with:

  • Ease of understanding/similarity
  • Speed

I took my “moving monster” test program and rewrote it in PLASMA to compare to how I had it written in C.  Having read that PLASMA took some inspiration for its structure from modern languages, I was pleasantly surprised how similar the code for each is and how easy the port was. It actually helped me improve my C code a bit as well.

C code

// Put image on screen
void putImage(imageData *image, char page, char x, char y) {
    char b, f, r;
    // Convert X to byte offset
    b = xToByte[x];
    // Convert X to needed shift frame
    f = xToFrame[x] * image->height * image->width;
    // Draw frame line by line
    for (r = 0; r < image->height; r++) {
        memcpy((char *)(hgrpage[page] + yToAddr[y + r] + b), image->data + f + (r * image->width), image->width);
    }
}

int main() {
    int x = 0;
    int count = 0;
    // Clear both Hi-Res pages (Bad: Clearing holes too!)
    memset((char *)0x2000, 0, 0x2000);
    memset((char *)0x4000, 0, 0x2000);
    // Activate graphics
    POKE(-16304, 0);
    // Full screen graphics
    POKE(-16302, 0);
    // Hi-Res graphics
    POKE(-16297, 0);
    // Put initial image on non-displayed page so when we flip it's there
    putImage(&image, !page, 0, 30);
    // Move across the screen by 2
    for (x = 2; x <= 200; x += 2) {
        // Flip page
        page = !page;
        POKE(showpage[page], 0);
        // Draw new image on non-displayed page
        putImage(&image, !page, x, 30);
        // Pause
        for (count = 1; count <= 500; count++) ;
    }
    // Go back to page 0 (1)
    POKE(showpage[0], 0);

    // Text mode
    POKE(-16303, 0);

    return 0;
}
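For reference, the lookup tables the listing assumes (xToByte, xToFrame, yToAddr) can be generated from the standard Apple II Hi-Res layout.  This construction is my own assumption, not the post’s actual code: 7 pixels per screen byte (so x/7 gives the byte and x%7 picks the pre-shifted frame), and the famous interleaved row-address formula.

```c
unsigned char xToByte[280];   /* X pixel -> byte offset within a row  */
unsigned char xToFrame[280];  /* X pixel -> which pre-shifted frame   */
unsigned int  yToAddr[192];   /* Y line  -> offset from the page base */

void initTables(void) {
    int x, y;
    for (x = 0; x < 280; x++) {
        xToByte[x]  = (unsigned char)(x / 7);  /* 7 pixels per screen byte */
        xToFrame[x] = (unsigned char)(x % 7);
    }
    /* Hi-Res rows interleave: offset = (y%8)*$400 + ((y/8)%8)*$80 + (y/64)*$28 */
    for (y = 0; y < 192; y++)
        yToAddr[y] = (unsigned int)(((y & 7) * 0x400) +
                                    (((y >> 3) & 7) * 0x80) +
                                    ((y >> 6) * 0x28));
}
```

Precomputing these is what makes the per-row work in putImage() just two lookups and an add.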



// Put image on screen
def putImage(imgdata, imgheight, imgwidth, page, x, y)
    byte b, f, r

    // Convert X to byte offset
    b = xToByte[x]

    // Convert X to needed shift frame
    f = xToFrame[x] * imgwidth * imgheight

    // Draw frame line by line
    for r = 0 to imgheight-1
        memcpy(hgrpage[page] + yToAddr[y + r] + b, imgdata + f + (r * imgwidth), imgwidth)
    next
end

// Clear both Hi-Res pages (Bad: Clearing holes too!)
memset(hgr1, 0, $2000)
memset(hgr2, 0, $2000)

// Activate graphics

// Full screen graphics

// Hi-Res graphics

// Put initial image on non-displayed page so when we flip it's there
putImage(@data, height, width, (!page&$01), 0, 30)

// Move across screen by 2
for x = 2 to 200 step 2

    // Flip page
    page = (!page&$01)

    // Draw new image on non-displayed page
    putImage(@data, height, width, (!page&$01), x, 30)

    // Pause
    for count = 1 to 500
    next
next

// Go back to page 0 (1)

// Text mode

As you can see, they are very similar. Should be an easy move over for people familiar with C/Java and languages of that ilk. Very impressive.

Next I took a look at performance. When I originally started looking at comparing performance, I was shocked at the speed difference between the two (which I’ll show shortly). That was before I realized that I was wrong about PLASMA.

I was thinking that PLASMA was more of a “pre-assembler” or “pre-compiler” that took high level structures and generated 6502 assembly for the corresponding code. It actually produces byte-code that is then run under the PLASMA VM. This can be sped up by writing raw assembly for routines that need more power. Silly me.
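A toy dispatch loop shows where the time goes (invented opcodes here, nothing like PLASMA’s real instruction set): every byte-code operation pays for a fetch, a decode and stack traffic that compiled C folds into a single native instruction.

```c
/* Minimal stack-machine interpreter: illustrative only. */
enum { OP_PUSH, OP_ADD, OP_HALT };

int run(const unsigned char *code) {
    int stack[16];
    int sp = 0;
    for (;;) {
        switch (*code++) {          /* fetch + decode, paid on every op */
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

That per-opcode overhead is why dropping to native assembly for hot routines is the standard escape hatch in any byte-code system.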

Now, I don’t consider that a bad thing for the same reason I don’t consider it a bad thing for Java vs C. It’s just a different approach and both have their merits.

C Performance

PLASMA Performance

As you can see in the above videos, without some native assembly to do some of the heavy lifting where needed, the C compiled code runs much faster than the PLASMA code. With a byte-code VM, that is to be expected.

Again, I want to reiterate, this is not a bash on PLASMA at all. On the contrary, even with the little I’ve worked with it, I’m very impressed; it’s an amazing piece of engineering, especially doing byte-code and a VM on an 8-bit platform. Well done, well done.

I’m working on getting some timing routines in both the C side and the PLASMA side that will read from the No-Slot Clock, since it gives hundredths of seconds resolution. Then I’ll publish some exact numbers comparing the two. Again, not as a “C is faster/better” but just to show some of the trade-offs.

I decided as part of my efforts to get back into programming on my Apple ][‘s that I’d also explore other newer technologies that are available on the development side.

Thanks to a recent issue of Juiced.GS (Vol 21, Issue 1), I thought I’d try out PLASMA (Proto Language AsSeMbler for Apple) from David Schmenk. It (like it says) is a proto-assembly language that has a lot of features of modern languages normally not available in assembly.  I’ve not dug into the language much beyond reading the article (“Programming with PLASMA: Developing a chat client”) in Juiced.GS and reading through some of the sample code, but it does look very interesting.

But, thanks to the great work on the Xcode build pipeline for C[AC]65 that I mentioned in an earlier post, I’m spoiled in having a quick build pipeline.  Write code, click build, watch it run.  So, I figured by “standing on the shoulders of giants” I’d put together a proof-of-concept way to do something similar with PLASMA.

Requirements are simple:  Write code, run a build, watch it run.

Digging into the work Quinn Dunki posted about here, I took the AppleScript code and the makefile and adapted them to work for what I needed.  I did it outside of Xcode for this case for a couple of reasons.  First, Xcode won’t really understand PLASMA code in a way that is beneficial (no completion or highlighting), and second, I don’t really like Xcode very much.  So, vi and make it is.  Makes me all nostalgic.

Here is my adapted AppleScript code (really only changed a “-” to a “+”):

-- Stolen/Adapted from: Blondihacks Makefile script for Virtual ][
-- Boots the disk image for the program and runs it inside PLASMA

on run argv
	set TARGETPATH to item 1 of argv
	set PGM to item 2 of argv

	tell application "Virtual ]["

		tell front machine
			eject device "S6D1"
			insert TARGETPATH & "/" & PGM & ".dsk" into device "S6D1"
			delay 0.5
			delay 0.5
			type line "+" & PGM
		end tell
	end tell
end run

Here is my makefile:

PGM?=$(shell basename $(SRC) .pla)

# These definitions were missing from the listing; the values here are
# assumed, following PLASMA convention (module type $FE, load address $1000):
TYPE?=FE
ADDR?=1000
OBJ?=$(PGM)\#$(TYPE)$(ADDR)
DSKTYPE?=rel
PLASM?=plasm
all: disk

run: disk
	osascript plasma_run.scpt `pwd` $(PGM)

disk: $(PGM).dsk

# Cleanup targets (the target names were lost in formatting; assumed here)
clean:
	-rm -f $(OBJ) $(PGM).a $(PGM).dsk

spotless: clean
	-rm -f *.a *\#*

$(PGM).dsk: $(OBJ)
	cp template.dsk $(PGM).dsk
	java -jar AppleCommander.jar -d $(PGM).dsk $(PGM)
	java -jar AppleCommander.jar -p $(PGM).dsk $(PGM) $(DSKTYPE) 0x$(ADDR) < $(OBJ)

%\#$(TYPE)$(ADDR): %.a
	acme --setpc 4094 -o $@ $?

$(PGM).a: $(SRC)
	$(PLASM) -AM < $? > $@

Again, this may be too limited at the moment as I don’t have a deep understanding of PLASMA and project structure, linking, etc.  But, for this case, simply set the SRC environment variable to point to your PLASMA code and run make.

Here is an example (Note: I’ve tweaked the makefile a bit since the video):

Now it’s time to start writing some of my own code and experiment with the language.

I’ve been using Go for the last several months and really like the language, which I can go into at another time.

Lately, one of the processes that I’ve written seems to get into a state where its CPU usage is extremely high even though the process is basically idle:

top - 13:06:53 up 152 days,  4:04,  1 user,  load average: 11.99, 11.30, 11.25
Tasks: 348 total,   1 running, 347 sleeping,   0 stopped,   0 zombie
%Cpu(s): 48.4 us,  2.4 sy,  0.0 ni, 49.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32900140 total, 32371752 used,   528388 free,       44 buffers
KiB Swap: 33509372 total,  2151948 used, 31357424 free. 22511692 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16115 mfinger   20   0 2637812 1.105g   5972 S 616.7  3.5   3883:11 xxxxxx
16134 mfinger   20   0 2504232 728572   6128 S 610.1  2.2   2909:37 xxxxxx

After looking around, I remembered that Go has profiling built in. I added a few lines to my code, namely:

import (
	"log"
	"net/http"
	_ "net/http/pprof" // side-effect import: registers the /debug/pprof handlers
)

go func() {
	log.Println(http.ListenAndServe(":6060", nil))
}()

Then I ran the profiling tool built into Go:

% go tool pprof -png http://host:6060/debug/pprof/profile > cpu.png
Fetching profile from http://host:6060/debug/pprof/profile
Please wait... (30s)
Saved profile in /Users/Mfinger/pprof/

Let’s look at the results:


Let’s look at the heap, as well:

% go tool pprof -png  http://host:6060/debug/pprof/heap > heap.png
Fetching profile from http://host:6060/debug/pprof/heap
Saved profile in /Users/Mfinger/pprof/

Very nice. Now to try to fix the issue.