I’ve been using Go (golang.org) for the last several months and really like the language, which I can go into at another time.

Lately, one of the processes that I’ve written seems to get into a state where its CPU usage is extremely high even though the process is basically idle:

top - 13:06:53 up 152 days,  4:04,  1 user,  load average: 11.99, 11.30, 11.25
Tasks: 348 total,   1 running, 347 sleeping,   0 stopped,   0 zombie
%Cpu(s): 48.4 us,  2.4 sy,  0.0 ni, 49.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32900140 total, 32371752 used,   528388 free,       44 buffers
KiB Swap: 33509372 total,  2151948 used, 31357424 free. 22511692 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16115 mfinger   20   0 2637812 1.105g   5972 S 616.7  3.5   3883:11 xxxxxx
16134 mfinger   20   0 2504232 728572   6128 S 610.1  2.2   2909:37 xxxxxx

After looking around, I remembered that Go has profiling built in. I added a few lines to my code, namely:

import _ "net/http/pprof"

And:

go func() {
       log.Println(http.ListenAndServe(":6060", nil))
}()

Then I ran the profile tool built into Go:

% go tool pprof -png http://host:6060/debug/pprof/profile > cpu.png
Fetching profile from http://host:6060/debug/pprof/profile
Please wait... (30s)
Saved profile in /Users/Mfinger/pprof/pprof.host:6060.samples.cpu.008.pb.gz

Let’s look at the results:

cpu

Let’s look at the heap, as well:

% go tool pprof -png  http://host:6060/debug/pprof/heap > heap.png
Fetching profile from http://host:6060/debug/pprof/heap
Saved profile in /Users/Mfinger/pprof/pprof.host:6060.inuse_objects.inuse_space.006.pb.gz

heap
Very nice. Now to try to fix the issue.

Reading about odd/even frames and bytes was pretty, well, confusing at the beginning.  Took me a few times to get through it and experiment, but I figured it out.

Firstly, there are two choices:

  1. Move one pixel at a time, which really turns into only moving every other cycle.
  2. Move two pixels at a time, which actually looks okay.

There really is no way to move 1 pixel and not change colors which, looking back (again), makes total sense.

Secondly, I figured out the odd/even frames and bytes logic.

In the book, he generates frames at bit shifts of 0, 2, 4, and 6 as frames 0, 1, 2, and 3, then generates frames at bit shifts of 1, 3, and 5 as frames 4, 5, and 6.  That means even offset frames go in even bytes and odd offset frames go in odd bytes.  The tricky part is right in the middle, at frame 3.

If we plot this out over 14 shifts (enough to get through both an even and an odd byte):

  • X = 0/1 we show frame 0 (even byte, even offset frame)
  • X = 2/3 we show frame 1 (even byte, even offset frame)
  • X = 4/5 we show frame 2 (even byte, even offset frame)
  • X = 6/7 we show frame 3
    • Except 7 is in the odd byte, but if we move it to the first odd frame (frame 4) then we show frame 3 for 1 cycle and show frame 4 for 3 cycles.
    • And we can’t put an even offset frame in an odd byte or the color will change.
    • The fix is that for X = 7, we actually just do X = 6 again.  Put frame 3 in the even byte
  • X = 8/9 we show frame 4 (odd byte, odd offset frame)
  • X = 10/11 we show frame 5 (odd byte, odd offset frame)
  • X = 12/13 we show frame 6 (odd byte, odd offset frame)

Not sure that made it any clearer, maybe some code will.  I have a lookup table that you index into with your X value and you get back byte # and frame #.  Unlike the book, I generate the frames in bit-shift order to keep even/odd consistent between byte, offset and frame #.


char xToByteFrame[280][2] = {
{ 0, 0 },
{ 0, 0 },
{ 0, 2 },
{ 0, 2 },
{ 0, 4 },
{ 0, 4 },
{ 0, 6 },
{ 0, 6 },
{ 1, 1 },
{ 1, 1 },
{ 1, 3 },
{ 1, 3 },
{ 1, 5 },
{ 1, 5 },
...


Notice there are 8 entries that update byte offset 0 and 6 that update byte offset 1.  The second { 0 , 6 } handles the fix for X = 7.

Screen Shot 2016-03-18 at 11.58.30 PM

You can see that we have 2 at each X offset.  This is a move from 0-13 moving by 1 pixel.  I put each frame below the previous one for comparison.
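The pattern in the lookup table above repeats every 14 X values, so the table could also be generated rather than typed in. Here is a minimal sketch of that idea. The function name buildXToByteFrame is made up for illustration, and it assumes the elided rest of the table simply continues the same pattern with bytes 2, 3, and so on up to byte 39.

```c
#define SCREEN_WIDTH 280

/* Fill table[x] = { byte # in row, frame # } for every X on a hi-res row.
 * Frame # is the bit shift of the pre-shifted image, as in the table above.
 * Assumption: the pattern shown for X = 0..13 repeats for every byte pair. */
void buildXToByteFrame(unsigned char table[SCREEN_WIDTH][2])
{
    int x;

    for (x = 0; x < SCREEN_WIDTH; x++) {
        int pair = x / 14;  /* which even/odd byte pair X falls in */
        int r = x % 14;     /* position inside that 14-pixel pair  */

        if (r < 8) {
            /* Even byte, even shifts: 0,0,2,2,4,4,6,6.
             * r == 7 reuses shift 6, which is the fix for X = 7. */
            table[x][0] = (unsigned char)(pair * 2);
            table[x][1] = (unsigned char)(r & ~1);
        } else {
            /* Odd byte, odd shifts: 1,1,3,3,5,5. */
            table[x][0] = (unsigned char)(pair * 2 + 1);
            table[x][1] = (unsigned char)((r - 8) | 1);
        }
    }
}
```

Generating the table at startup trades a little code and startup time for not having to store 280 hand-written initializers in the source.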

Here is the final product of the little monster man moving across the screen.  I opted to move 2 pixels at a time.  I could have halved the delay between moves and moved by 1, but why copy unneeded data around?

Here is my main() code.

Screen Shot 2016-03-19 at 12.17.04 AM

putImage() takes care of figuring out which frame needs to be displayed based on the X value passed in.

More progress in the right direction.

Apparently, moving a (mostly) white object is as easy as I thought it was.  My tool generated the 7 needed frames and they progressed nicely across the screen.  The only tricky part is that I need to turn an X value into two different values:

  1. Byte # in row
  2. Frame # to display

This was pretty easy (or so I thought, more on that below).  Take X, divide it by 7, and round down to get the byte # in the row.  Take X mod 7 (i.e., the remainder) and you get the bit offset within the byte, which corresponds to the frame.  I’m worried that the math is also too much work to do on every movement, so I generated a lookup table for X to byte/bit, but it’s 2 bytes for each column, so that’s another 560 bytes for lookup tables.  Remember we’re working with things on the order of magnitude of 32-48k.  So that’s a total of 944 bytes for lookups, almost a whole K.  I’ll need to figure out which is better by doing some testing; for now, lookup table it is.
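To make the two options concrete, here is a rough sketch of the divide/mod version next to a routine that fills the 560-byte table from it. The names xToByteBit and buildXToByteBit are invented for this example and are not from my actual code.

```c
#define SCREEN_WIDTH 280

/* Direct computation: one division and one modulo per call. */
void xToByteBit(unsigned int x, unsigned char *byteInRow,
                unsigned char *bitOffset)
{
    *byteInRow = (unsigned char)(x / 7);  /* 7 pixels per screen byte */
    *bitOffset = (unsigned char)(x % 7);  /* which shifted frame to use */
}

/* Lookup-table version: pay for the math once, then just index by X.
 * The table is 280 entries of 2 bytes each, so 560 bytes. */
void buildXToByteBit(unsigned char table[SCREEN_WIDTH][2])
{
    unsigned int x;

    for (x = 0; x < SCREEN_WIDTH; x++) {
        xToByteBit(x, &table[x][0], &table[x][1]);
    }
}
```

The 6502 has no divide instruction, so x / 7 and x % 7 turn into library calls under a compiler like cc65, which is the cost the table is meant to avoid.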

We’re good, right?  For non-white objects, not so much:

Reading further in the book (yes, I end up working ahead when perhaps I shouldn’t), looks like I need to handle odd/even frames for odd/even bytes differently.  Oh, the fun never ends.

Which, when I think about it, makes total sense. Here is the first frame as a bitmap:

Screen Shot 2016-03-18 at 9.07.22 PM

And here is the second:

Screen Shot 2016-03-18 at 9.07.38 PM

The first one is all on green pixels (the G at the bottom) and the second is all on violet pixels (the V). I could just move two bits at a time; then the second (displayed) frame would be:

Screen Shot 2016-03-18 at 9.19.12 PM

So, it’s back to green like I want.  That seems cheap like a 2-bit suit (Ok, I had to).  But, it does feel like cheating.  Maybe that is what I’ll need to do and what games do and we don’t know it, but I don’t think so.

Time to read more and see.  Looks like the parts I’ve moved through so far are the “easy parts”.  Figures.

 

(This post was repurposed and embellished from my Facebook post).

Well, more baby steps. Did a bunch of reading in the arcade book about bitmap images and rendering. Learned a lot.

Realized I need 7 shift frames for each bitmap I want to move around, since 1 byte in hi-res is 7 pixels.  That is because calculating the image shift on the fly would be complex and costly.  Remember, with 7 bits of pixels and a high bit for color control (green/violet vs. blue/orange), you can’t simply shift each byte by one bit and be done.

That’s because the high bit needs to be left alone so it doesn’t change the color, so the 7th bit needs to be moved to the low bit of the next byte (if shifting right), etc.  Not to mention doing this EVERY time.
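Here is a small sketch of what a one-pixel shift to the right looks like for a single row of bytes. The function name shiftRowRight is made up for this example; it keeps each byte's high bit in place and carries the 7th pixel bit into the low bit of the next byte. Whatever falls off the last byte is dropped, so the row needs a spare byte on the right if the image should not be clipped.

```c
/* Shift one row of hi-res bytes right by one pixel on screen.
 * Bit 7 (color select) stays where it is; the low 7 bits hold pixels,
 * with the lowest bit being the leftmost pixel of the byte. */
void shiftRowRight(unsigned char *row, int len)
{
    unsigned char carry = 0;  /* pixel carried in from the previous byte */
    int i;

    for (i = 0; i < len; i++) {
        unsigned char high = row[i] & 0x80;    /* color bit, untouched  */
        unsigned char pixels = row[i] & 0x7F;  /* the 7 pixel bits      */
        unsigned char out = (pixels >> 6) & 1; /* 7th pixel leaves byte */

        row[i] = (unsigned char)(high | ((pixels << 1) & 0x7F) | carry);
        carry = out;
    }
}
```

Doing this for every byte of every row on every move is the work that the 7 pre-shifted frames avoid.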

Time for graph paper? Umm, no… If necessity is the mother of invention, then laziness is the father. I wrote a tool to draw my bitmaps and added code to generate each of the 7 shifts.  Needs some cleaning up, but it works great.

Features:

  • Drawing the bitmap (duh)
  • Buttons to manually shift left/right
  • Button to clear
  • Generates output data in textarea boxes. (C array for main frame, JSON for loading/unloading, C array for all frames).  These are updated in real-time.

 

The video shows my bitmap tool, the build pipeline in Xcode and the results running.

Proof is in the pudding they say. So, without further ado, the pudding.

Moving something around the screen needs to be fast, and drawing each line pixel by pixel isn’t going to cut it.  This is where bitmap graphics comes in.  Basically, you draw out the pictures pixel by pixel beforehand, keep them in memory, then copy them to the right spot on the screen for each frame of the movement.

We can do this with block moves of data, but we can only move blocks of sequential data.  Each line (even if they were in sequential order on the screen) is at a different memory location.  Byte 2 on line 2 is not sequential with byte 2 on line 3.  So, we can only block move one line of the image at a time.  To complicate things, we need to figure out where in memory each line starts, since they are not in order.
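A rough sketch of the one-block-move-per-line idea, using memcpy as the block move. The function name putRows and the lineStart array (a pointer to the first byte of each screen line) are assumptions made for this example, not code from the book.

```c
#include <string.h>

/* Copy a bitmap to the screen one row at a time.
 * lineStart[y] points at the first byte of screen line y, so rows that
 * are next to each other on screen can live anywhere in memory. */
void putRows(unsigned char *const lineStart[], const unsigned char *bitmap,
             int widthBytes, int height, int xByte, int y)
{
    int row;

    for (row = 0; row < height; row++) {
        /* The bitmap itself is stored sequentially, widthBytes per row. */
        memcpy(lineStart[y + row] + xByte,
               bitmap + row * widthBytes,
               (size_t)widthBytes);
    }
}
```

Each call does height block moves of widthBytes bytes, and all of the non-sequential layout is hidden behind the lineStart array.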

We can do this a couple of ways.  Compute the start address of the line, which will take a lot of instructions including division and multiplication (remember we’re on an 8-bit machine here at 1 MHz), or use a lookup table.  A lookup table is basically a list of 192 addresses (the number of lines in Hi-Res) in line order that we can index into with our Y coordinate.  It takes up 384 bytes of memory, but saves us a bunch of time.

Here is the formula from the book:

Screen Shot 2016-03-18 at 6.54.53 PM

Needless to say, I went through and wrote something to dump out all the line addresses and generated the lookup table.  You’ll notice “SN” above referring to Hi-Res page 1 or 2.  The Apple ][ screen has two pages, only one of which can be visible at a time.  To avoid flickering images when moving them, it’s common (though not trivial) to erase/redraw on the page not being displayed, then switch which page is visible, then repeat.
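For reference, here is a sketch of how the line addresses can be computed and dumped into a table. It uses the standard Hi-Res memory layout written with shifts and masks, which is not necessarily the same notation as the book's formula, and the names hiresLineAddress and buildLineTable are made up for this example. Page 1 starts at $2000 and page 2 at $4000.

```c
#define HIRES_LINES 192

/* Start address of hi-res line y (0-191) on page 1 or page 2. */
unsigned short hiresLineAddress(unsigned int y, unsigned int page)
{
    unsigned int base = (page == 2) ? 0x4000u : 0x2000u;

    return (unsigned short)(base
        + (y & 7u) * 0x400u          /* line within a group of 8      */
        + ((y >> 3) & 7u) * 0x80u    /* group of 8 within its third   */
        + (y >> 6) * 0x28u);         /* which third of the screen     */
}

/* Fill a 192-entry table so drawing code can index by Y instead of
 * redoing the math. 192 two-byte entries is the 384 bytes noted above. */
void buildLineTable(unsigned short table[HIRES_LINES], unsigned int page)
{
    unsigned int y;

    for (y = 0; y < HIRES_LINES; y++) {
        table[y] = hiresLineAddress(y, page);
    }
}
```

Building one table per page, or one table plus an offset of $2000 for page 2, is a design choice that depends on how much memory is left over.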

Then I took a bitmap example from the book and wanted to get it moving across the screen.  Since I’m learning here and wanted to do it the right way, I decided to also do the page flipping.

You can see my first attempt:

It’s not the cleanest, but not too bad.  At least it moves and I can tell the pages are flipping (the garbage at the bottom).

But, notice that it’s moving jerky.  That’s because I’m moving it a whole byte (7 pixels) each frame instead of moving individual pixels.  That’s going to take some more work, but at least we have movement.  Baby steps.

I decided to get back into working on my game, but this time I decided I should learn to program graphics on the Apple ][ as part of it.  Something I never really did when I was younger.  I follow the Apple ][ Enthusiasts group on Facebook, so I reached out to them for pointers to books, sites, and tutorials.  I’ve seen some people in the group do some pretty amazing things.  I got a lot of suggestions and tips.

One book that people recommended was “Apple Graphics & Arcade Game Design” by Jeffrey Stanton.  (You can find it on the Internet Archive, if you are interested).  I started reading that on Hi-Res graphics and learned a ton.  I highly suggest the book.

One main thing I learned is that Hi-Res graphics is not easy.  The way the hardware designers laid out the hardware (to keep costs down) was pretty much a mess.

Lines are not sequential in memory.  It’s hard to explain, but if you’ve ever seen a Hi-Res image load you can see the “interlacing” effect when it loads.  Here is an image from the book that illustrates writing sequential data to the screen:

Screen Shot 2016-03-18 at 5.57.40 PM

Now, if that isn’t bad enough, one byte in memory contains 7 horizontal pixels, bits 1-7, and the bits alternate colors within the 7 bits.  It’s a mess really.  It makes me even more impressed at how game developers in the day were able to make the amazing games they did.  My hat’s off to them.

Back to the 7-bits of pixels.  It gets more fun.  In even bytes the pixels alternate violet-green, in odd bytes they alternate green-violet. Well, unless the high bit is on then they alternate blue-orange and orange-blue respectively.  But, if you turn on bits 1 and 2, you get a 2 pixel white line. If you turn on bits 1 and 3 you get a 3 pixel line of violet, 2 and 4 you get a 3 pixel line of green.  Clear as mud?

Here is a picture:

Screen Shot 2016-03-18 at 6.10.48 PM     Screen Shot 2016-03-18 at 6.14.42 PM

Lines in order (second image has high bit set in each byte)

  1. Bit 1 on
  2. Bit 2 on
  3. Bit 1 and 2 on
  4. Bit 1 and 3 on
  5. Bit 2 and 4 on
  6. Bit 1, 2, and 3 on
  7. Bit 1, 2, 3, and 4 on

Also notice, you need to choose between using green/violet in those 7 pixels or orange/blue.  Not both.
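The even/odd byte rule above falls out of the fact that a byte holds 7 pixels, an odd number, so the color pattern flips at every byte boundary. Here is a tiny sketch of that rule for a single lit pixel with dark neighbors. The names HiresColor and loneHiresPixelColor are invented for this example, x is the 0-based screen column, and two lit pixels side by side show as white, which this sketch does not model.

```c
typedef enum { VIOLET, GREEN, BLUE, ORANGE } HiresColor;

/* Color of one isolated lit pixel at screen column x (0-279).
 * highBit is the high bit of the byte that holds the pixel. */
HiresColor loneHiresPixelColor(unsigned int x, int highBit)
{
    int oddColumn = (int)(x & 1u);

    if (highBit) {
        return oddColumn ? ORANGE : BLUE;
    }
    return oddColumn ? GREEN : VIOLET;
}
```

Column 0 is the first pixel of byte 0 (violet) and column 7 is the first pixel of byte 1 (green), which matches the violet-green and green-violet ordering for even and odd bytes.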

Enough of this, let’s draw something.

I wanted to work on a game for the Apple ][, mainly because it gives me something to focus on while getting back into programming on the Apple ][.

In my younger days, I did most of my programming in BASIC with a little dabbling in assembly.  I figured now is the time to move forward to using something faster.  I’m a long time C developer, so that seemed natural.

Enter CC65.  CC65 is a cross-compiling toolchain for the 6502.  Basically, that means I can develop and build on a modern machine and the output is binary code that will run on retro machines.  In my case, the Apple ][.

In the beginning, I’d write code in a text-editor, compile it, place it on a disk image and boot that disk image in an emulator.  Slow going, but it was just playing around so it didn’t matter too much.

Fast forward and time to get serious.

Some of the people in the Apple ][ community put together a build pipeline for Xcode that makes this all nice and smooth.  Develop in a modern IDE (Xcode), click build, and it compiles, puts the result in a disk image, boots Virtual ][, and runs the program.  Very nice.

Several months back, I had an idea for a game I wanted to do for the Apple ][.  For now, I’m not saying what the game is but it’s a puzzle game.  I wrote the engine and had it working fine, but it was basically taking the game board plus the moves then says “WIN!” or “LOSE!”.  Not real exciting.

I started working on the visual aspects of it and started with just moving text around to simulate what would be happening in the game.  I didn’t get very far, but what I did put in worked okay and, as you can guess, was pretty lackluster.  Plus, I lost interest and got busy.

Fast forward again.

 

As I work through getting back into retro-programming, I thought I might as well document it here.

That way, anything I learn I can easily share with others that might also take it on.  And, really, a place for me to keep things so I don’t forget them.