Thursday, 23 March 2017

NES Homebrew Part 3: On the move

So the next part to tackle is the moving eggs. Moving one isn't so bad, but there's the issue of what happens if they collide. the basic sequence of events is this:

  • Player pushes an egg
  • It continues moving until it hits another egg
  • The previously moving egg stops and the hit egg starts moving
  • This process continues until...
  • ...an egg hits a wall.
  • The egg then reverses direction until it ends up at its home position (which should be next to the previous egg).
  • The previous egg then begins to move....
  • ...until the first one arrives at its home position again.
So how do I manage all of this? Well I figured that a stack-based approach would be best. When a push starts the stack will be empty and a moving object is put into the stack and the background object is removed for the moment. As soon as it hits another egg then the process is reversed, and we remove the moving object and place a background object at that position. Another moving object is created in place of the next egg and it's placed at the next slot in the stack. when an egg hits it's home position again it is converted to a background item once and for all, and the stack unwinds one step. If the last stack item needs removing then the stacksize is set to 255(0xff) to signify that it's empty and no further processing is needed.

Simple enough? Well yeah aside from the usual 'It's all in 6502' type issues. I get around this by having a single-byte stack pointer in the zeropage and using the start of the hardware stack page for my egg movement stack. Each egg description in the stack is only 4 bytes, and with a maximum of 12 on a row or column (13 - where the player is standing) this means we have a worst-case scenario of 48 bytes, leaving 216 bytes for the stack. I can't see it ever going over 128 bytes to be honest so I can rob more of the page if needed.

All addressing is done by declaring the structure elements relative to the base of the stack, loading the stack pointer into the X register and referring to the elements with LDA/STA egg_element,x.

The egg_type element contains the current type in the lower nybble (same way the level data was defined to start with) with the movement direction encoded into 2 bits of the upper nybble, giving us a nice round stack element size that isn't too slow to manipulate.

I'm still a couple of functions short - the graphics update stuff isn't plugged into the right points yet, but stepping through the stack handling and movement code looks like it's working. I have the other code I need but I've written it in a way that makes it a ballsache to reuse, so it looks like some refactoring is needed...


Wednesday, 15 March 2017

NES homebrew part 2: On map formatting

There's always tradeoffs to be made


Generally speaking code can have a small footprint, or be fast. Sometimes (but not as often as you'd want) you can get lucky and have both.

The level data in this case is one of those things. There are currently less than 8 types of tile on a 13x13 grid. This means that in theory we only need 3 bytes per block, and with the map being 169 tiles this would be 507 bits (64 bytes, rounded up). So what options do we have?

  • 3 bits per tile - This would be the smallest representation, but as the start of each tile would start at an odd place in a byte and may even cross into the next byte along. This lands us straight into the roughest end of speed vs size territory as you'd need to find the start of the 3 bits, which may be one or 2 byte reads, shift them into the bottom 3 bits of a byte, maybe swapping and combining part-bytes. We also have the overhead of logic to figure out if these special cases happen, which takes time, and then we can use them. This is the worst case scenario.

So... we're pressed for both memory and time, but we're going to have to choose one. Let's look at the wastage options, shall we?

  • 4 bits per tile - Not quite as small in terms of representation at 676 bits (85 bytes), but we can look at it as expansion room for future tile types. Tiles still need expanding from their positions within the byte, but they alternate between the top and bottom half of each byte, with no crossings so it would be quicker and simpler at least.
  • 8 bits per tile - This brings us up to the full 169 bytes for a level description. As far as reading out the data is concerned we would be wasting half of each byte for storing the tile number, but no unpacking would be needed. We can even turn the negatives into a positive and use the top half of the byte to hold other information about the tile (such as collision information) so you get everything in a single read and you can just AND-mask off the bits you don't need for drawing. Even random access is easy - just look up MAP_START_ADDRESS+X+(Y*13) and you'll get the byte. We're sorted!

oh. wait... Not quite that simple.

Y*13 now becomes our enemy. In the relative comfort of C++ it's easily done. On a modern cpu, or even a 16-bit 68000 or 32-bit ARM you can just use a MUL instruction, but on a machine with no multiply how do you even do this?

You could just have a lookup table with 13*0,13*1...13*12 in it, but in 6502 we only have 3 registers, 2 can be used for indexing into lists like this but they're probably busy doing other things anyway. The other way is to use shifts. Even if you don't code you've probably done this subconsciously by multiplying by 10,100,1000 or so by just moving everything left and adding zeroes onto the end. Binary works the same only shifting left multiples by 2,4,8,16,32... depending how far you go.
What about values that aren't nice binary values, such as 13, though? Well - you just find the binary shift values that make up the add number and add the result of doing those individual shifts together:

13 = 8 (shift left 3) + 4 (shift left 2) + 1 (original value). So we can do those 2 shifts, add them together. Oh - and do a bunch of swapping and putting stuff in and out of memory because as I'm sure I mentioned before we only have 3 registers... my head aches just thinking about it. This brings us to one further option:

  • 8 bits per tile, but with each line padded out to 16 bytes. We're now up to 208 bytes per level map, but we can find any given tile with (Y shifted left 4) + X. Fairly quick and decent. Even better is that if we have a pixel position on screen each tile is 16 pixels high, so to turn that into a block position we'd have to divide the Y position by 16, then multiply it by 16 again. Those two cancel out and can be replaced by just clearing the bottom half of the Y position by ANDing it with 0x0F. Even quicker.

The two-way split

Generally as I think about game code, whether it's 64k of assembler running at 1Mhz or thousands of lines of C++ in xcode I mentally divide code into 2 camps: gametime and non-gametime.

  • Gametime is basically when you're playing, or an animation is running and you really need everything to by silky-smooth and as fast as it can go. If something happens in gametime it needs to favour speed. If it can not happen in gametime that would be even better,
  • Non-gametime. This covers loading time, static screens being shown, and basically any time where responsiveness is less of an issue. Anything that happens during this time can take a size-over-speed bias.
We also need to have the game data whilst playing in RAM, as the level might change as we're playing it due to things being destroyed or moved. So this means we can choose 2 of the above options - one for storing the levels that aren't being played (in the cartridge ROM in this case), and another for what is called the 'inplay buffer' in the code, which is a working copy of the level in the RAM.

Currently the inplay buffer uses the 16x13 layout, as it is almost exclusively used during gametime so speed is of the essence however I've yet to decide on the ROM layout format. I'm tending towards the 4 bits per tile layout. This would mean stripping the collision information out of each tile byte, but I can add that back in when I unpack the level to the RAM. It takes a little time, but this is non-gametime so I'm okay with that. It also assumes that each tile type would always have the same collision information but consistency is good in gameplay so I can live with that too.

This means in a single 16K cartridge bank I can hold 192 screen definitions, which is overkill and doesn't factor in other level definition info, such as enemy placements, timers or other per-level adjustments I might come up with as I go along.

Not so important on the NES, but if I backport to other machines later on then this might gain more importance.

Back to the NES

This weekend would've been great - things warmed up outside, the sun came out (and I had to look on Wikipedia to see what the glowing thing in the sky was, it's been so long), the beach bars were having their grand opening for the year 10 minutes walk away...

...and fate rewards me with a heavy cold that felt like the not-at-all fun bits of being drunk and a 3-day hangover combined into one handy package. Fantastic.

So like any normal geek who can count to 1023 on his fingers I grabbed a Lemsip, curled up on the sofa and started investigating Unreal Engine 4 and getting something out of it that didn't make my old backup PC and netbook scream in pain. The requisite hacks I found online required multiple recompilations which took a not insignificant amount of time.

So in the meantime I turned to WUDSN and the Nintendo NES toolchain. I also remembered discussing ideas way back with someone about an old arcade game where you pushed eggs around to free the birds trapped within whilst avoiding the enemies patrolling the screen. As a test of doing an old style single-screen game (Like the original NES classics, such as balloon flight) I started to implement the basis of something similar on the NES:


What I currently have is a hardcoded test level, which is somewhere close to the final (decompressed) format I want to use, a player sprite composed of 2 16x8 hardware sprites, all the hardware initialization and RAM clearup code and Shiru's Famitone sound player handling the sound. The collision detection is in, although my (deliberate) non-handling of diagonal joypad presses doesn't feel as fluid as I hoped when moving around, so I'm going to go back and redo that.

On the Actual hardware front the initialization, bankswitching (for the 64k cartridge image) and NMI-based Picture Processing Unit (PPU) update code means that the game works properly on hardware. It should also be able to determine if it's running on PAL/NTSC and adjust movement speeds and music pitch accordingly.

All the game logic is in 6502 assembly, however the collision routines and logic all happens in game-space (as opposed to screen coordinates) and only references the machine-agnostic game data with things being transferred across to the hardware as needed so hopefully if this thing gets finished I can port it across to any other 6502-based machines afterwards.

Seeing as I got this far in a lazy day I figure I might as well keep on with it and see how it goes. Hopefully documenting it as I go will motivate me.

Next task - handling the egg pushes. Pushing an egg is easy - it moves until it hits a wall, then bounces back to it's original position with 1 hit of damage taken. The harder part is if it hits another egg, in which case it causes a chain reaction until the last egg in the chain hits the wall and then everything bounces back into place in reverse order. This would be easy enough to do in C, but in 8-bit assembly with 3 registers and a clockspeed of under 2MHz this is going to take a bit of thinking about how to do it for the best.

Tuesday, 14 March 2017

Where was i...?

or 'yet another 'sorry for my absence' type of post'.

I know... Real life again. Things have been a bit thin on the 8/16-bit front for a while as I've been a wee bit busy working for those fine folks at Feral Interactive getting other peoples games up and running on MacOS. Then, just as I settled into that a job offer for my girlfriend came up in the Netherlands, so there was the small matter of hauling my Mac (and a cross-section of 80s and 90s machines) across to the Hague, settling in there, trying to learn the language. This brings me to my top tip for Dutch people - If you'd like us to actually do this (and some of us think that it'd be pretty gezellig thing to do) maybe you could stop being so helpful and speaking perfect English to us? :)

So I've finally gotten around to coding a mixture of new and old stuff again, depending on mood.

I've not done absolutely nothing though - there has been the odd bit of music here and there.


This raster split-tastic intro by Cosine has a Cover of 'Bliss' by Muse. I was suprised by how well this worked out actually.

Important note: I made a little typo on the copyright note, so if you take it literally then Muse covered me. I don't think I need to point out that that's not the case.

I liked this one...


...In fact I liked it so much I ported it over to the Atari 8-bit and we used it again :)

What can I say? What you lose in instrument quality you gain with that 4th channel I really could've done with on the SID.

Back again on the subject of covers of stuff I like I did a cover of 'Here comes your man' by the Pixies, which got the usual splendid DYCP scroll, logos and raster split treatment.

And 'In the interest of balance' as the BBC like to say yet another cover done for this years AtariAge new years disk. This was a reworking of a tune I remembered hearing back in the 90s bundled with various DOS Adlib music editors, forgot about and then it turned up in my youtube playlist. So I had to do something to get the damn thing out of my head again.


Then there's the matter of work stuff, Which is more numerous but it gets messy in terms of attribution. Not only because it's porting work (So more like translation than writing, and not my original work to start with), but also because there's games I've done 'myself', games I've helped with, common company code that I've written that's in every game since, and the odd fact that my favorite game I can (vaguely) lay claim to is the one where I wasn't officially working on it, but someone found code I'd written helped, cut+paste it in, and left me a thank you in the bugtracker. Life is strange sometimes...

Even in the '100% me' games there's the aforementioned huge chunk of shared code without which I'd get nowhere, so I can't rightfully claim 100% of anything. But hey! that's modern development for you.