Instruction setCopyright © 1996 Creative Micro Designs, Inc.
Reprinted with permission from Commodore World magazine, Issue #16.
A 6502 Programmer's Introduction to the 65816
by Brett Tabke
After programming in 6502 language for over a decade, I was getting a bit BORED. One can only code the same routines with the same opcodes so many times before the nausea of repetition becomes overpowering. When I heard the news that CMD was building a cartridge based on a 20 MHz 65816 I was overjoyed. For years I've heard those with 65816 bases systems brag about its capabilities. To us old 6502 programmers, the opportunity to program the fabled 65816 is a new lease on life.
The 65816 is an 8-/16-bit register selectable upgrade to the 6502 series processor. With 24 bit addressing of up to 16 Megabytes of RAM, the powerful 65816 is a logical upgrade that leaves 6502 programmers feeling right at home. It is amazing how fast one can adapt to the new processor. It sounds funny to say it, but the only difficulty I have had learning the 65816 is that there are so many options and choices to complete the same task, that it is hard to decide which method is best.
To get started programming the 65816, I would recommend purchasing the book, "Programming the 65816" from The Western Design Center, manufacturer of the 65816. While it is a bit pricey, the sheer quality and content of the 600 page book is worth the money. Rarely, if ever, has there been a CPU manual as thorough and detailed as the Western Design book. If you know 6502 assembly, then Programming the 65816 is probably the only 65816 book you will ever need.
Getting a Feel for the Modes
The 65816 may be operated in Native mode or 6502 Emulation mode. Emulation mode is a 100% 6502 compatible mode where the whole processor looks and feels like a vintage 6502. Native mode offers 8- or 16-bit user registers and full access to 24-bit addressing.
While in emulation mode, not only are all the 6502 opcodes present in their virgin form, but the new 65816 instructions are also available for usage. In fact, the first lesson to learn about programming the 65816 is that emulation mode is much more powerful than a stock 6502. The only true difference between emulation mode and our venerable C64's 6510 processor is that unimplemented opcodes will not produce the results expected on the former. Since all 256 of the potential opcodes are now implemented on the 65816, older C64 software that uses previously unimplemented opcodes will produce erratic results.
To select between emulation and native modes, a new phantom hidden emulation bit (E) was added to the status register. Shown in programming models hanging on top of the Carry bit, the emulation bit is only accessible by one instruction. The new instruction (XCE) exchanges the status of the Carry bit and Emulation bit. To move to emulation mode, set the carry and issue an XCE instruction. To move to native mode, clear the carry and issue the XCE instruction.
My, How Your Index Registers Have Grown!While in native mode there are two new directly accessible bits present in the status register. The 65816 implements new hardware interrupt vectors which include a new hardware BRK vector in ROM; therefore, the old BRK bit of the status register is no longer needed. The BRK bit is replaced with the X bit to select either 8- or 16-bit index registers. The former empty bit 5 is now filled with the M bit to specify the accumulator and memory access as 8- or 16-bit.
Two new instructions are used to clear or set bits within the status
register. The SEP instruction sets bits, and REP clears bits. SEP and
REP use a one byte immediate addressing mode operand to specify which
bits are to be set or cleared. For example, to set the X bit for 8 bit
|SEP #%00010000||; set bit 4 for 8-bit index|
Or to clear bit 4:
|REP #%00010000||; clear bit 4 for 16-bit index|
When in 8 bit mode, the index registers perform their function in standard 6502 form. When status bit X is set to 0, both the X and Y index registers become 16 bits wide. With a 16-bit index register you can now reach out to a full 64K with the various indexed addressing modes. An absolute load to an index register in 16-bit mode will retrieve 2 bytes of memory-the one at the effective address and the one at the effective address plus one. Simple things like INX or DEY work on a full 16 bit, which means you no longer have to specify a memory location for various counters, and loops based on index counters can now be coded in a more efficient manner.
The formerly empty status register bit 5 is now referred to as bit M. M is used to specify an 8- or 16-bit wide acculmulator and memory accesses. When in 8 bit mode, (M=1), the high order 8 bits are still accessible by exchanging the low and high bytes with a XBA instruction-it is like having two acculmulators! However; when set for a full 16-bit wide accumulator, all math and accumulator oriented logical intructions operate on all 16 bits! If you add up the clock cycles and bytes required to perform a standard two byte addition, you can start to see the true power of 16-bit registers.
More Register Improvements
Zero Page has now been renamed to Direct Page-corporate thinking, go figure. A new processor register D was added to allow Direct Page to be moved anywhere within the first 64K of memory. The direct page register is 16 bits wide, so you can now specify the start of direct page at any byte. Several old instructions now include direct page addressing as well. To move direct page, just push the new value onto the stack (16 bits) and then PLD to pull it into the direct page register. You may also transfer the value from the 16-bit accumulator to the direct page register with the TCD instruction. Direct page may also be moved while in emulation mode.
While in native mode, the stack pointer is a full 16 bits wide, which means the stack is no longer limited to just 256 bytes. It can be moved anywhere within the first 64K of memory (although while in emulation mode, the stack is located at page one). There are several new addressing modes that can use the stack pointer as a quasi-index register to access memory. Numerous new push and pull instructions allow you to manipulate the stack. A few of the more useful stack intructions useful to programmers, are the new instructions to push & pull index registers with PHX/PHY and PLX/PLY.
Two other new processors registers are the Program Bank Register (PBR) and the Data Bank Register (DBR). The Program Bank Register can be thought of as extending the program counter out to 24 bits. Although you can JSR and JMP to routines located in other RAM banks, individual routines on the 65816 still must run within a single bank of 64K-there's no automatic rollover from one bank of RAM to the next when executing successive instructions. In this sense, it may help to think of the 65816 processor as a marriage of Commodore's C128 Memory Management Unit (MMU) and an 'enhanced' 6502-a very similar concept.
The Data Bank Register is used to reach out to any address within the 16 megabyte address space of the 65816. When any of the addressing modes that specify a 16-bit address are used, the Data Bank byte is appended to the instruction address. This allows access to all 16 megabytes without having to resort to 24-bit addressing instruction, and helps enable code that can operate from any bank.
New Addressing ModesThere are new addressing modes on the 65816. Several new instructions are designed to help relocatable code that can execute at any address. The use of relocatable code on the 6502 was extremely limited. With 16 megabytes of address space, writing relocatable code increases the overall utility of the program. To write relocatable code, several new instructions use Program Counter Relative Long addressing. This allows relative branching within a 64K bank of RAM. There's also Stack Relative addressing, and a push instruction to place the program counter onto the stack, so that a code fragment can pull it back off and can instantly know its execution address.
Another new feature are two Block Move instructions, one for forward MVP and one for backward MVN. Simply load the 16-bit X register with the starting address, the Y index register with the ending address, the accumulator with the number of bytes to move, and issue the MVP or MVN instructions. MVN is for move negative, and MVP is for move positive, so that your moves don't overwrite themselves. Block Moves use two operand bytes: one for the source bank of 64K and one for the destination bank. Memory is moved at the rate of seven clock cycles per byte.
Several new addressing modes are used to access the full address
space. A 65816 assembler would decode "long" addressing given this input:
|LDA $0445F2||; load byte from $45F2 of RAM|
|; bank 4|
|LDA $03412F,x||; load byte from $412F of bank 3|
|; plus x.|
Quite a few instructions have been given new addressing modes. How many times have you wanted to do this:
|LDA ($12)||; load indirect without an|
Or how about a table of routine addresses:
|JSR ($1234,x)||; jump to a subroutine via|
|; indexed indirect addressing!|
Other fun new instructions:
|TXY,TYX||Transfer directly between index registers|
|BRA||Branch always regardless of status bits|
|TSB||Test and set any bit of a byte|
|TRB||Test and reset (clear) any bit of a byte|
|INC A/DEC A||Increment or decrement the accumulator directly|
|STZ||Store a zero to any byte|
Summing UpAs you can see, the 65816 opens up a whole new world of programming-it feels like a new lease on life. Of course, it's going to take some time to learn the new processor. But while the 20 MHz speed is a nice perk, I believe that the real power of CMD's new peripheral is indeed the engine under its hood: the 65816-a super CPU!
Native Mode Options
While in Native Mode, the m flag controls the size of Accumulator A and most Memory Operations, while the x flag controls the size of the X and Y Index Registers. This provides 4 different configuration possibilities, as charted below. The REP and SEP instructions are used in combination to swith configurations.
It is important to note that the m flag will control the size of all operations dealing with memory except in operations involving the X and Y Index Registers (CPX, CPY, LDX, LDY, STX and STY) when the x flag controls the size.
While in Emulation Mode, Accumulator A is forced to 8-bit mode. You can, however, access the upper 8 bits with instructions that specify Accumulator B, and all 16 bits at once with instructions that specify Accumulator C. The X and Y Index Registers are also forced to 8-bit mode, with no means available to access the upper 8 bits. To further assist in compatibility, the Stack is forced to Page 1 of Bank 0. The Direct page Register (D) is fully functional in this mode, allowing direct page to be placed anywhere in Bank 0. Likewise, the Program Bank Register (PBR) and Data Bank Register (DBR) are also fully functional. While it would seem that these latter items would allow programs to operate from any bank in Emulation mode, there are some caveats; interrups will force the program bank to zero without saving the PBR first, and RTI won't attempt to restore the bank. Therefore, Native mode would be recommended to execute programs in other banks.