Very Basic Machine Simulator
============================

See also: http://devcry.heiho.net post May 2012
      or: http://devcry.blogspot.com

There are several designs in this file. Scroll down for the good stuff.

dated May 14 2012

Mind that this is work in progress and I may not update this file
on the web. So you may be reading an outdated spec. Don't worry about it,
this thing is just for fun anyway.

Tip: I started out by writing a disassembler. The disassembler decodes
instructions, so from there you can build a simulator.
Writing an assembler isn't too hard but gets tedious when you have lots of
different addressing modes and lots of little 'exceptions to the rule'.

--------------------------------

Machine #1

32-bit computer, but with 16-bit base opcodes

16 general purpose registers: r0 - r15
sp register
pc register
sr register with flags: Z C N V P T I

Opcodes
=======
5 bits base instruction:

00  ld indirect/with update
01  st indirect/with update
02  ldm /with update
03  stm /with update
04  mov: reg/imm
05  add: reg/imm
06  sub
07  adc
08  sbc
09  mul
0a  div
0b  umul
0c  udiv
0d  and
0e  or
0f  xor
10  cmp
11  tst
12  jmp/jsr + condition
13  push / pop
14  sp related instructions
15  movsx / movzx
16  shift by register
17  shift by value
18  bit testing instructions
19  int / wait
1a  .. undefined
1b  .. undefined
1c  .. undefined
1d  .. undefined  (simd future extension?)
1e  .. undefined  (virt/security future extension?)
1f  floating point instructions (reserved future extension)


instruction  mode  reg1  reg0  immediate or reglist
     |        |     |     |          |
   5 bits     3     4     4       16 or 32

mode:
0-3  .b, .w, .l   register direct
4-7  .b, .w, .l   immediate

ld/st and ldm/stm mode:
0-3  .b, .w, .l   register indirect
4-7  .b, .w, .l   register indirect with update

Note: there is no pre-decrement, only post-increment

ldm/stm reglist in 20 bits: reg1|imm16  (note: high bit unused)

push/pop mode:
00   push single   (regnum in 8 bits: reg1|reg0)  (note: many bits unused)
01   pop single
02   push reglist  (reglist in 20 bits: reg1|imm16)  (note: high bit unused)
03   pop reglist
4-7  unused

sp mode:
00  ld sp, [reg0]
01  st [reg0], sp
02  mov sp, reg0
03  mov sp, imm32
04  add sp, reg0
05  sub sp, reg0
06  add sp, imm24  (reg1|reg0|imm16)
07  sub sp, imm24

jmp/jsr mode:
00  jmp reg0
01  jmp [reg0]
02  jmp imm32
03  jmp [reg0+imm32]
04  jsr reg0
05  jsr [reg0]
06  jsr imm32
07  jsr [reg0+imm32]

reg1 is branch condition based on flags: Z C S O  or  Z C N V

00  always
01  never   (unused; 'nop' can be encoded as jmpnv or jsrnv)
02  eq  or z
03  ne  or nz
04  c   or cs or hs    unsigned higher or same    or jae
05  nc  or cc or lo    unsigned lower             or jb
06  s   or mi
07  ns  or pl
08  o   or vs
09  no  or vc
0a  gt                 Z=0 and S=O
0b  ge                 S=O
0c  lt                 S!=O
0d  le                 Z=1 or S!=O
0e  hi                 unsigned higher            or ja
0f  ls                 unsigned lower or same     or jbe

* there are no 'short' jumps
* ret is encoded as 'pop pc'

movsx/movzx mode:
0-1  .b, .w   movsx register direct
2-3  .b, .w   movzx register direct
4-5  .b, .w   movsx immediate
6-7  .b, .w   movzx immediate

bit testing mode:
value is reg1, but can be only 0..7 (8 bits, highest bit of reg1 is unused)
mode is:
00   set [reg], value
01   clr [reg], value
02   bit [reg], value
03   tas [reg], value
4-7  unused

register shift mode:
00   lsl reg, reg
01   lsr reg, reg
02   asr reg, reg
03   ror reg, reg
4-7  unused


instruction  mode  value  reg0
     |        |      |     |
   5 bits     2      5     4

value shift mode:
value is reg1|1 bit of mode
mode is:
00  lsl reg, value+1
01  lsr reg, value+1
02  asr reg, value+1
03  ror reg, value      ; rrx is encoded as ror reg, 0


int   mode  value
 |     |      |
 5     3    8 bits

int/wait mode:
00   int number
01   wait (waits for any interrupt)
2-7  unused

--------------------------------

Machine #x  ... unfinished

32-bits ARM-like, conditional instructions

This one is not complete ... it started out as nearly a copy of the
classic ARM, but I didn't finish it. I left it in here for educational
reasons and because ARM is fascinating.

Criticism
IMHO, putting a condition on every instruction is cool but not very practical.
I mean, I believe that the majority of instructions would not use a special
condition. Leaving out the condition field would free up 4 bits of space.

The barrel shifter is very clever. In practice you only need the shift-left
operation to aid loading large constants, so you could do away with the
3 bits for encoding the barrel shift mode. This would free up 3 bits of space.

Opcodes:

unused  instruction  size  condition  reg  reg  reg  shift  value
   |         |        |        |       |    |    |     |      |
 1 bit    5 bits      2     4 bits     4    4    4     3    5 bits

* unused bit reserved for future extension (more instructions)
* true load/store architecture
* 3 active registers per instruction
* optional barrel shifter
* immediate mov can use lower 16 bits
  or there could be a separate mnemonic for an instruction that uses 11 bits
  (fields: reg,reg,shift) and does a barrel shift left with value
* r14 = sp, r15 = pc  (or else a lot of things couldn't work)
* jump range 2**22 * 4 = +/- 8 MB
  or +/- 16 MB if you include the unused bit
  or make a jmp instruction that has no condition field

barrel shifter mode:
00  no shift
01  reg, reg, reg, lsl #value
02  reg, reg, reg, lsr #value
03  reg, reg, reg, asr #value
04  reg, reg, reg, ror #value
05  reg, reg, reg, rrx #value
06  .. undefined
07  .. undefined

--------------------------------

Machine #2

64-bit computer, but with 16-bit base opcodes

This one is the 64-bit successor of machine #1 at the top of this document.

16 general purpose registers: r0 - r15
sp register
pc register
sr register with flags: Z C N V P T I

Opcodes
=======
5 bits base instruction:

00  ld indirect/with update
01  st indirect/with update
02  ldm /with update
03  stm /with update
04  mov: reg/imm
05  movsx
06  movzx
07  add
08  sub
09  adc
0a  sbc
0b  mul
0c  div
0d  umul
0e  udiv
0f  and
10  or
11  xor
12  cmp
13  tst
14  jmp / jsr + condition
15  push / pop
16  sp related instructions
17  lsl
18  lsr
19  asr
1a  ror
1b  set / clr / bit / tas
1c  int / wait
1d  .. undefined
1e  .. undefined
1f  floating point instructions (reserved future extension)


instruction  mode  reg1  reg0  immediate or reglist
     |        |     |     |          |
   5 bits     3     4     4     16 or 32 or 64 bits

mode:
0-3  .b, .w, .l, .q   register direct
4-7  .b, .w, .l, .q   immediate

* movsx/movzx with .q is meaningless, but does exist

ld/st and ldm/stm mode:
0-3  .b, .w, .l, .q   register indirect
4-7  .b, .w, .l, .q   register indirect with update

Note: there is no pre-decrement, only post-increment

ldm/stm reglist in 20 bits: reg1|imm16  (note: high bit unused)

push/pop mode:
00   push single   (regnum in 8 bits: reg1|reg0)  (note: many bits unused)
01   pop single
02   push reglist  (reglist in 20 bits: reg1|imm16)  (note: high bit unused)
03   pop reglist
4-7  unused

sp mode:
00  mov sp, reg0
01  mov reg0, sp
02  mov sp, imm64
03  undefined  (space for: mov pc, lr ??)
04  add sp, reg0
05  sub sp, reg0
06  add sp, imm24  (reg1|reg0|imm16)
07  sub sp, imm24

jmp/jsr mode:
00  jmp reg0        absolute jmp to register loaded address
01  jmp pc+imm20    pc-relative jmp (signed) within 512 KB range
02  jmp pc+imm36    pc-relative jmp (signed) within 32 GB range
03  jmp imm64       absolute jmp
04  jsr reg0        absolute jmp to register loaded address
05  jsr pc+imm20    pc-relative jmp (signed) within 512 KB range
06  jsr pc+imm36    pc-relative jmp (signed) within 32 GB range
07  jsr imm64       absolute jmp

imm20 = reg0|imm16  and  imm36 = reg0|imm32

* since the pc is always aligned on a word, pc-relative branching could have
  a range twice as large (FIXME?)
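
The field layout and the pc-relative jump modes above can be sketched in a few lines of Python. This is only an illustration, not part of the design: the function names are made up, and it assumes that the fields are packed most-significant-first in the order the diagram lists them (instruction, mode, reg1, reg0) and that in "reg0|imm16" the reg0 nibble forms the high bits.

```python
def decode_base(word):
    """Split a 16-bit base opcode into (instr, mode, reg1, reg0).

    Assumption: instruction is the top 5 bits, then 3 mode bits,
    then reg1 (4 bits), then reg0 (4 bits).
    """
    word &= 0xFFFF
    instr = (word >> 11) & 0x1F
    mode = (word >> 8) & 0x07
    reg1 = (word >> 4) & 0x0F
    reg0 = word & 0x0F
    return instr, mode, reg1, reg0


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def rel20(reg0, imm16):
    """Signed 20-bit displacement; assumes reg0 holds the high nibble."""
    return sign_extend(((reg0 & 0xF) << 16) | (imm16 & 0xFFFF), 20)


def rel36(reg0, imm32):
    """Signed 36-bit displacement; assumes reg0 holds the high nibble."""
    return sign_extend(((reg0 & 0xF) << 32) | (imm32 & 0xFFFFFFFF), 36)


# jmp (opcode 0x14), mode 01 (pc+imm20), condition 02 (eq), reg0 = 0xf
word = (0x14 << 11) | (0x1 << 8) | (0x2 << 4) | 0xF
print(decode_base(word))
print(rel20(0xF, 0xFFFC))
```

With these assumptions the 20-bit displacement spans -2**19 .. 2**19 - 1 bytes (512 KB either way) and the 36-bit one spans -2**35 .. 2**35 - 1 bytes.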

reg1 is branch condition based on flags: Z C S O  or  Z C N V

00  always
01  never   (unused; 'nop' can be encoded as jmpnv or jsrnv)
02  eq  or z
03  ne  or nz
04  c   or cs or hs    unsigned higher or same    or jae
05  nc  or cc or lo    unsigned lower             or jb
06  s   or mi
07  ns  or pl
08  o   or vs
09  no  or vc
0a  gt                 Z=0 and S=O
0b  ge                 S=O
0c  lt                 S!=O
0d  le                 Z=1 or S!=O
0e  hi                 unsigned higher            or ja
0f  ls                 unsigned lower or same     or jbe

* ret is encoded as 'pop pc'

bit testing mode:
value is reg1, but can be only 0..7 (8 bits, highest bit of reg1 is unused)
mode is:
00   set [reg], value
01   clr [reg], value
02   bit [reg], value
03   tas [reg], value
4-7  unused

bit shift/rotate instructions:

instruction  mode  value  reg0
     |        |      |     |
   5 bits     1      6     4

bit shift mode:
00  shift reg0, reg1     reg1 is encoded in value; 2 bits unused
01  shift reg0, value


int   mode  value
 |     |      |
 5     3    8 bits

int/wait mode:
00   int number
01   wait (waits for any interrupt)
2-7  unused

--------------------------------

Machine #3

64-bit RISC computer with 32-bit instructions

32 general purpose registers: r0 - r31
r30 is sp
r31 is pc
sr register with flags: Z C N V P T I

- There is no link register (it's rather trivial to change this)
- 64-bit addressing, but the pc is always aligned on 4 bytes.
- A word is now 32 bits wide. I like to use these terms:
    .b  byte
    .w  wyde
    .l  "long"
    .q  quad-wyde
- Every instruction is 32 bits long.
- An instruction with different operands is a different opcode;
  there are no addressing mode bits (other than the size bits).
  [This leaves more room for future extensions.]
- The opcode takes 6 bits.
  (Currently, 5 bits would also fit. But I like having the prospect
  of future extensions)
- size (.b, .w, .l, .q) takes 2 bits.
- A register number takes 5 bits.
- An immediate value is 16 bits or 8 bits for some instructions.
- A condition (for branching) takes 4 bits.

Instructions
============
Load/store:
    ldr reg, reg, reg
    ldr reg, reg, #imm8
    str reg, reg, reg
    str reg, reg, #imm8

imm8 is a signed post-increment to index reg2 * size (max. range +/- 127*4).
If you need larger increments, use a register.
You can do a pre-increment by appending a '!' sign to reg2:

    ldr reg, reg!, reg
    ldr reg, reg!, #imm8
    str reg, reg!, reg
    str reg, reg!, #imm8

Loading an address:
    lea reg, #offset

This special instruction maximizes the range of the immediate value.
Hence offset is a signed 23-bit pc-relative value: pc +/- 4 MB.

Register setting:
    mov reg, reg
    mov reg, #imm16

mov.l and mov.q perform zero-extend.
Use mov.b, mov.w if you don't want to zero-extend.

Load top:
    movt reg, #imm16

* for swapping halves, use bit rotation

Sign extending:
    movsx reg, reg
    movsx reg, #imm16

So, you can load a 64-bit signed value of -1 by using movsx reg, #-1
even though the immediate is only 16 bits wide.

Arithmetic and bitwise ops:
    add, sub, mul, div, umul, udiv
    and, or, xor

These have three-way operand modes:
    instr reg, reg, reg
    instr reg, reg, #imm16

Comparison:
    cmp reg, reg
    cmp reg, #imm16
    tst reg, reg
    tst reg, #imm16

Branching:
    jmp reg
    jmp #imm
    jcc reg
    jcc #imm
    jsr reg
    jsr #imm
    jsrcc reg
    jsrcc #imm

Register jumps are one opcode that always includes a condition field.
Immediate jumps use many opcodes to maximize the range of the jump.
The range for an immediate branch is +/- 128 MB.
The range for a conditional immediate branch is +/- 8 MB.
If you want to jump farther, set up a register and use jmp with
register direct addressing mode.

* pc-relative jumps can be made with add/sub:

    add pc, pc, #20

  However, these are short jumps; jmp/jsr have a greater range.

* jsr automatically pushes the return address to the stack.
  Pop the pc to return:

    ldr pc, sp, #4        ; 'ret' assembler macro

  (There is no link register like on the ARM).

Modifying status register:
    mrs reg, sr
    msr sr, reg
    msr sr, #imm16

* there are enough bits left for the immediate value to be 26 bits,
  if needed.
  Moreover, there could be multiple status registers like a supervisor sr
  and a user sr

Atomic locking:
    tas reg, reg
    tas reg, #imm8

System calls and interrupts:
    sys #imm8
    wait

Load/store multiple:
    ldm r0, r0, r6
    stm r0, r10, r15
    stm sp!, r0, r6
    ldm sp!, r0, r6

You can store a sequence of registers by specifying 'from' and 'to'.
You can use the '!' sign to update the index register.
When using update, the index register will be post-incremented.
However, a store to sp will pre-decrement sp. So the stack works as expected.
Moreover, a store to sp will store the registers backwards, so that a load
also works as expected.
The register numbering loops around, so you can also do:

    stm sp!, pc, r4

and have the pc come out first when loading.

* This is a little different from ldm/stm on the ARM, more like PowerPC

Encoding
========
o   opcode
z   size .b, .w, .l, .q
x   destination register
y   source register
a   third operand register
T   instruction select sub category
p   pre-indexing (ldr/str) or update (ldm/stm)
i   immediate value or address offset
c   condition (for branches)
#   reserved for future extension
u   unused bit

generic with 3 registers:
    000000TT TTTxxxxx zzpyyyyy uuuaaaaa

generic with 2 registers + imm8:
    000001TT TTTxxxxx zzpyyyyy iiiiiiii

generic with 2 registers + imm16: (use many different opcodes)
    ooooooyy yyyxxxxx iiiiiiii iiiiiiii

with 2 registers:
    oooooo#T TTTxxxxx zzuyyyyy 00000000   (low byte zeroes or unused)

with 1 register + imm16:
    oooooo## TTTxxxxx iiiiiiii iiiiiiii

with 1 register + imm8:
    oooooo## TTTxxxxx uuuuuuuu iiiiiiii

single register:
    oooooo## TTTxxxxx uuuuuuuu uuuuuuuu

single immediate value:
    oooooo## TTTiiiii iiiiiiii iiiiiiii

no operands:
    oooooo## TTTuuuuu uuuuuuuu uuuuuuuu

Other instructions:

lea:
    ooooooii iiixxxxx iiiiiiii iiiiiiii

branching to pc-relative address:
    ooooooii iiiiiiii iiiiiiii iiiiiiii

conditional branching to register address:
    oooooocc ccuxxxxx uuuuuuuu uuuuuuuu

conditional branching to pc-relative address:
    oooooocc cciiiiii iiiiiiii iiiiiiii

tas: single reg with imm8
sys (system call): single immediate value
mrs, msr with register: single register
msr with imm16: single immediate value

Opcode numbering
================
6 bits opcodes -> 64 opcodes
but it uses categories of addressing modes to get much more out of it.

00  three-way instruction register direct
    00  ldr
    01  str
    02  add
    03  sub
    04  adc
    05  sbc
    06  mul
    07  div
    08  umul
    09  udiv
    0a  and
    0b  or
    0c  xor
    0d  lsl
    0e  lsr
    0f  asr
    10  ror
    11  rrx
    12  ldm
    13  stm
    ..  undefined

01  two registers + imm8
    00  ldr
    01  str
    02  add
    03  sub
    04  adc
    05  sbc
    06  mul
    07  div
    08  umul
    09  udiv
    0a  and
    0b  or
    0c  xor
    0d  lsl
    0e  lsr
    0f  asr
    10  ror
    11  rrx
    ..  undefined

02  two registers
    00  mov
    01  movt
    02  movsx
    03  cmp
    04  tst
    05  set
    06  clr
    07  bit
    08  tas
    ..  undefined

03  one register + imm16
    00  mov
    01  movt
    02  movsx
    03  cmp
    04  tst
    05  undefined
    ..  undefined

04  one register + imm8
    00  mov
    01  movt
    02  movsx
    03  cmp
    04  tst
    05  set
    06  clr
    07  bit
    08  tas
    ..  undefined

05  single register
    00  msr
    01  mrs

06  single immediate
    00  msr
    01  sys

07  no operands
    00  wait

two registers + imm16
08  ldr with imm16
09  str with imm16
0a  add imm16
0b  sub imm16
0c  adc imm16
0d  sbc imm16
0e  mul imm16
0f  div imm16
10  umul imm16
11  udiv imm16
12  and imm16
13  or imm16
14  xor imm16
..  reserved block

30  lea
31  jmp register direct with condition
32  jsr register direct with condition
33  jmp immediate, pc-relative
34  jsr immediate, pc-relative
35  jmp immediate, pc-relative with condition
36  jsr immediate, pc-relative with condition
..  undefined ... and reserved

# EOB
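
As a companion to the Machine #3 encoding tables, here is a minimal Python sketch of how a disassembler might pull the fields out of a 32-bit instruction word. It is an illustration only: the function names are made up, the leftmost character of each bit pattern is assumed to be the most significant bit, and the branch immediate is assumed to count 4-byte instructions (which is what makes the stated +/- 128 MB and +/- 8 MB ranges come out).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode_three_reg(word):
    """Fields of '000000TT TTTxxxxx zzpyyyyy uuuaaaaa' (assumed MSB first)."""
    return {
        "opcode": (word >> 26) & 0x3F,  # o: 6 bits
        "sub": (word >> 21) & 0x1F,     # T: instruction select
        "dest": (word >> 16) & 0x1F,    # x: destination register
        "size": (word >> 14) & 0x03,    # z: .b, .w, .l, .q
        "pre": (word >> 13) & 0x01,     # p: pre-indexing / update
        "src": (word >> 8) & 0x1F,      # y: source register
        "third": word & 0x1F,           # a: third operand register
    }


def branch_offset(word):
    """Byte offset of 'ooooooii iiiiiiii ...' (26-bit imm, assumed * 4)."""
    return sign_extend(word & 0x03FFFFFF, 26) * 4


def cond_branch(word):
    """(condition, byte offset) of 'oooooocc cciiiiii ...' (22-bit imm)."""
    cond = (word >> 22) & 0x0F
    return cond, sign_extend(word & 0x003FFFFF, 22) * 4


# add.q r1, r2, r3 -> category 00, sub-opcode 02, size 3 (.q)
word = (0x02 << 21) | (1 << 16) | (3 << 14) | (2 << 8) | 3
print(decode_three_reg(word))
```

Under these assumptions the most negative unconditional offset is -2**27 bytes (128 MB) and the most negative conditional offset is -2**23 bytes (8 MB), matching the ranges given in the Branching section.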