| Accessing tables with Z80 ASM
| tokumaru msx lover Posts: 83 | Posted: July 10 2006, 04:56 | Hi again guys.
I'm still learning Z80 ASM, so I've been looking at some sample code around here. The thing is I usually use a lot of look-up tables in my programs, and it seems that the Z80 lacks an efficient way to load information from them.
All the code I've seen that uses tables loads the index of the element into hl and then adds the base address of the table. That seems very slow. In 6502 ASM, I'm used to loading values from tables using 8-bit index registers, and the 16-bit base address goes along with the load instruction (or can also be in RAM, which is useful for big tables). It's simple and fast, 'cause there is no add, just load the index and load the value from the table.
Now, is it really true that you must perform an addition (worse, a 16-bit addition) to find the location of an element in a table? I can only imagine how bad it gets when you are working with multiple-byte values, where adding the base address to the element index is not enough, and you'd also have to multiply the index of the element by the number of bytes in each element. Seems dead slow.
Doesn't the Z80 have a couple of index registers? I didn't find much info on how they work, and no one seems to use them. I'm assuming they aren't of much use, as the gameboy CPU is a modified Z80 that doesn't have them.
Any thoughts on this?
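For reference, the general case described in this post (multiply the index by the entry size, then add the base address) can be sketched as follows. This is only an illustration: TABLE is a hypothetical table of 16-bit little-endian entries, the index is assumed to arrive in A, and the cycle counts are the documented Z80 T-states without any machine-specific wait states.

```
; Hypothetical names: TABLE holds 16-bit little-endian entries,
; A holds the 8-bit element index.
        ld   l,a            ;  4  HL = index
        ld   h,0            ;  7
        add  hl,hl          ; 11  index * 2 (two bytes per entry)
        ld   de,TABLE       ; 10
        add  hl,de          ; 11  HL = address of the entry
        ld   e,(hl)         ;  7  low byte
        inc  hl             ;  6
        ld   d,(hl)         ;  7  high byte, DE = entry
                            ; 63 T-states in total
```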
| | DamageX msx freak Posts: 168 | Posted: July 10 2006, 06:51 | I had the same feeling when I was translating my music player routine from 6502 asm to Z80 asm. The best idea I could come up with was to put the base address in IX or IY, add the index, and then access other tables immediately following the first one by using the 8-bit offset.
So this:
ldx #1
lda status,x
sta oldstatus,x
could become:
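ld ix,status ; ix = base address of the first table
ld bc,$0001 ; bc = index
add ix,bc ; ix = status + index
ld a,(ix+0) ; read status[index]
ld (ix+30),a ; write oldstatus[index], assuming oldstatus sits 30 bytes after status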
It looks like the ix and iy registers can be more useful if you use undocumented Z80 opcodes but I'm not sure which ones are safe to use or which ones could cause a problem (like on R800 maybe). I'm thinking that for speed critical code you might even have tables aligned at 256-byte boundaries and then load ixh with the base address and ixl with the index.
| | ARTRAG msx master Posts: 1802 | Posted: July 10 2006, 08:55 | If you use 256-byte aligned tables you can use HL, and each access costs only one load instruction (into L)
ld h, table_address/256
ld l,element_number
ld a,(hl)
etc.
ld l,new_element_number
ld a,(hl)
etc.
If you have two tables and objects in fixed positions (as typically happens in records),
you can use IX and IY like this:
ld ix, record_1
ld iy, record_2
ld a,(IX+field1)
ld (IY+field1),a
if you have to access many fields in a record and your code is in ram you can use this SMC
ld ix, record
ld a,offset
ld (incode-1),a ; it is supposed to point to the offset in the ld a,(ix+0) instruction
ld a,(ix+0)
incode: etc.
| | jltursan msx professional Posts: 887 | Posted: July 10 2006, 09:45 | And keep in mind that the 6502 is a memory-access-oriented CPU while the Z80 is a register-oriented CPU, so lookup tables are a lot faster on the former. The Z80 index registers IX and IY are painfully slow.
You can find a lot of discussion that involves lookup tables (and other tricks) in this thread:
speeding up assembly routine
| | tokumaru msx lover Posts: 83 | Posted: July 10 2006, 14:35 | Thanks for the replies guys. I liked your ideas, and I think that trying to keep the tables aligned to 256-byte pages and using only l as an index is the best choice in many cases.
For cases when you only have to load one or two values, I wouldn't mind having to add, but there are other complicated things (like level maps that are not linear - I use a lot of those) that could become very slow.
About IX and IY, does anyone actually use those in games? In the past few weeks I've looked at some source code and I've never seen them being used.
| | ARTRAG msx master Posts: 1802 | Posted: July 10 2006, 15:08 | If you use arrays of complex structs to represent enemies or objects, they can be quite handy.
LD B,NUMBER_OF_OBJECTS ; B = LOOP COUNTER
LD IX,FIRST_RECORD ; IX = POINTER TO THE CURRENT RECORD
LOOP:
LD A,(IX+FIELD1)
; ETC
LD A,(IX+FIELD2)
; ETC
; DO ALL THE ACTIONS YOU NEED USING HL,DE,C,A
LD DE,RECORDSIZE
ADD IX,DE ; ADVANCE IX TO THE NEXT RECORD
DJNZ LOOP
When you need to work on two lists of records
you can follow the same template, using IY as the
pointer to the second list
| | tokumaru msx lover Posts: 83 | Posted: July 10 2006, 15:33 | You are right. I can see it being useful for handling enemies in a list. The "field" part could be used to access each property of the enemy/object, such as its coordinates, speed, health, etc. You could do all the work on an enemy/object using the same index register (only changing it for the next object), while you'd have to keep messing with hl to access each property if you were to do it that way.
On the other hand, if IX and IY are slow as jltursan said, you might be better off using hl and incrementing l to access each field. I wouldn't know about the speed yet, gotta check it up.
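The speed question can be checked roughly against the documented Z80 instruction timings. The sketch below reads three consecutive fields of a record both ways. ENEMY and the field layout are hypothetical, and the counts are plain Z80 T-states without any machine-specific wait states.

```
; Hypothetical record at ENEMY: byte 0 = x, byte 1 = y, byte 2 = speed.

; Version 1: IX with fixed field offsets
        ld   ix,ENEMY       ; 14
        ld   a,(ix+0)       ; 19  x
        ld   b,(ix+1)       ; 19  y
        ld   c,(ix+2)       ; 19  speed
                            ; 71 T-states in total

; Version 2: HL, reading the fields in order
        ld   hl,ENEMY       ; 10
        ld   a,(hl)         ;  7  x
        inc  l              ;  4  only safe if the record does not cross
                            ;     a 256-byte page; otherwise INC HL (6)
        ld   b,(hl)         ;  7  y
        inc  l              ;  4
        ld   c,(hl)         ;  7  speed
                            ; 39 T-states in total
```

The HL version only wins when the fields are visited in order. With IX, any field can be reached at any time without disturbing the pointer.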
| | Sonic_aka_T msx guru Posts: 2269 | Posted: July 10 2006, 15:35 | Quote:
| About IX and IY, does anyone actually use those in games? In the past few weeks I've looked at some source code and I've never seen them being used.
|
Hehe, try looking at Microcabin games! It's really quite amazing these games are as fast as they are! Heck, these ppl (or their compiler, probably) would've used IX/IY as an accumulator if they could! Having said that, there are occasions in which IX/IY can be handy, and even faster than using HL/DE.
| | Sonic_aka_T msx guru Posts: 2269 | Posted: July 10 2006, 15:37 | Quote:
| It looks like the ix and iy registers can be more useful if you use undocumented Z80 opcodes but I'm not sure which ones are safe to use or which ones could cause a problem (like on R800 maybe).
|
These are all official instructions on the R800. Apart from the undocumented flags, the R800 'emulates' the Z80 perfectly. The only instruction I'm not sure of is IN F,(C) or whatever it was...
| | jltursan msx professional Posts: 887 | Posted: July 10 2006, 15:47 | Until now I've been using IX and IY very commonly, but only because speed doesn't matter and I'm a bit lazy. They're usually handier than planning a better table structure. Most of the time, a good field order in the records of your table can save you from using index registers; if you're able to process the data in every record sequentially, you'll save a lot of cycles by using only the common HL/DE and BC registers.
| | pitpan msx master Posts: 1418 | Posted: July 10 2006, 17:12 | IIRC, the undocumented Z80 instruction SLL x is not supported by the R800 CPU, which executes it as a kind of bit-test instruction. Therefore: be careful. All my MSX(1) games first detect whether they are running on a Turbo-R computer. If so, they switch to the Z80 CPU [SETCPU].
| | Edwin msx professional Posts: 635 | Posted: July 10 2006, 18:49 | Quote:
| About IX and IY, does anyone actually use those in games? In the past few weeks I've looked at some source code and I've never seen them being used.
|
I used them frequently in U:U as well. I think any more complex project will benefit from them. In some places you just need the extra registers. I do tend to keep them out of loops though. I actually use the 8-bit versions (undocumented instructions) often as storage or counters when I'm out of other registers.
| | ARTRAG msx master Posts: 1802 | Posted: July 10 2006, 19:43 | Actually many C compilers use IX or IY to manage the stack frame during function calls.
The stack is used both for local variables and for parameter passing, so the
code of the function accesses the local variables with many LD A,(IX+offset)
Automatic variables are also stored on the stack and accessed by LD A,(IX+offset)
| | PingPong msx master Posts: 1069 | Posted: July 10 2006, 20:17 | Quote:
| Actually many C compilers use IX or IY to manage the stack frame during function calls.
The stack is used both for local variables and for parameter passing, so the
code of the function accesses the local variables with many LD A,(IX+offset)
Automatic variables are also stored on the stack and accessed by LD A,(IX+offset)
|
@tokumaru: you need to switch your coding style when moving from 6502 assembly to Z80 assembly.
If you try to code the Z80 with 6502 style in mind, it may work, but you get poor performance.
Do not think that the IX and IY registers are useless; as ARTRAG said, they are quite useful to quickly address fixed structures like this (in 'C')
struct mystruct
{
BYTE data;
int offset;
char* p;
};
these can be quickly 'mapped' in : (for example)
ld ix, mystructaddr
ld l,(ix+1)
ld h,(ix+2) ; hl = value of offset
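The same mapping can be written with named field offsets instead of bare numbers. This is a sketch under assumptions the post does not state: BYTE is one byte, int and the pointer are 16-bit little-endian, and the compiler adds no padding. The MS_* names are made up for illustration.

```
; Assumed layout of struct mystruct (hypothetical offset names):
MS_DATA    equ 0            ; BYTE data   (1 byte)
MS_OFFSET  equ 1            ; int offset  (2 bytes, low byte first)
MS_P       equ 3            ; char* p     (2 bytes, low byte first)

        ld   ix,mystructaddr
        ld   a,(ix+MS_DATA)       ; A  = data
        ld   l,(ix+MS_OFFSET)
        ld   h,(ix+MS_OFFSET+1)   ; HL = offset
        ld   e,(ix+MS_P)
        ld   d,(ix+MS_P+1)        ; DE = p
```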
You are right about post or pre addressing feature of 6502 (that is a true gem), but remember, that can only work in a 256 byte boundary and when crossing the boundary you come back to start (and get an extra cycle delay for that boundary crossing). So you can use lookup tables only in page 0 (first 256 bytes), since the 6502 lacks a way to specify the base address.
With the Z80, instead, you can use H as the high-order byte of the address and L as the low-order byte. This allows you to relocate your lookup table virtually anywhere in the 64K address space. For some extra cost you can also have a lookup table larger than 256 bytes.
Regarding speed: I do not know anything about the platform on which you programmed the 6502, but remember this:
The C64 had a 6510 (a 6502 derivative) inside,
MSX/CPC/ZX Spectrum had a Z80.
Typically, software developed on the Z80 outperformed the 6502 in all cases where computational power was required (all 3D isometric games/chess games were generally quicker on the ZX than on the C64).
If you try to implement multiplication via a lookup table on the Z80 you do not get the best results. It is faster to use one of the public domain routines available on the net.
The most important thing is to make a "switch" in how you think about an algorithm. As a general rule when coding, I can give you these hints:
When you choose a register to hold a frequently used value, keep these rules in mind:
Do not choose 'A'. It is the register that has the greatest chance of being destroyed, because several instructions work only with it (for example addition: you cannot add B to C and get the result in B; one of the registers must be 'A').
When doing memory accesses, use first DE, then BC, then HL, for the same reasons. If you run out of registers, consider also using IX and IY or switching register banks. In almost all cases these techniques work quicker than, or at least at the same speed as, doing memory accesses in 6502 fashion.
The main problem for you, coming from 6502 assembly, is that Z80 assembly is not as orthogonal as the 6502 mostly is.
You will find some difficulties initially because of this.
| | tokumaru msx lover Posts: 83 | Posted: July 10 2006, 22:30 | Quote:
| @tokumaru: you need to switch your coding style when moving from 6502 assembly to Z80 assembly.
If you try to code the Z80 with 6502 style in mind, it may work, but you get poor performance.
|
Yeah, I know that. I do want to be able to output Z80 code fluently, as I can with 6502. In fact, this was one of my concerns when I started learning the Z80: I was afraid of forgetting the 6502 way of things. Hope I can manage both! =)
Quote:
| Do not think that the IX and IY registers are useless; as ARTRAG said, they are quite useful to quickly address fixed structures like this (in 'C')
|
I was just arguing the "quickly" part, as the instructions that make use of IX and IY are quite slow.
Quote:
| You are right about post or pre addressing feature of 6502 (that is a true gem), but remember, that can only work in a 256 byte boundary and when crossing the boundary you come back to start (and get an extra cycle delay for that boundary crossing). So you can use lookup tables only in page 0 (first 256 bytes), since the 6502 lacks a way to specify the base address.
|
I can see advantages when it comes to indexed addressing in both processors, but I can also see flaws in both of them. Neither is perfect.
Quote:
| With the Z80, instead, you can use H as the high-order byte of the address and L as the low-order byte. This allows you to relocate your lookup table virtually anywhere in the 64K address space. For some extra cost you can also have a lookup table larger than 256 bytes.
|
I just regret that when using hl to access tables you almost always have to perform an addition. The way you mentioned will only work when loading 1-byte values. For more than that you'll either have to INC L to access each byte (worse, since you'd have to multiply the original index by the size of each entry to find the first byte of the value) or you could split the table in multiple ones, containing all the first bytes, then the second bytes, etc and INC H to move from one table to the other. Either way you have to do some math.
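The split-table layout mentioned above (all the first bytes in one page, all the second bytes in the next, with INC H to move between them) could look like the sketch below. TABLE_LO is a hypothetical label assumed to sit on a 256-byte boundary, with the high-byte table in the page right after it; the counts are documented Z80 T-states without machine-specific wait states.

```
; Hypothetical layout: TABLE_LO holds the low bytes of up to 256 entries
; and starts on a 256-byte boundary; the high bytes fill the next page.
        ld   h,TABLE_LO/256 ;  7  page of the low-byte table
        ld   l,a            ;  4  A = element index, no multiply needed
        ld   e,(hl)         ;  7  low byte
        inc  h              ;  4  same index, next page
        ld   d,(hl)         ;  7  high byte, DE = entry
                            ; 29 T-states in total
```

The math is reduced to a single INC H per extra byte, at the cost of fixing the table position and limiting each table to 256 entries.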
Quote:
| Regarding speed: I do not know anything about the platform on which you programmed the 6502, but remember this:
The C64 had a 6510 (a 6502 derivative) inside,
MSX/CPC/ZX Spectrum had a Z80.
Typically, software developed on the Z80 outperformed the 6502 in all cases where computational power was required (all 3D isometric games/chess games were generally quicker on the ZX than on the C64).
|
I code for the NES (Nintendo Entertainment System). The clock of its processor is about half that of the MSX's Z80. That would compensate for the fact that Z80 instructions use many more cycles. Also, the Z80 has much more complex instructions, while 6502 ASM is always about tiny little steps. The Z80 can have an instruction that copies this to that, decrements that and increments this, and other such complex instructions; the 6502 could never do that.
So, yeah, I guess that the Z80 can do more than the 6502, I'm not arguing that. I just miss some features I liked! =)
Quote:
| If you try to implement multiplication via a lookup table on the Z80 you do not get the best results. It is faster to use one of the public domain routines available on the net.
|
I like to build my own routines and understand 100% of what I'm using in my code. I don't feel comfortable using other people's code. I'm a control freak! O.o
Quote:
| The most important thing is to make a "switch" in how you think about an algorithm.
|
I agree with you. I'm training to be able to do just that! Thank you for the tips. It is nice how with the Z80 you can do pretty complex tasks using only the registers and not touching the memory at all. The simplest tasks on the 6502 require you to use memory as a temporary medium. Not that I'm complaining, I like both CPUs.