Fastest possible multiplication routine?

Página 1/7
| 2 | 3 | 4 | 5 | 6

Por Prodatron

Paragon (1843)

imagem de Prodatron

05-05-2007, 01:34

Hi,

today I stumbled over a routine I developed for my old 3D demos and which I uploaded to the CPCWiki some time ago:

http://www.cpcwiki.com/index.php/Programming:Ultrafast_Multiplication

As this is 100% Z80 and not platform related it maybe interesting for you.
AFAIK it's the fastest way of multiplication which is possible on Z80 for this precision (16bit x 8/7bit).
I wonder, if it could be even done faster, and how much faster the R800 does with its inbuilt MUL-commands?

CU,
Prodatron

Entrar ou registrar-se para comentar

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 02:08

The code is a bit hard to read. How ultra fast is it?

Por Prodatron

Paragon (1843)

imagem de Prodatron

05-05-2007, 02:18

Sometimes I like to use ":" and put multiple commands in one line Smile
The whole thing uses (more or less) ONE routine for EACH rigth-side value, that's it. The speed depends on the rigth-side value (8/7 bit) as different routines are selected then.

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 02:20

Do you have some cpu cycle numbers? worst case/best case/ average...

Actually I guess I can do that myself. Its not that hard...

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 02:33

By doing shifts on DE (or BC) and accumulate on HL, you'll be faster in some cases

MUL073 for example can be implemented like this (input args may need to be modified to fit your code):

MUL073:
        ld      h,d
        ld      l,e
        sla     e
        rl      d
        sla     e
        rl      d
        add     hl,de
        sla     e
        rl      d
        sla     e
        rl      d
        add     hl,de 
        ld      a,h
        ret

Saves 26 cycles if I did the math correct.

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 02:46

Also, why do you use:

        PUSH BC
        RET

isntead of

        JP (HL)

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 02:52

I did a similar routine not too long ago. Don't have it on this computer but the I think I did the table lookup like this:

Mul:
        ld      h,MulTab
        add     hl,hl
        ld      b,(hl)
        inc     l
        ld      h,(hl)
        ld      h,b
        jp      (hl)

Saves another 20 cycles

Por nikodr

Paladin (750)

imagem de nikodr

05-05-2007, 04:35

Anyone with a real turbo-r to tell us about the r800 mul routines?Actually i am interested to buy a turbo-r by the way.How much would a good price be? 200-250 euros?

Por nikodr

Paladin (750)

imagem de nikodr

05-05-2007, 04:35

Sorry for double post Sad

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 07:00

I think some of the cases can be optimized quite a bit. here are some examples.
These particular cases are maybe 10-30% faster. But I think most of the cases in your routine are close to optimal. But if say 20% of the cases can be made significantly faster, I think it will be noticeable on the overall performance.

MUL007:
        ld      d,h
        ld      e,l
        add     hl,hl
        add     hl,de
        add     hl,hl
        add     hl,de
        ld      a,h
        ret

MUL011:
        ld      d,h
        ld      e,l
        add     hl,hl
        add     hl,hl
        add     hl,de
        add     hl,hl
        add     hl,de
        ld      a,h
        ret

MUL063:
        ld      d,h
        ld      e,l
        rr      h
        rr      l
        rr      h
        rr      l
        xor     a
        ld      h,l
        ld      l,a
        sbc     hl,de
        ld      a,h
        ret

MUL064:
        rr      h
        rr      l
        rr      h
        ld      a,l
        rrca
        ret

MUL065:
        ld      d,h
        ld      e,l
        rr      h
        rr      l
        rr      h
        rr      l
        ld      h,l
        ld      l,0
        add     hl,de
        ld      a,h
        ret

MUL073:
        ld      d,h
        ld      e,l
        sla     e
        rl      d
        sla     e
        rl      d
        add     hl,de
        sla     e
        rl      d
        sla     e
        rl      d
        add     hl,de 
        ld      a,h
        ret

MUL127:
        ld      d,h
        ld      e,l
        rr      h
        rr      l
        xor     a
        ld      h,l
        ld      l,a
        sbc     hl,de
        ld      a,h
        ret

Por dvik

Prophet (2200)

imagem de dvik

05-05-2007, 08:08

Some of the versions I put here may be wrong. I'll try to find the real implementation and post it here

Página 1/7
| 2 | 3 | 4 | 5 | 6