R800
This page was last modified 11:34, 23 November 2021 by Gdx. Based on work by Mars2000you and Grauw and others.

Description

The R800 (ASCII DAR800-X0G) is the central processing unit used by the MSX Turbo R home computer in Turbo ROM/RAM mode.

The R800 was designed by ASCII-Mitsui-Semiconductor & Co., Ltd. of Japan, with the aim of creating a faster CPU, while maintaining backwards compatibility with old MSX Zilog Z80-based hardware and software.

The working frequency is 7.159 MHz. (The oscillator runs at 28.63630 MHz, and an internal circuit divides this frequency by 4.)

Comparison with Z80

The arithmetic unit of the R800 is 16 bits wide, instead of the 4 bits traditionally used by the 8080 / Z80 line since the 4004. This change allows instructions that used to take 4 clocks to execute in 1 clock.

Memory addressing does not change: addresses remain 16 bits wide.

Opcodes and flags

In order to preserve compatibility with old MSX software, the R800 uses a superset of the Z80 instruction set. In addition to all the official Z80 opcodes, two multiplication instructions were added: MULUB (8-bit) and MULUW (16-bit). Many of the undocumented Z80 instructions were also made official, including all the opcodes that treat IX and IY as pairs of 8-bit registers (IXh, IXl, IYh, IYl). For more detail, see R800 Programming.
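As a rough illustration of what the two multiplication instructions compute, the following Python sketch models them as plain unsigned multiplications. The operand and result registers named in the comments (A and HL for MULUB, HL and DE:HL for MULUW) follow commonly circulated R800 documentation and are not stated on this page, so treat them as assumptions. The function names are made up for this example, and flag behaviour is not modelled.

```python
from typing import Tuple


def mulub(a: int, r: int) -> int:
    """Model of MULUB: unsigned 8-bit x 8-bit multiply.

    Assumption: the operands are A and an 8-bit register, and the
    16-bit product ends up in HL. Returns that 16-bit value.
    """
    if not (0 <= a <= 0xFF and 0 <= r <= 0xFF):
        raise ValueError("MULUB operands must fit in 8 bits")
    return (a * r) & 0xFFFF


def muluw(hl: int, rr: int) -> Tuple[int, int]:
    """Model of MULUW: unsigned 16-bit x 16-bit multiply.

    Assumption: the operands are HL and a 16-bit register pair, and
    the 32-bit product is split as DE (upper half) and HL (lower half).
    Returns the tuple (de, hl).
    """
    if not (0 <= hl <= 0xFFFF and 0 <= rr <= 0xFFFF):
        raise ValueError("MULUW operands must fit in 16 bits")
    product = (hl * rr) & 0xFFFFFFFF
    return (product >> 16) & 0xFFFF, product & 0xFFFF


if __name__ == "__main__":
    print(hex(mulub(0x12, 0x34)))
    de, hl = muluw(0x1234, 0x5678)
    print(hex(de), hex(hl))
```

The largest possible products (0xFF × 0xFF and 0xFFFF × 0xFFFF) fit in 16 and 32 bits respectively, so neither result can overflow its destination.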

As the R800 is not based directly on the Z80 but stems from the Z800 family, it lacks some of the other undocumented Z80 features. For instance, the undocumented flags in bits 3 and 5 of the F register do not take the same values as on the Z80 (which causes the R800 to fail the ZEXALL tests), and the undocumented opcode often called SLL was changed to behave like a true logical left shift (identical to SLA).
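The SLL difference can be made concrete with a small Python model. The Z80 behaviour shown here (the undocumented SLL shifts left and sets bit 0 to 1) is the widely documented one and is not spelled out on this page. The R800 behaviour is taken from the paragraph above. The function names are hypothetical, and only the result byte and the carry flag are modelled.

```python
from typing import Tuple


def sla(value: int) -> Tuple[int, int]:
    """Shift left arithmetic: bit 7 goes to carry, bit 0 becomes 0."""
    value &= 0xFF
    carry = (value >> 7) & 1
    return (value << 1) & 0xFF, carry


def z80_sll(value: int) -> Tuple[int, int]:
    """Undocumented Z80 SLL: like SLA, but bit 0 is set to 1."""
    value &= 0xFF
    carry = (value >> 7) & 1
    return ((value << 1) | 1) & 0xFF, carry


def r800_sll(value: int) -> Tuple[int, int]:
    """Same opcode on the R800: described above as identical to SLA."""
    return sla(value)


if __name__ == "__main__":
    for v in (0x00, 0x40, 0x81):
        print(hex(v), z80_sll(v), r800_sll(v))
```

Software that relies on the Z80 result having bit 0 set will therefore see a value that differs by one on the R800.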

Speed

On the hardware side, radical changes were made. The internal 4-bit ALU of the Z80 was replaced with a new 16-bit ALU. Opcodes like ADD HL,BC (which previously took 11 clock cycles) can run in as little as one cycle under some conditions. The CPU clock speed is 7.15909 MHz, and instructions take between 1 and 7 cycles. That makes the CPU much faster than a Z80 at 3.579545 MHz. The data bus remained 8 bits wide to maintain compatibility with old hardware.
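To put the ADD HL,BC figures in perspective, the sketch below converts cycle counts into execution times using only the clock speeds and cycle counts quoted above. It reflects the best case ("in some conditions") and ignores waitstates, page breaks and refresh. The helper names are made up for this example.

```python
Z80_CLOCK_HZ = 3_579_545   # Z80 clock quoted in the text
R800_CLOCK_HZ = 7_159_090  # R800 clock quoted in the text (7.15909 MHz)


def instruction_time_us(cycles: int, clock_hz: int) -> float:
    """Execution time in microseconds for a given cycle count."""
    return cycles / clock_hz * 1e6


def speedup(z80_cycles: int, r800_cycles: int) -> float:
    """How many times faster the R800 runs an instruction than the Z80."""
    z80_time = instruction_time_us(z80_cycles, Z80_CLOCK_HZ)
    r800_time = instruction_time_us(r800_cycles, R800_CLOCK_HZ)
    return z80_time / r800_time


if __name__ == "__main__":
    # ADD HL,BC: 11 cycles on the Z80, 1 cycle on the R800 in the best case.
    print(f"Z80:  {instruction_time_us(11, Z80_CLOCK_HZ):.3f} us")
    print(f"R800: {instruction_time_us(1, R800_CLOCK_HZ):.3f} us")
    print(f"best-case speedup: {speedup(11, 1):.1f}x")
```

Because the R800 clock is exactly twice the Z80 clock, the best-case speedup for this instruction works out to 11 × 2 = 22.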

Cycles and waitstates

Additional changes were made in the way the CPU fetches opcodes. The original Z80 uses 4 cycles to fetch a simple instruction like OR A, and an additional waitstate is issued on the MSX architecture. A review of the Z80 fetch mechanism in a typical MSX environment is needed to understand the R800:

  • Z80, cycle 1: set the upper 8 bits of the address
  • Z80, cycle 2: set the lower 8 bits of the address
  • Z80, cycle 3: waitstate
  • Z80, cycle 4: refresh, part 1
  • Z80, cycle 5: refresh, part 2

Since most MSX implementations use RAM arranged as a 256×256-byte array, two cycles are required to set the address for the fetch. The R800 avoids this by remembering the last known state of the upper 8 bits. If the next instruction lies within the same 256-byte page, the upper 8 bits are not set again, and a cycle is saved. However, on the Z80, the refresh cycles destroy the information on the upper bits, so a workaround was needed.
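The idea of remembering the upper address byte can be sketched as a small Python model that counts how many fetches in a sequence need the extra address cycle. This is only an illustration of the rule described above, not a description of the actual circuit. The function name is hypothetical, and the model assumes that the very first fetch always has to set the upper byte.

```python
from typing import Iterable


def extra_address_cycles(fetch_addresses: Iterable[int]) -> int:
    """Count fetches whose upper address byte differs from the previous one.

    Assumption: nothing is remembered before the first fetch, so the
    first fetch always costs the extra cycle.
    """
    last_high = None
    extra = 0
    for address in fetch_addresses:
        high = (address >> 8) & 0xFF
        if high != last_high:
            extra += 1  # upper 8 bits must be set again
        last_high = high
    return extra


if __name__ == "__main__":
    # Four consecutive fetches that cross from page 0x40 into page 0x41.
    print(extra_address_cycles([0x40FE, 0x40FF, 0x4100, 0x4101]))
```

In this model, code that stays within one 256-byte page pays the extra cycle only once, while code that jumps between pages pays it on every page change.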

The solution used in the R800 was to refresh entire blocks of RAM, instead of refreshing one line of RAM on each instruction issued. Every 30 μs, the CPU is halted for 4 μs, and this time is used to refresh a block of the RAM. Since there is no refresh between instruction fetches, and the waitstate is removed thanks to faster RAM chips, simple instructions can be issued using only one cycle. This cycle corresponds to cycle 2 in the Z80 example above; cycle 1 becomes optional and is only issued when the program crosses a 256-byte boundary.
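The cost of the block refresh can be estimated from the 30 μs and 4 μs figures above. The text does not say whether the 4 μs halt is part of the 30 μs period or comes on top of it, so the Python sketch below supports both readings through a hypothetical `halt_included` parameter. The function names are also made up for this example, and the results are estimates rather than measured values.

```python
R800_CLOCK_HZ = 7_159_090  # 7.15909 MHz, as quoted in the text


def refresh_overhead(period_us: float = 30.0,
                     halt_us: float = 4.0,
                     halt_included: bool = True) -> float:
    """Fraction of wall-clock time the CPU spends halted for refresh.

    halt_included=True  : the halt is part of each period (4 out of 30).
    halt_included=False : the halt follows each period (4 out of 34).
    """
    total_us = period_us if halt_included else period_us + halt_us
    return halt_us / total_us


def effective_clock_hz(clock_hz: int = R800_CLOCK_HZ,
                       halt_included: bool = True) -> float:
    """Average number of usable cycles per second after refresh halts."""
    return clock_hz * (1.0 - refresh_overhead(halt_included=halt_included))


if __name__ == "__main__":
    for included in (True, False):
        overhead = refresh_overhead(halt_included=included)
        clock = effective_clock_hz(halt_included=included)
        print(f"halt_included={included}: {overhead:.1%} halted, "
              f"{clock / 1e6:.3f} MHz effective")
```

Under either reading the CPU loses somewhat more than a tenth of its time to refresh, which is still far less than the two refresh cycles the Z80 spends on every opcode fetch.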

Important note

All of this applies only to the fast RAM used in the MSX Turbo R. External hardware connected through the cartridge slots uses timings similar to those of the Z80. Not even the internal ROM of the Turbo R is fast enough for this fetch scheme, so the Turbo R contains additional logic to mirror the contents of ROM into RAM in order to make it run faster.