Correct usage of VDP ports following the standard

Page 5/18

By ARTRAG

Enlighted (6933)

02-06-2017, 23:21

I am evaluating possible support for the MA-20 in my next MSX1 games. At boot time I could read the VDP port base at 6h and print a message in case it is not 98h. The message could be something like "This is an MSX1 game, please unplug that MA-20 extension and reboot." ;)

By Grauw

Ascended (10711)

03-06-2017, 00:07

Maybe a nice example of the reverse: the CX5M-II is an MSX1 with the V9938 VDP, and in Synthesix I went out of my way to support its 80-column text mode. Similarly, screen 4 games could support it as well. You do have to bypass the BIOS for this :D.

By DarkSchneider

Paladin (989)

04-06-2017, 10:04

Louthrax wrote:

Official way for me :) (patching the code in RAM with the port values read from the BIOS, for speed's sake)!

Interesting, just like patching the PSG.

hit9918 wrote:

is that topic so unsolved?
one cant line interrupt with the bios... but can do it with port retargeting.

OK then we do magic or voodoo.

hit9918 wrote:

yeah bios you can forget in a game. except in a little game.

Sure, have you tried and confirmed it?

PingPong wrote:

There is also another thing that stops the use of the BIOS for time-critical operations: the fact that the BIOS implements functions that are 'trusted' to do a task (I/O, set graphic mode, set a point, draw a line, set PSG, etc.) does not mean that the BIOS IS THE SAME on every machine; the implementation could vary.
Plus, the interrupt handler of the BIOS is 'sensitive' to the hardware configuration (a floppy drive or other hardware that generates interrupts).
The BIOS is fundamentally not only slow, but also timing-unreliable.

Say we need to write a character on screen at a specific time: the BIOS guarantees that the character will be printed but says nothing about the time it spends to achieve the result. If you talk to the bare metal directly you can control even this aspect (or, more precisely, you have better control over timing).

Again, it is NOT slow. Sometimes I think the BIOS is confused with BASIC. In the end they are small machine-code functions with an extra JP; there is not much difference.
I tested how many function calls could be made in a full interrupt period (a complete screen scan), to get the overhead. The result was that the BIOS achieved 177 and direct code 199. There is little difference, and unless your code does nothing but call it, much less difference.

And about the dependency on hardware configuration, you also have it with direct code. Unless you only target ROMs, you can't override the 38h routine, or the system will crash or freeze. And we can't limit ourselves to only making ROMs for 64KB RAM machines (to override the 38h routine in RAM at page 0).
Think also: if you try to fit the hardware so exactly, what happens if it has a 6MHz CPU, or any other change? Is that predicted?

I repeat, MSX is not a videogame console, a closed system where the manufacturer specifies everything because it is its own system, and can.

So, instead of depending on that exact fit, on computers with variable hardware configurations the flexible way is the one adopted.
Why do you need to write the character at an exact time? Maybe because you are writing on the fly to the SAT in VRAM and expect to be on time? Well, that could be fine on a videogame console, but on a computer a better solution is to work with a double buffer and swap when the characters are ready. There is also a better solution, but I think it is off-topic.

I've noticed that the real speed is in the other code, the logic itself and how you access the machine (the approach, not the instructions themselves), not in saving a few cycles in an operation that you will probably do 20-30 times in a loop at most.

I am starting to think that most of the worries here come from a wrong idea of the BIOS. Moreover, the topic here is adding a RAM read and using the IN/OUT (C) instructions, which is much lighter still. I am surprised that the idea seems to be that if you write a 99h instead it will run at double speed or something.

By Jipe

Paragon (1604)

04-06-2017, 10:24

Quote:

Maybe a nice example of the reverse; the CX5M-II is an MSX1 with the V9938 VDP

The CX5MII is an MSX2 with an MSX1 ROM; it is a CX7 inside.

By Creepy

Champion (335)

04-06-2017, 10:46

And here I am thinking that I read somewhere on this forum that Space Manbow was actually following the MSX standard. Especially with the FRS patch: http://frs.badcoffee.info/patches.html.

By Louthrax

Prophet (2436)

04-06-2017, 12:11

For those interested, here's how I do the "ports patch" for my VDP-related code (of course, that works only for code in RAM). This method is quite handy (you just need to prefix every VDP access with an RVDP or WVDP macro), and allows full-speed VDP access in the standard way without too much hassle.

1. All my VDP accesses are preceded by an RVDP or WVDP macro:

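VDP_doCommand:
	call	VDP_wait

	ld 	a,32		; first command register (R#32)
	di
	WVDP			; mark the port operand below for patching
	out	(1),a
	ld	a,0x91		; 0x80 + 17: write the value above to R#17

	WVDP
	out	(1),a
	ld	hl,VDP_SX	; source data for the command registers

	WVDP			; here the patched byte is C, the low byte of BC
	ld	bc,0x0F03	; B = 15 bytes, C = VDP port 3
	otir
	ei
	ret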

The assembly code here is using "0-based" VDP ports (0, 1, 2, 3 etc...). Note that this way, even the "OTIR" instruction works (VDP port 3 in this case).

2. The WVDP and RVDP macros are defined this way:

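WVDP	MACRO
wvdpa	defl $+1		; address of the operand byte of the next instruction
	rseg wvdp		; switch to the wvdp segment
	defw	wvdpa		; store that address in the patch table
	rseg code		; back to the code segment
	ENDM

RVDP	MACRO
rvdpa	defl $+1		; same, for instructions that read from the VDP
	rseg rvdp
	defw	rvdpa
	rseg code
	ENDM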

This is creating and populating two "segments", rvdp and wvdp, which contain all addresses in RAM that need to be patched.

3. At application start, I just need to call this function to fix the VDP read and write ports:

wlen	equ 	SIZEOF wvdp
rlen	equ 	SIZEOF rvdp

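VDP_init:
	push	ix
	push	de
	push	bc

	ld	hl,7		; BIOS address holding the VDP write port base
	ld	ix,.SFB.(wvdp)	; start of the wvdp segment
	ld	a,wlen/2	; number of 16-bit entries
	or	a
	call	nz,patch	; skip if the segment is empty

	ld	hl,6		; BIOS address holding the VDP read port base
	ld	ix,.SFB.(rvdp)	; start of the rvdp segment
	ld	a,rlen/2
	or	a
	call	nz,patch

	pop	bc
	pop	de
	pop	ix
	ret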

;***************************************************************************************

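patch:
	ld	b,a		; B = number of entries to patch
	ld	a,(EXPTBL)	; slot of the main ROM
	push	bc
	call 	RDSLT		; A = byte at address HL in that slot
	pop	bc
	ld	c,a		; C = base port read from the BIOS

b1:	ld	l,(ix+0)	; HL = next address from the table
	inc	ix
	ld	h,(ix+0)
	inc	ix
	ld	a,(hl)		; 0-based port number in the code
	add	a,c		; add the base port
	ld	(hl),a		; write the real port back
	djnz	b1
	ret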

The code here uses the IAR C assembler, but it also works (with modifications, of course) with the Hitech-C assembler. Your assembler just needs to have a notion of "segments".
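
For readers who want to see the effect of the patch loop without an assembler at hand, here is a minimal sketch in Python that models it on a byte array. It is only a toy model, not Louthrax's code: the `patch_ports` helper, the sample code image and the base port value 0x98 are assumptions chosen for illustration (on a real machine the base comes from BIOS addresses 6 and 7).

```python
# Toy model of the run-time port patch: memory is a bytearray, and the
# "segment" is a list of addresses of port operands assembled as 0-based.

def patch_ports(memory, operand_addresses, base_port):
    """Add base_port to every operand byte listed in operand_addresses."""
    for address in operand_addresses:
        # 8-bit addition, like ADD A,C in the patch loop
        memory[address] = (memory[address] + base_port) & 0xFF


# Hypothetical code image: OUT (1),A ; OUT (1),A ; LD BC,0x0F03 ; OTIR
memory = bytearray([0xD3, 0x01, 0xD3, 0x01, 0x01, 0x03, 0x0F, 0xED, 0xB3])

# What the WVDP macro would collect: the address of the byte after each opcode.
write_operands = [1, 3, 5]

# 0x98 is an assumed base port; the real value is read from the BIOS.
patch_ports(memory, write_operands, base_port=0x98)

print([hex(memory[a]) for a in write_operands])  # ['0x99', '0x99', '0x9b']
```

Note how the third entry patches the low byte of the LD BC operand, which is why OTIR ends up on port 0x9B while the opcodes and the byte count stay untouched.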

By Grauw

Ascended (10711)

04-06-2017, 13:46

DarkSchneider wrote:

I tested how many function calls could be made in a full interrupt period (a complete screen scan), to get the overhead. The result was that the BIOS achieved 177 and direct code 199. There is little difference, and unless your code does nothing but call it, much less difference.

That’s so few! You don’t exactly describe what you’ve tested so I’m assuming you were talking about VDP register writes. But with a screen of 212 lines, you clearly can’t make a screensplit with that. At 60 Hz you’ve got 59659 cycles per frame at your disposal, which means that routine takes 300 cycles per run. That’s quite hefty (for reference, the VDP draws a line in 228 cycles). Setting a VDP register should take around 40-100 cycles, depending on whether you inline or not, use di/ei, update the mirror values and/or pull the I/O address from the BIOS.
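
The arithmetic behind those numbers can be checked with a short Python sketch. It assumes the standard 3579545 Hz MSX clock (which matches the 59659 cycles per frame figure at 60 Hz) and takes the 177 and 199 call counts from the quoted measurement; the function name is made up for the example.

```python
CPU_HZ = 3_579_545        # assumed standard MSX Z80 clock
FRAME_RATE = 60           # Hz, as used in the post
CYCLES_PER_LINE = 228     # CPU cycles per display line, from the post

cycles_per_frame = CPU_HZ // FRAME_RATE


def cycles_per_call(calls_per_frame):
    """Average cycles spent per call if calls_per_frame calls fill one frame."""
    return cycles_per_frame / calls_per_frame


direct = cycles_per_call(199)   # direct code, from the quoted test
bios = cycles_per_call(177)     # BIOS routine, from the quoted test

print(cycles_per_frame, round(direct), round(bios))  # 59659 300 337
print(round(direct / CYCLES_PER_LINE, 2))            # 1.31 display lines per call
```

So even the direct version takes longer than one display line per call, which is why it looks too slow to be a plain register write.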

DarkSchneider wrote:

I am starting to think that most of the worries here come from a wrong idea of the BIOS. Moreover, the topic here is adding a RAM read and using the IN/OUT (C) instructions, which is much lighter still. I am surprised that the idea seems to be that if you write a 99h instead it will run at double speed or something.

It definitely does not make that kind of difference. For VRAM output, it will make almost no difference because most of the time will be spent in the LDIR and it already has the port in register C. For register output it will make a difference of, let’s say, 20%; usually this is not a problem.

However for screen splits it can mean the difference between having an invisible split in the border region vs. one that spills into the visible area, or the split spanning over multiple lines, or certain per-line splits not being possible (e.g. x + y scroll = 3 register updates per line). It can also make the difference between a split working regardless of CPU clock speed, or it relying too much on CPU timing so it becomes visible at different clock speeds.
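
To put a number on the per-line case, here is a rough Python sketch of the budget. It only uses the 228 cycles per line and the 40-100 cycles per register write mentioned earlier in this post, and it ignores everything else a real split handler has to do, so treat it as an optimistic bound. The helper name is hypothetical.

```python
CYCLES_PER_LINE = 228  # CPU cycles per display line, from the post


def fits_in_one_line(register_writes, cycles_per_write):
    """True if the writes alone fit in one display line (no other overhead counted)."""
    return register_writes * cycles_per_write <= CYCLES_PER_LINE


# x + y scroll = 3 register updates per line
results = {cost: fits_in_one_line(3, cost) for cost in (40, 76, 100)}
print(results)  # {40: True, 76: True, 100: False}

max_cost = CYCLES_PER_LINE // 3
print(max_cost)  # 76 cycles per write at most
```

With three updates per line, each write has to stay at or below 76 cycles, so the cheap end of the 40-100 range works and the expensive end does not.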

Louthrax’s example is the best of both worlds. It obviously requires some effort on the programmer’s side, but something like that is how I would do it.

By Grauw

Ascended (10711)

04-06-2017, 13:01

A topic not touched on here: you should also update the VDP register mirror values in RAM, and update the palette mirror in VRAM as well. But ouf, now things are getting quite expensive if we want to do things fully according to the book :).

Another thing: in DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.

By Louthrax

Prophet (2436)

04-06-2017, 13:18

Grauw wrote:

A topic not touched on here: you should also update the VDP register mirror values in RAM, and update the palette mirror in VRAM as well.

I usually don't do that, but I restore the VDP / palette settings when exiting my application (okay, there might be some issues if, for example, a background application wants to display things in the meantime).

Grauw wrote:

Another thing: in DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.

A workaround for DOS applications is to do the BIOS-related things in the interrupt handler (the BIOS is accessible there). That works well for music; you can also poll the keyboard / joystick / mouse values on that occasion and store them somewhere for later use.

By DarkSchneider

Paladin (989)

04-06-2017, 14:09

Grauw wrote:

A topic not touched on here: you should also update the VDP register mirror values in RAM, and update the palette mirror in VRAM as well. But ouf, now things are getting quite expensive if we want to do things fully according to the book :).

Another thing: in DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.

Noted. Good point that I forgot to mention: updating the system variables. I put it (the LD (RGMIRROR),A) between the 2 OUTs so it also serves as a WAIT. I read, I don't remember where, that it is good to give the VDP some time. It is probably not needed on MSX2 or later, but if it must be done properly I see no problem in doing it there.

@Grauw the operation was precisely LDIRVM, comparing the BIOS one with the steps indicated in the V9938 manual (except setting expanded VRAM; I skip that one, assuming the VRAM will never be expansion VRAM, bad for me :D).
For writing the registers using OUT (99h) or OUT (C), I truly think there is no practical difference.

About inter-slot calls, that's true. Here is the way:
X BAD:
loop {
	inter-slot call
}

V FINE:
set BIOS at page 0
loop {
	direct calls
}
set RAM at page 0

Single calls are not a problem; inside loops you must be more careful. I also measured the inter-slot call overhead and it is about double. I mean, you can do half as many calls in the same time as with direct calls for mid-complexity functions (the overhead is in the call itself; the work done inside has no extra cost).
Then, for the PSG there really are not many: only setting the registers of the 3 channels for the sounds currently playing. The VDP does not need that many either, as explained before. And remember that during the ISR we already have the BIOS at page 0, so there is no need to set it.
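
The BAD and FINE variants can be compared with a small cost model in Python. All the numbers are placeholders: the 300 cycles of inter-slot overhead is the rough estimate quoted above, while the work per call, the slot switch cost and the loop count are hypothetical values picked only to show the shape of the trade-off.

```python
def loop_cost(calls, work, call_overhead, switch_cost=0):
    """Total cycles for a loop of BIOS calls.

    switch_cost is paid twice: once to map the BIOS in, once to map RAM back.
    """
    return 2 * switch_cost + calls * (work + call_overhead)


CALLS = 30        # hypothetical number of calls in the loop
WORK = 300        # hypothetical cycles of useful work per call
INTERSLOT = 300   # per-call inter-slot overhead, the rough estimate quoted above
SWITCH = 300      # hypothetical cost of one page 0 switch

bad = loop_cost(CALLS, WORK, INTERSLOT)              # inter-slot call on every iteration
fine = loop_cost(CALLS, WORK, 0, switch_cost=SWITCH)  # switch once, then direct calls

print(bad, fine)  # 18000 9600
```

With these placeholder values the inter-slot loop costs close to twice as much, in line with the "about double" measurement, and the fixed cost of switching page 0 is paid back after a couple of iterations.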

@Louthrax I read the input from RAM directly. The values are in the system area and the BIOS updates them, so there is no need to call the input functions again except the joystick one (STICK; the TRIG values are also in RAM), which only needs to be called once per loop iteration.
