I am considering possible MA-20 support in my next MSX1 games. At boot time I could read the VDP port number stored at BIOS address 6h and print a message if it is not the standard 98h. The message could be something like "This is an MSX1 game, please unplug that MA-20 extension and reboot."
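A minimal sketch of that boot-time check, assuming the BIOS is visible in page 0 (as it is when a ROM starts) and that the screen is still in a text mode so CHPUT can print. As far as I know, address 0006h holds the base port for VDP reads, which is 98h on a standard machine. The label names are mine, and the string syntax may need adapting to your assembler.

```asm
VDPREAD equ     0x0006          ; BIOS: base port used for VDP reads
CHPUT   equ     0x00A2          ; BIOS: print the character in A
STDPORT equ     0x98            ; assumed value when the VDP sits at the usual ports

checkVdpPort:
        ld      a,(VDPREAD)     ; read the port number, not the port itself
        cp      STDPORT
        ret     z               ; standard ports, carry on with the game
        ld      hl,msgUnplug
printLoop:
        ld      a,(hl)
        or      a               ; 0 terminates the string
        jr      z,stop
        call    CHPUT
        inc     hl
        jr      printLoop
stop:
        jr      stop            ; wait here until the user reboots

msgUnplug:
        defb    "This is an MSX1 game, please unplug that MA-20 extension and reboot.",0
```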
Maybe a nice example of the reverse: the CX5M-II is an MSX1 with the V9938 VDP, and in Synthesix I went out of my way to support its 80-column text mode. Similarly, e.g. screen 4 games could do so as well. You do have to bypass the BIOS for this.
Official way for me (patching the code in RAM with the port values read from the BIOS, for speed's sake)!
Interesting, just like patching the PSG.
Is that topic so unsolved?
One can't do line interrupts with the BIOS... but you can with port retargeting.
OK then we do magic or voodoo.
Yeah, the BIOS you can forget about in a game, except in a little game.
Sure, you tried and confirmed?
There is also another thing that stops the use of the BIOS for time-critical operations: the fact that the BIOS implements functions that are 'trusted' to do a task (I/O, set graphic mode, set a point, draw a line, set PSG, etc.) does not mean that the BIOS IS THE SAME on every machine; the implementation can vary.
Plus, the interrupt handler of the BIOS is 'sensitive' to the hardware configuration (a floppy drive, or hardware that generates interrupts).
The BIOS is fundamentally not only slow, but also timing-unreliable.
Say we need to write a character on screen at a specific time: the BIOS guarantees that the character will be printed but says nothing about the time it spends to achieve the result. If you talk to the bare metal directly you can control this aspect too (or, more precisely, you have better control over timing).
Again, it is NOT slow. Sometimes I think the BIOS is confused with BASIC. In the end they are small machine-code functions with an extra JP; there is not much difference.
I tested how many function calls could be made in a full interrupt period (a complete screen scan), to get the overhead. The result was that the BIOS achieved 177 and direct code 199. There is little difference, and unless your code does nothing but call it, even less.
And as for the dependency on hardware configuration, you have that with direct code too. Unless you only target ROMs, you can't override the 38h routine, or the system will crash or freeze. And we can't limit ourselves to only making ROMs for 64KB RAM machines (to override 38h in RAM at page 0).
Think also: if you try to fit the hardware so exactly, what if it has a 6MHz CPU, or any other change? Is that foreseen?
I repeat, MSX is not a videogame console, a closed system where the manufacturer specifies everything because it is its own system, and can.
So, instead of depending on that exact fit, on computers with variable hardware configurations the flexible way is the one adopted.
Why do you need to write the character at an exact time? Maybe because you are writing on the fly to the VRAM SAT and expect to be on time? Well, that could be fine on a videogame console, but on a computer a better solution is to work with a double buffer and swap when the characters are ready. There is also a better solution, but I think it is off-topic.
I've noticed that the real speed is in the other code, the logic itself and how you access the machine (in terms of approach, not the microcode itself), not in gaining a few cycles in an operation that you will probably do 20-30 times in a loop at most.
I am starting to think that most of the worries about it here come from a wrong idea of the BIOS. Moreover, the topic here is adding a RAM read and using the IN/OUT (C) operation, which is much lighter still. I am surprised by what seems to be the idea that if you write to 99h directly it will run at double speed or something.
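To make the "read plus OUT (C)" variant concrete, here is a rough sketch of a register write that takes the port from the BIOS instead of hard-coding 99h. It assumes the BIOS is visible in page 0 (otherwise the value at 0007h would have to be cached in RAM at startup), and the routine name and register convention are made up for illustration.

```asm
VDPWRITE equ    0x0007          ; BIOS: base port used for VDP writes

; In:  A = value, E = VDP register number
; Out: A destroyed, BC preserved
VDP_setReg:
        push    bc
        ld      b,a             ; keep the value
        ld      a,(VDPWRITE)    ; base write port (98h on a standard machine)
        inc     a               ; the control port is base + 1
        ld      c,a
        ld      a,e
        or      0x80            ; bit 7 set = register write
        di
        out     (c),b           ; first the data byte
        out     (c),a           ; then the register number
        ei
        pop     bc
        ret
```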
Maybe a nice example of the reverse; the CX5M-II is an MSX1 with the V9938 VDP
The CX5MII is an MSX2 with an MSX1 ROM; it is a CX7 inside.
And here I am thinking that I read somewhere on this forum that Space Manbow was actually following the MSX standard. Especially with the FRS patch: http://frs.badcoffee.info/patches.html.
For those interested, here's how I do the "ports patch" for my VDP-related code (of course, that works only for code in RAM). This method is quite handy (you just need to prefix all VDP access with a RVDP or WVDP macro), and allows full-speed VDP access in the standard way with not too much hassle.
1. All my VDP accesses are preceded by a RVDP or WVDP macro:
VDP_doCommand:
        call    VDP_wait
        ld      a,32
        di
        WVDP
        out     (1),a
        ld      a,0x91
        WVDP
        out     (1),a
        ld      hl,VDP_SX
        WVDP
        ld      bc,0x0F03
        otir
        ei
        ret
The assembly code here is using "0-based" VDP ports (0, 1, 2, 3 etc...). Note that this way, even the "OTIR" instruction works (VDP port 3 in this case).
2. The WVDP and RVDP macros are defined this way:
WVDP    MACRO
wvdpa   defl    $+1
        rseg    wvdp
        defw    wvdpa
        rseg    code
        ENDM

RVDP    MACRO
rvdpa   defl    $+1
        rseg    rvdp
        defw    rvdpa
        rseg    code
        ENDM
This is creating and populating two "segments", rvdp and wvdp, which contain all addresses in RAM that need to be patched.
3. At application start, I just need to call this function to fix the VDP read and write ports:
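wlen    equ     SIZEOF wvdp
rlen    equ     SIZEOF rvdp

VDP_init:
        push    ix
        push    de
        push    bc
        ld      hl,7            ; BIOS address 0007h: VDP write base port
        ld      ix,.SFB.(wvdp)  ; start of the wvdp segment
        ld      a,wlen/2        ; number of 16-bit entries
        or      a
        call    nz,patch
        ld      hl,6            ; BIOS address 0006h: VDP read base port
        ld      ix,.SFB.(rvdp)  ; start of the rvdp segment
        ld      a,rlen/2
        or      a
        call    nz,patch
        pop     bc
        pop     de
        pop     ix
        ret
;***************************************************************************************
patch:
        ld      b,a             ; B = entry count
        ld      a,(EXPTBL)      ; slot of the main BIOS ROM
        push    bc
        call    RDSLT           ; A = byte at (HL) in that slot
        pop     bc
        ld      c,a             ; C = base port
b1:     ld      l,(ix+0)        ; HL = next address to patch
        inc     ix
        ld      h,(ix+0)
        inc     ix
        ld      a,(hl)          ; 0-based port in the code
        add     a,c             ; add the base port
        ld      (hl),a
        djnz    b1
        ret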
The code here is using IAR C assembler, but it also works (with modification of course) on the Hitech-C assembler. Your assembler just needs to have a "segments" notion.
I tested how many function calls could be made in a full interrupt period (a complete screen scan), to get the overhead. The result was that the BIOS achieved 177 and direct code 199. There is little difference, and unless your code does nothing but call it, even less.
That’s so few! You don’t exactly describe what you’ve tested so I’m assuming you were talking about VDP register writes. But with a screen of 212 lines, you clearly can’t make a screensplit with that. At 60 Hz you’ve got 59659 cycles per frame at your disposal, which means that routine takes 300 cycles per run; that’s quite hefty (for reference, the VDP draws a line in 228 cycles). Setting a VDP register should take around 40-100 cycles, depending on whether you inline or not, di/ei, update the mirror values and/or pull the I/O address from BIOS.
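For reference, this is roughly where the lower bound of about 40 cycles comes from. It is a sketch of a fully inlined write of register 17 with a hard-coded port 99h; the cycle counts in the comments are my own tally and assume the usual one extra wait state per M1 cycle on MSX.

```asm
; Inline write of value 32 to VDP register 17, port hard-coded
        ld      a,32            ;  8 cycles (7 + 1 wait)
        out     (0x99),a        ; 12 cycles (11 + 1 wait)
        ld      a,0x80+17       ;  8 cycles
        out     (0x99),a        ; 12 cycles
                                ; total: 40 cycles
; Wrapping the two OUTs in DI / EI adds 5 + 5 cycles, and turning the
; sequence into a subroutine adds a CALL and a RET on top of that.
```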
I am starting to think that most of the worries about it here come from a wrong idea of the BIOS. Moreover, the topic here is adding a RAM read and using the IN/OUT (C) operation, which is much lighter still. I am surprised by what seems to be the idea that if you write to 99h directly it will run at double speed or something.
Definitely it does not make that kind of a difference. For VRAM output, it will make almost 0 difference because the most time will be spent in the LDIR and it already puts the register in C. For register output it will make a difference of let’s say 20%, usually this is not a problem.
However for screen splits it can mean the difference between having an invisible split in the border region vs. one that spills into the visible area, or the split spanning over multiple lines, or certain per-line splits not being possible (e.g. x + y scroll = 3 register updates per line). It can also make the difference between a split working regardless of CPU clock speed, or it relying too much on CPU timing so it becomes visible at different clock speeds.
Louthrax’s example is the best of both worlds, however it obviously requires some effort on the programmer’s side, but something like that is how I would do it.
A topic not touched on here, but you should also update VDP register mirror values in RAM, and update the palette mirror in VRAM as well. But ouf, now things are getting quite expensive if we want to do things fully according to the book.
Another thing, consider DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.
A topic not touched on here, but you should also update VDP register mirror values in RAM, and update the palette mirror in VRAM as well.
So I'm usually not doing that, but I restore the VDP / palette settings when exiting my application (okay, there might be some issues if for example a background application wants to display things meanwhile).
Another thing, consider DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.
A workaround for DOS applications is to do the BIOS related things in the interrupt handler (the BIOS is accessible there). That works well for music, you can also poll the keyboard / joystick / mouse values at that occasion and store them somewhere for later use.
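A rough sketch of that workaround, hooking H.TIMI so the joystick is polled through a direct BIOS call once per frame. It assumes the handler, the saved hook and the variable all live at 0x4000 or above, because page 0 holds the BIOS while the handler runs; the label names are mine and this is untested.

```asm
H_TIMI  equ     0xFD9F          ; VBLANK hook, 5 bytes
GTSTCK  equ     0x00D5          ; BIOS: A = stick number, returns direction in A

hookInstall:
        di
        ld      hl,H_TIMI
        ld      de,oldHook
        ld      bc,5
        ldir                    ; save the previous hook so it can be chained
        ld      a,0xC3          ; JP opcode
        ld      (H_TIMI),a
        ld      hl,onVblank
        ld      (H_TIMI+1),hl
        ei
        ret

onVblank:
        push    af              ; the BIOS still uses A after the hook returns
        xor     a               ; stick 0 = cursor keys
        call    GTSTCK          ; direct call, no inter-slot overhead here
        ld      (stickDir),a    ; stored for the main loop to read later
        pop     af
        jp      oldHook         ; chain to whatever was hooked before

stickDir:
        defs    1
oldHook:
        defs    5
```

The main loop then only reads stickDir from RAM. Before exiting to DOS, the five saved bytes should be copied back to H_TIMI.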
A topic not touched on here, but you should also update VDP register mirror values in RAM, and update the palette mirror in VRAM as well. But ouf, now things are getting quite expensive if we want to do things fully according to the book.
Another thing, consider DOS applications, accessing the BIOS must be done using a very expensive interslot call (iirc, 300+ cycles overhead?). For things which would access the BIOS often, like VDP or PSG, it seems better to skip that or your graphics and music routines will take away a significant portion of your frame time.
Noted. Good point that I forgot to mention: updating the system variables. I put it (the LD (RGMIRROR),A) between the 2 OUTs so it also serves as a wait. I read, I don't remember where, that it is good to give the VDP some time. Probably not needed on MSX2 or later, but if it must be done properly I see no problem doing it there.
@Graw the operation was precisely LDIRVM, comparing the BIOS one against the steps indicated in the V9938 manual (except the expanded-VRAM setting, which I skip, assuming the VRAM will always be non-expansion VRAM; bad of me).
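Something like this, as a sketch for register 1. RG1SAV at F3E0h is the system mirror of that register as far as I remember, and the port is hard-coded to 99h only to keep the example short; it could equally be patched or read from the BIOS.

```asm
RG1SAV  equ     0xF3E0          ; system variable mirroring VDP register 1
VDPCTRL equ     0x99            ; assumption: standard control port

; In: A = new value for VDP register 1
VDP_setReg1:
        di
        out     (VDPCTRL),a     ; data byte
        ld      (RG1SAV),a      ; update the mirror, doubles as a short delay
        ld      a,0x80+1        ; register 1, bit 7 set = register write
        out     (VDPCTRL),a
        ei
        ret
```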
For writing the registers using OUT(99h) or OUT(C) I truly think there is no practical difference.
About inter-slot, that's true, here is the way:
X BAD:
loop {
inter-slot call
}
V FINE:
set bios at page 0
loop {
direct calls
}
set RAM at page 0
Single calls are not a problem; inside loops you must be more careful. I also measured the inter-slot call overhead and it is about double. I mean, you can make half as many calls in the same time as with direct calls for mid-complexity functions (this is the call itself; the work done inside has no extra cost).
Then, for the PSG there are really not many calls: only setting the registers of 3 channels for the currently playing sounds. The VDP is not that much either, as explained before. And remember that during the ISR we already have the BIOS at page 0, so there is no need to set it.
@Louthrax I read the input from RAM directly; the values are in the system area and the BIOS updates them, so there is no need to call the input functions again except the joystick one (STICK; the TRIG values are also in RAM), which only needs to be called once per loop iteration.
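For example, the triggers can be tested without any BIOS call. This sketch assumes TRGFLG at F3E8h with bit 0 for the space bar and bit 4 for trigger A of joystick 1, both 0 while pressed, which is how I remember the layout; the routine name is made up.

```asm
TRGFLG  equ     0xF3E8          ; trigger states, refreshed by the BIOS interrupt

; Out: NZ if SPACE or joystick 1 trigger A is pressed, Z otherwise
firePressed:
        ld      a,(TRGFLG)
        cpl                     ; invert so that 1 = pressed
        and     0x11            ; keep bit 0 (SPACE) and bit 4 (joy 1 trigger A)
        ret
```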