When you say “the same VRAM area”, what do you mean exactly? The exact VRAM address, a range, or perhaps the same RAM row within the memory chips?
As for me, I don't really understand these results.
When you say “the same VRAM area”, what do you mean exactly? The exact VRAM address, a range, or perhaps the same RAM row within the memory chips?
VRAM direct access tests are performed between addresses 02A8h and 03A7h (256 bytes).
Just before executing a test iteration, we run one of these commands:
- LMMV(sx=0, sy=0, nx=256, ny=256, col=0, op=OR),
- LMMV(sx=0, sy=256, nx=256, ny=256, col=0, op=OR),
- No VDP command.
In the 1st case, we observe write failures at certain access speeds (on real machines and even on openMSX, which seems to emulate this behavior).
In the other 2 cases, there are absolutely no write failures observed.
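To make "the same VRAM area" concrete, here is a rough sketch of which byte addresses each LMMV rectangle covers. It assumes a SCREEN 5 layout (128 bytes per line, 2 pixels per byte), which the posts above do not state; `lmmv_byte_range` and `overlaps` are hypothetical helper names, not part of VATT or MSXgl.

```python
# Assumption: SCREEN 5 layout (4 bits per pixel, 256 pixels wide).
BYTES_PER_LINE = 128
PIXELS_PER_BYTE = 2

def lmmv_byte_range(sx, sy, nx, ny):
    """First and last VRAM byte touched by a full-width rectangle."""
    first = sy * BYTES_PER_LINE + sx // PIXELS_PER_BYTE
    last = (sy + ny - 1) * BYTES_PER_LINE + (sx + nx - 1) // PIXELS_PER_BYTE
    return first, last

def overlaps(a, b):
    """True when two inclusive address ranges share at least one byte."""
    return a[0] <= b[1] and b[0] <= a[1]

TEST_RANGE = (0x02A8, 0x03A7)                 # direct-access test area
case1 = lmmv_byte_range(0, 0, 256, 256)       # 1st command
case2 = lmmv_byte_range(0, 256, 256, 256)     # 2nd command

print(TEST_RANGE[1] - TEST_RANGE[0] + 1)      # 256 bytes
print([hex(a) for a in case1], overlaps(case1, TEST_RANGE))
print([hex(a) for a in case2], overlaps(case2, TEST_RANGE))
```

Under that assumption, only the first command works on the bytes that the CPU is writing, which matches the case where failures were observed.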
We'd need to do more tests to understand exactly what's going on, but that's not a subject I want to delve into. The only information I'm interested in, and the reason I created VATT, is the real minimum valid access times. The only recommendation I gave in the MSXgl doc is to avoid direct access to a VRAM area on which a VDP command is currently executing.
As for me, I don't really understand these results.
What don't you understand?
The tool simply lets you write to VRAM at different speeds and in different contexts, and check whether write failures have occurred.
The VATT test protocol is explained here: https://aoineko.org/msxgl/index.php?title=VRAM_access_timing...
In the 1st case, we observe write failures at certain access speeds (on real machines and even on openMSX, which seems to emulate this behavior).
openMSX does not emulate any "write failures". Instead, what (I assume) is happening is the following.
From a high level:
- The VDP command does an "OR 0", which should have no effect (it doesn't alter the value in VRAM).
- The CPU writes bytes in the same VRAM region. These writes do change the value in VRAM.
- It appears as if some CPU writes were not executed.
In more detail:
- The VDP command reads a value from VRAM, then it needs a bit of time to calculate the result (masking the correct nibble and calculating the result of "OR 0"). It also needs to wait for an "access-slot" to write this result to VRAM.
- (Once in a while) it can happen that the CPU writes to the exact same VRAM address that was just read by the VDP command. This write DOES succeed without failure.
- The VDP command engine then writes the result of the "OR 0" operation, again to the same VRAM address. This restores the original value, so it only appears as if the CPU write failed.
(I already explained this scenario before in this thread on 21/04).
To verify this hypothesis you can replace "OR 0" with e.g. "OR 1" (and also make sure the CPU writes change something more than only bit 0). (I predict that) you'll see that some bytes are overwritten by the CPU, some are overwritten by the VDP command, but there should be none that haven't changed at all (so none where the write failed).
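The scenario and the proposed "OR 1" check can be sketched as a toy simulation. This is only a model under loud assumptions: it works on whole bytes (the real command works on pixels), collisions are forced on every 4th address instead of happening "once in a while", and `simulate` and `unchanged` are made-up names for illustration.

```python
def simulate(or_value, cpu_value, original=0x00, size=256, collide_every=4):
    """Return the final VRAM contents after the command and the CPU both ran."""
    vram = [original] * size
    for addr in range(size):
        latched = vram[addr]              # VDP command reads the byte
        collides = addr % collide_every == 0
        if collides:
            vram[addr] = cpu_value        # CPU write lands between read and write-back
        vram[addr] = latched | or_value   # VDP command writes its result back
        if not collides:
            vram[addr] = cpu_value        # CPU write arrives after the command moved on
    return vram

def unchanged(vram, original=0x00):
    """Count bytes that still hold the original value."""
    return sum(1 for v in vram if v == original)

or0 = simulate(or_value=0x00, cpu_value=0xAA)
or1 = simulate(or_value=0x01, cpu_value=0xAA)
print(unchanged(or0))   # 64: these look like "failed" CPU writes
print(unchanged(or1))   # 0: every byte was changed by the CPU or by the command
```

With "OR 0" the overwritten bytes are indistinguishable from writes that never happened; with "OR 1" each byte ends up as either the CPU value or the command result, so nothing stays untouched.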
Thank you for the detailed explanation.
It makes sense now.
And this is indeed expected behavior if the VRAM can be modified between the VDP command's read and its write.
So it is a matter of a delayed overwrite of the value rather than of timing?
In more detail:
- The VDP command reads a value from VRAM, then it needs a bit of time to calculate the result (masking the correct nibble and calculating the result of "OR 0"). It also needs to wait for an "access-slot" to write this result to VRAM.
- (Once in a while) it can happen that the CPU writes to the exact same VRAM address that was just read by the VDP command. This write DOES succeed without failure.
- The VDP command engine then writes the result of the "OR 0" operation, again to the same VRAM address. This restores the original value, so it only appears as if the CPU write failed.
(I already explained this scenario before in this thread on 21/04).
I made the same guess in a previous post too.
What don't you understand?
I don't understand the results because, for example, on TMS9918 in SCREEN 2, I have to put two NOPs between two accesses to the VRAM to avoid problems. On V9938/58, you're saying it's going to take even longer, when in fact the opposite is true: the two NOPs aren't necessary.
This is not related to the VATT findings, but...
OUT (n),A ; 12 ts
NOP ; 5 ts
NOP ; 5 ts
OUT (n),A ; 12 ts (22 ts interval)
... should not be 100% safe on an MSX1 VDP (e.g. TMS9918), since the well-known limit for SCREEN 2 with the screen enabled is 29 ts.
On V9938/58, limit is 15 ts so your code is safe.
When the screen is disabled, the MSX1 limit falls to 12 ts (or less), but on MSX2 the limit stays at 15 ts (this was unknown/undocumented as far as I know).
This has no effect in your test case.
But if you used a 12 ts VRAM access with the screen disabled, I/O could unexpectedly fail on MSX2.
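As a quick way to check a sequence against these limits, here is a small sketch that sums T-states from one VRAM access to the next. The OUT (n),A and NOP counts and the 29 ts / 15 ts limits come from the posts above; the OUTI, OTIR and JR NZ counts are my assumption (standard Z80 timings plus one wait state per M1 cycle on MSX), and `interval` / `is_safe` are hypothetical helper names.

```python
# T-states per instruction on MSX (includes the M1 wait state).
T_STATES = {
    "OUT (n),A": 12,
    "NOP": 5,
    "OUTI": 18,            # assumption: 16 + 2 M1 waits
    "OTIR": 23,            # assumption: 21 + 2 M1 waits, per repeated iteration
    "JR NZ (taken)": 13,   # assumption: 12 + 1 M1 wait
}

# Minimum interval between two VRAM accesses, as quoted in the thread.
LIMITS = {
    "TMS9918 SCREEN 2, display on": 29,
    "V9938/58": 15,
}

def interval(sequence):
    """T-states from one VRAM access up to (not including) the next one."""
    return sum(T_STATES[name] for name in sequence)

def is_safe(sequence, limit):
    """True when the sequence leaves at least `limit` T-states between accesses."""
    return interval(sequence) >= limit

two_nops = ["OUT (n),A", "NOP", "NOP"]
for name, limit in LIMITS.items():
    print(name, interval(two_nops), is_safe(two_nops, limit))
print("OTIR", interval(["OTIR"]))
print("OUTI + JR NZ", interval(["OUTI", "JR NZ (taken)"]))
```

By this count, OUT (n),A followed by two NOPs gives a 22 ts interval: fine against the 15 ts limit, but below the 29 ts limit. Under the assumed timings, an OUTI & JR NZ loop comes out at 31 ts while OTIR stays at 23 ts.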
Here are several games that I fixed by adding 2 NOPs between two OUTI/OUT (or by using OUTI & JR NZ instead of OTIR).
Same for all the ColecoVision games I converted, if I remember well.
All work fine on my two Japanese MSX1s (TMS9918). Without the NOPs, they only worked on MSXs with a V9938/58.
Edit: Maybe it's 3 NOPs between two OUT (#98),A and two for the others. I have a doubt now.