First a little warning: This is pretty twisted coding stuff again.
It started when I was thinking about GPUs. Modern GPUs are real powerhouses that can offload a lot of calculation work from the CPU. I was thinking that it would be really nice to have this kind of GPU on MSX to do the hard work, but unfortunately on MSX2 we have only this super simple blitter that can't even calculate 1+1... But then I stopped for a moment to think about it a bit harder and figured out that I was most likely wrong...
From this thought I ended up writing this stupid little BASIC program (click to view & run). It is horribly stupid & slow, but it definitely proved that my initial thought was wrong... The VDP can actually calculate 1+1 and even a lot more! You just need to change the angle from which you look at the VDP as a programmer.
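For readers who want to see the idea without an MSX at hand, here is a rough host-side sketch in Python. It is not the original BASIC program (that listing is not reproduced here). It assumes the trick relies on the logical operations (AND, OR, XOR) that the VDP copy commands can apply between source and destination pixels. The helper names `vdp_and`, `vdp_xor`, `vdp_or`, `add_bits`, `to_bits` and `from_bits` are made up for this illustration.

```python
# Host-side simulation only: this is not MSX code and not the original program.
# Each vdp_* helper stands in for one blitter copy using a logical operation.

def vdp_and(src, dst):
    """Result of copying src onto dst with the AND operation."""
    return src & dst

def vdp_xor(src, dst):
    """Result of copying src onto dst with the XOR operation."""
    return src ^ dst

def vdp_or(src, dst):
    """Result of copying src onto dst with the OR operation."""
    return src | dst

def add_bits(a_bits, b_bits):
    """Ripple-carry addition of two little-endian bit lists (equal length)."""
    result = []
    carry = 0
    for a, b in zip(a_bits, b_bits):
        partial = vdp_xor(a, b)                  # sum without incoming carry
        result.append(vdp_xor(partial, carry))   # sum bit for this position
        carry = vdp_or(vdp_and(a, b), vdp_and(partial, carry))
    result.append(carry)                         # final carry is the top bit
    return result

def to_bits(value, width):
    """Split an integer into `width` bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Rebuild an integer from a little-endian bit list."""
    return sum(bit << i for i, bit in enumerate(bits))

print(from_bits(add_bits(to_bits(1, 1), to_bits(1, 1))))  # 1 + 1 -> 2
```

The point of the sketch is only that addition falls out of AND, XOR and OR applied bit by bit, so a chip that can do nothing but logical copies can still add.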
Act 2:
The next logical thought was: OK... Multiplying is slow even on the Z80, so if the VDP is able to calculate 1+1, it should in theory be able to multiply as well, right?
Next I wrote another test program... This very simple BASIC program multiplies a 2-bit value by another 2-bit value and displays the 4-bit result. It is not impressive in any way, but the point is that it is the VDP that does the actual multiplication. You can control the individual bits with keys 0-3 and easily see how it is done. Rather than being impressive, it was more a proof of concept and a thought experiment: if you can multiply 2-bit values, you can also multiply 4-bit, 8-bit or 16-bit values if you like...
You are probably now thinking something like "Congratulations... You managed to find the most ridiculous, slow and user-unfriendly way to multiply a few bits together"... That is not a bad assessment, but it is still not 100% of the story... Although visually it looks like I multiply just a few bits, technically I'm actually multiplying 16x16 pictures! As the example is written for SCREEN 8 with 8 bits/pixel, this program is actually doing 2048 multiplications (16 x 16 pixels x 8 bits) that could all have individual parameters... OK, OK, the bits are still stored in a funny, 90-degree rotated order in slow VRAM. I can't really imagine a real-world situation where this could be useful, but I found it really fascinating, so...
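The "rotated" bit order is easier to see in a small simulation. The Python sketch below is a host-side illustration, not the original BASIC program. It assumes each 16x16 picture acts as one bit plane with 2048 independent bit positions, and that the 2-bit multiply is built from plane-wide AND and XOR steps. The names `pack_planes`, `unpack_planes` and `multiply_2bit` are invented for this example.

```python
import random

# Host-side simulation only: one Python integer stands in for one 16x16
# SCREEN 8 picture, i.e. 16 * 16 pixels * 8 bits = 2048 independent bit lanes.
LANES = 16 * 16 * 8

def pack_planes(values, width):
    """Rotate a list of small integers into `width` bit planes.

    Plane k holds bit k of every value, one lane per value.
    """
    planes = [0] * width
    for lane, value in enumerate(values):
        for bit in range(width):
            if (value >> bit) & 1:
                planes[bit] |= 1 << lane
    return planes

def unpack_planes(planes, lanes):
    """Rotate bit planes back into ordinary integers, one per lane."""
    return [
        sum(((plane >> lane) & 1) << bit for bit, plane in enumerate(planes))
        for lane in range(lanes)
    ]

def multiply_2bit(a_planes, b_planes):
    """Multiply 2-bit by 2-bit in every lane at once, using only AND and XOR."""
    a0, a1 = a_planes
    b0, b1 = b_planes
    t1 = a1 & b0          # partial product with weight 2
    t2 = a0 & b1          # partial product with weight 2
    t3 = a1 & b1          # partial product with weight 4
    carry = t1 & t2       # carry out of the weight-2 column
    return [a0 & b0, t1 ^ t2, t3 ^ carry, t3 & carry]

rng = random.Random(0)
a_values = [rng.randrange(4) for _ in range(LANES)]
b_values = [rng.randrange(4) for _ in range(LANES)]

product_planes = multiply_2bit(pack_planes(a_values, 2), pack_planes(b_values, 2))
products = unpack_planes(product_planes, LANES)
print(products[:8])
```

Every `&` or `^` in `multiply_2bit` works on all 2048 lanes in one go, which is what a single logical copy of a whole picture would do. The packing and unpacking steps are the expensive, awkward part, which matches the complaint about results being in a funny order.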
Act 3:
I threw together this really messed up, overcomplicated BASIC program to do 8-bit * 8-bit calculations... As the CPU is anyway just waiting for the VDP most of the time, I did not put any effort into optimizing the BASIC part of this test code...
At first the test results were pretty promising... For measurements I took a set of 6400 integer multiplications as my test set. Poor MSX-BASIC took about 22.5 seconds to crawl through the task, while the other MSX-BASIC program, which uses the VDP to calculate them all in parallel, took only 6.5 seconds...
I knew I was fooling myself, though, so I made the program X-BASIC compatible to get more meaningful results and yeah... After that the VDP version took only 3 seconds, but the CPU version of the test was boosted to 2.5 seconds.
So, yes... No surprises here... It is not quite as fast as using the CPU, and the results are hard to fetch and use. In theory this can almost double the MSX2 number crunching speed, as the CPU and the VDP can work independently at the same time, but even my twisted mind can't imagine a real-world application where this approach could be used for anything even semi-useful... I kind of knew the end result already at the start, but it was still interesting to try, and I wanted to share these results with you for comments.
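As a quick sanity check of the "almost double" estimate, the timings above can be plugged into a few lines of Python. This is back-of-the-envelope arithmetic, not a measurement: it assumes the CPU and the VDP overlap perfectly and ignores the cost of setting up the VDP work and fetching the results. The variable names are just for this illustration.

```python
# Timings quoted above for the same set of 6400 multiplications.
MULTIPLICATIONS = 6400
cpu_seconds = 2.5   # X-BASIC, CPU only
vdp_seconds = 3.0   # X-BASIC driving the VDP

cpu_rate = MULTIPLICATIONS / cpu_seconds   # multiplications per second
vdp_rate = MULTIPLICATIONS / vdp_seconds

# Idealized case: both chips crunch numbers at the same time, no contention.
combined_speedup = (cpu_rate + vdp_rate) / cpu_rate

# The interpreted MSX-BASIC comparison: 22.5 s on the CPU vs 6.5 s with the VDP.
basic_speedup = 22.5 / 6.5

print(round(combined_speedup, 2))  # 1.83
print(round(basic_speedup, 2))     # 3.46
```

So under these optimistic assumptions the combined throughput is about 1.83 times the CPU alone, which fits "almost double".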