Do you optimize for size, or for speed?
Or is that an option you can choose?
The optimizer currently only prints "size" in the output, but internally it could check for both (the internal data structures contain both size and timing information, although only the size info is currently being used).
All the optimizations that are currently in are those that improve both size and speed. But what I would like to implement is that via input flags you can tell the optimizer to activate optimizations that favor size over speed, or vice versa.
I would go for (what I think is) an easier idea: allow a parameter to choose which patterns.txt files to apply. You can have embedded (in the classpath) default.txt (the current pbo-patterns.txt), size.txt and speed.txt. The option "-use default,size" will apply default.txt+size.txt, the option "-use speed" will apply speed.txt only, and the option "-use default,speed,./my-secr3t-optimizations.txt" will apply default.txt+speed.txt+a user-provided pbo-patterns file.
I think it is easy to implement (once you can parse one file, you can parse any of them), it is extensible, it makes it easy to test additional pbo-patterns.txt files, and it can even be used as a prioritization system (deciding the order in which the patterns will be applied if they are found in more than one file).
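To make the idea concrete, here is a minimal sketch of how the value of the proposed "-use" option could be turned into an ordered list of pattern sources. It is only an illustration in Python, not the optimizer's actual code: the function name `resolve_pattern_sources` and the `EMBEDDED` set are hypothetical, and the rule "a name that is not an embedded set is treated as a file path" is an assumption.

```python
# Hypothetical names of the pattern sets bundled with the tool.
EMBEDDED = {"default", "size", "speed"}


def resolve_pattern_sources(use_arg):
    """Turn the value of the proposed "-use" option into an ordered list.

    Each entry is a (kind, name) pair: ("embedded", "size.txt") for a
    bundled pattern set, or ("file", path) for a user-provided file.
    The order of the list is the order in which patterns would be tried.
    """
    sources = []
    for item in use_arg.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas such as "default,,size"
        if item in EMBEDDED:
            source = ("embedded", item + ".txt")
        else:
            source = ("file", item)  # assumption: anything else is a path
        if source not in sources:  # ignore repeated entries
            sources.append(source)
    return sources


print(resolve_pattern_sources("default,speed,./my-secr3t-optimizations.txt"))
```

Because the result is an ordered list, the same structure also covers the prioritization idea: a pattern found in an earlier file is simply tried first.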
But, again, writing an idea in a forum is quick; doing the actual code is... "less quick" hahaha
Yeah, that actually sounds like a better option! Should be quite trivial to add an input flag to specify the optimizer files. Hopefully I can have some updates tomorrow or so with some of these functionalities in.
All the optimizations that are currently in are those that improve both size and speed. But what I would like to implement is that via input flags you can tell the optimizer to activate optimizations that favor size over speed, or vice versa.
Via "pragma" comments? That would be nice, so that the programmer can choose the "time-critical" and "size-critical" sections of the code. Or even what sections to not optimize at all.
In the VDP test, for example, I have some routines dedicated exclusively to introduce a delay of a known number of CPU cycles. It would be disastrous if any of these was optimized (unless there are timing-preserving optimizations, but that sounds like an atypical use case).
(apologies in advance for the long message)
Alright, I finally got the parser to recognize basic Glass and asMSX syntax, so, I can test the optimizer in some of your projects! No great optimizations yet, as the set of patterns is limited, but at least I can finally test it in projects other than my own (which helped me fix lots of bugs haha)
Here are some examples!
When running it in theNestruo's World Rally (with this commandline):
java -jar mdl.jar msx-wrally/src/rally.asm -dialect asmsx -I msx-wrally/ -warn-off-labelnocolon -po
I get this output:
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 260: 1 bytes saved
    cp 1
Replaced by:
    dec a
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 1351: 1 bytes saved
    srl a
    srl a
    srl a
Replaced by:
    rrca
    rrca
    rrca
    and 31
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 896: 1 bytes saved
    ld b, NUM_HIGH_SCORES ; para saber cuándo dejar de comparar
    ld c, HI_SCORES_SIZE ; para saber cuántos bytes mover
Replaced by:
    ld bc, HI_SCORES_SIZE + 256 * NUM_HIGH_SCORES
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 1044: 1 bytes saved
    ld b, NUM_HIGH_SCORES ; para saber cuándo dejar de comparar
    ld c, HI_SCORES_SIZE ; para saber cuántos bytes mover
Replaced by:
    ld bc, HI_SCORES_SIZE + 256 * NUM_HIGH_SCORES
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 1300: 1 bytes saved
    ld a, SPAT_END
    ld (hl), a
Replaced by:
    ld (hl), SPAT_END
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_page0.asm, line 51: 1 bytes saved
    sla a ; deja un pixel de margen
Replaced by:
    add a, a
PatternBasedOptimizer: 6 patterns applied, 6 bytes saved
Only 6 bytes saved, but hey, again, this is only the beginning
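As a sanity check on one of these substitutions, the small Python sketch below models only the value left in register A (not the flags) and confirms that three "srl a" give the same result as three "rrca" followed by "and 31" for every possible input. The helper names are made up for this illustration. The byte counts come from the standard Z80 encodings: "srl a" is 2 bytes, "rrca" is 1 byte and "and n" is 2 bytes, which gives 6 bytes against 5.

```python
def srl(a):
    """Logical shift right: bit 7 becomes 0."""
    return (a >> 1) & 0xFF


def rrca(a):
    """Rotate right circular: bit 0 wraps around into bit 7."""
    return ((a >> 1) | ((a & 1) << 7)) & 0xFF


def original(a):
    for _ in range(3):
        a = srl(a)
    return a


def replacement(a):
    for _ in range(3):
        a = rrca(a)
    return a & 31  # discard the three bits that wrapped into bits 7..5


# Exhaustive comparison over all 256 possible values of A.
mismatches = [a for a in range(256) if original(a) != replacement(a)]
print("mismatches:", mismatches)

original_bytes = 3 * 2         # three "srl a"
replacement_bytes = 3 * 1 + 2  # three "rrca" plus "and 31"
print("bytes saved:", original_bytes - replacement_bytes)
```

This only shows that A ends up with the same value. The two sequences do not leave the flags in the same state (the carry flag in particular), so a substitution like this is only safe where the following code does not depend on them.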
I also tried it with Grauw's 3d engine project, when running it like this (I had to change the value of the R800 constant in the code beforehand, as I do not have support for R800 at the moment):
java -jar mdl.jar threed/src/COM.asm -I threed/lib/neonlib/src -I threed/gen -dialect glass -po
I get this output:
PatternBasedOptimizer substitution in threed/lib/neonlib/src/VDP.asm, line 21: 2 bytes saved
    cp 1
    jr c, MSX1
    jr z, MSX2
Replaced by:
    cp 1 + 1
    jr c, MSX2
PatternBasedOptimizer substitution in threed/src/Application.asm, line 97: 1 bytes saved
    ld a, 0
Replaced by:
    xor a
PatternBasedOptimizer substitution in threed/lib/neonlib/src/Memory.asm, line 321: 1 bytes saved
    ld a, 0
Replaced by:
    xor a
PatternBasedOptimizer substitution in threed/lib/neonlib/src/VDP.asm, line 167: 1 bytes saved
    ld a, 0 ; set HR line 0
Replaced by:
    xor a
PatternBasedOptimizer substitution in threed/src/Application.asm, line 236: 4 bytes saved
    ld ix, Application_points
Replaced by:
PatternBasedOptimizer substitution in threed/src/Application.asm, line 243: 4 bytes saved
    ld ix, Application_edges
Replaced by:
PatternBasedOptimizer: 6 patterns applied, 13 bytes saved
Again, only a few bytes saved, but hey, it's working. Of course, you can get the optimizer to directly generate the optimized assembler output for you (with the -asm flag, which will generate a single asm file with your whole project, with all macros resolved and all optimizations applied, ready to be compiled).
I verified that all the proposed optimizations are actually safe in those codebases, so I'm quite happy about it, as some involve nested macros that were tricky to parse right.
I also added the functionality to prevent optimizations (if you add "; mdl:no-opt" to any line, it'll prevent any optimization, and if you don't like that pragma code, you can change it with a command-line flag). I also added the flag to specify which optimization pattern file to use.
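For readers wondering how such a pragma can work, here is a rough Python sketch that collects the line numbers carrying the marker so that an optimizer could skip them. It is not the tool's actual implementation: the function name `protected_lines` is hypothetical, and the sketch naively treats the first ";" on a line as the start of the comment, so a ";" inside a string literal would confuse it.

```python
def protected_lines(source_lines, pragma="mdl:no-opt"):
    """Return the 1-based numbers of lines whose comment contains the pragma.

    Simplification: everything after the first ';' is considered a comment.
    """
    protected = set()
    for number, line in enumerate(source_lines, start=1):
        _, separator, comment = line.partition(";")
        if separator and pragma in comment:
            protected.add(number)
    return protected


# A made-up timing loop where each instruction must be left untouched.
delay_routine = [
    "delay:",
    "    ld b, 10      ; mdl:no-opt",
    "loop:",
    "    nop           ; mdl:no-opt",
    "    djnz loop     ; mdl:no-opt",
    "    ret",
]
print(sorted(protected_lines(delay_routine)))
```

Passing a different `pragma` value plays the role of the command-line flag that changes the marker text.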
I'm leaving it here for today, but will continue during the weekend. My next task is to have an option to generate the output we were discussing above so that it can be parsed easily by VSCode/Sublime plugins, and after that I'll go back to improving the optimizer.
Latest version in github: https://github.com/santiontanon/mdlz80optimizer/releases/tag...
btw, the latest version should just require Java 8 now instead of 12 (I hope)
Very cool! I must try it on my other projects.
Btw if you need newer than Java 8 I would use Java 11 which is also LTS.
If you have replacements like this:
PatternBasedOptimizer substitution in msx-wrally//src/rally_rom_code.asm, line 1044: 1 bytes saved
    ld b, NUM_HIGH_SCORES ; para saber cuándo dejar de comparar
    ld c, HI_SCORES_SIZE ; para saber cuántos bytes mover
Replaced by:
    ld bc, HI_SCORES_SIZE + 256 * NUM_HIGH_SCORES
Does it also work if b and c are assigned in reverse order, or if there is some non-related code in-between (let’s say a nop)?
Currently pattern matching is very limited, so I have two patterns: one for b,c->bc and another for c,b->bc. If there is something in between, it'll currently not catch it. But that's a good point. I just added an item to my to-do list to figure out if there is an easy way to allow for that!
As for trying it on other projects, I must warn you that I only added support for the Glass syntax that was used in the "threed" project, so if some other syntax is used that I'm not supporting, it'll fail. But if you try it, do let me know, and I can add support for any additional syntax needed!
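One possible way to handle both orders and unrelated code in between is sketched below in Python. It is only an illustration of the idea, not how the optimizer works: `merge_bc_loads`, `TRANSPARENT` and `is_constant` are hypothetical names, the input is assumed to be bare instructions (no labels or comments), and the only instruction allowed between the two loads is one from a deliberately tiny whitelist of instructions known not to touch B or C.

```python
import re

# Matches "ld b, <operand>" or "ld c, <operand>", but not "ld bc, ...".
LOAD = re.compile(r"^ld\s+([bc])\s*,\s*(.+)$", re.IGNORECASE)
# Conservative, hypothetical whitelist of instructions that never touch B or C.
TRANSPARENT = {"nop"}
REGISTERS = {"a", "b", "c", "d", "e", "h", "l", "i", "r",
             "ixh", "ixl", "iyh", "iyl"}


def is_constant(operand):
    """Rough test: not a register and not a memory reference like (hl)."""
    op = operand.strip().lower()
    return op not in REGISTERS and not op.startswith("(")


def merge_bc_loads(instructions):
    """Merge constant loads into B and C (either order) into one "ld bc"."""
    out = []
    i = 0
    while i < len(instructions):
        first = LOAD.match(instructions[i].strip())
        if first and is_constant(first.group(2)):
            # Skip over instructions that cannot affect B or C.
            j = i + 1
            while (j < len(instructions)
                   and instructions[j].strip().lower() in TRANSPARENT):
                j += 1
            second = (LOAD.match(instructions[j].strip())
                      if j < len(instructions) else None)
            if (second and is_constant(second.group(2))
                    and second.group(1).lower() != first.group(1).lower()):
                values = {first.group(1).lower(): first.group(2).strip(),
                          second.group(1).lower(): second.group(2).strip()}
                out.extend(instructions[i + 1:j])  # keep the skipped code
                out.append(f"ld bc, 256 * ({values['b']}) + ({values['c']})")
                i = j + 1
                continue
        out.append(instructions[i])
        i += 1
    return out


print(merge_bc_loads(["ld c, HI_SCORES_SIZE", "nop", "ld b, NUM_HIGH_SCORES"]))
```

The operands are wrapped in parentheses so that an expression such as "N + 1" keeps its meaning after being multiplied by 256. A real implementation would need a proper per-instruction register dependency check instead of the whitelist, which is exactly the hard part.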
Is there a way to specify include paths?
Edit: Never mind, I see you use it above, it is missing from the readme though :).
[grauw] ~/Development/vgmplay % java -jar ../mdlz80optimizer/target/mdlz80optimizer-0.2-jar-with-dependencies.jar src/COM.asm -I lib/neonlib/src -I lib/gunzip/src -dialect glass -po -popotential
ERROR: expression failed to parse with token list: [,, 0]
ERROR: Cannot parse line lib/gunzip/src/deflate/Alphabet.asm, 26: ds Alphabet_MAX_CODELENGTH * 2, 0
ERROR: Problem including file at src/COM.asm, 55: INCLUDE "deflate/Alphabet.asm"