As @marcoo pointed out above, the SE-ONE already does it on its own. I don't know how it works, though. I was discussing it with a friend who has one... He said it would be perfectly possible for a game to have an MP3 soundtrack with it. Maybe it would have to be distributed on some kind of mixed media, like a microSD card for the soundtrack and something else for the game itself, since we wouldn't be able to read the microSD card while the SE-ONE is playing music. I don't really know if its card slot can be used as storage for the MSX.
EDIT: AFAIK, many of the Raspberry Pi interfaces that have been proposed or are currently in development can already do something of the sort as well. But then you are using the MSX to control an entire computer. It's a bit of overkill, but fun anyway.