I agree, CPU time isn't a problem here. For human voice you should span from 450 Hz to 5000 Hz (maybe 6000 Hz gives better results for female voices). I would use 5 Hz steps or less (errors of 3 Hz are usually not perceived).
Mmmm, that makes about 1000 parallel Goertzels, it might be a bit slow... I'm tempted to first add a real FFT to SofaCas for clean header frequency detection; converting the project to "WAV2MSXVOICE (??)" would then be easy.
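For a feel of what such a Goertzel bank costs, here is a minimal Python sketch, not anybody's actual tool. The names `goertzel_power` and `scan_band`, the 16 kHz sample rate and the test tone are all made up for illustration. Scanning 450 Hz to 5000 Hz in 5 Hz steps gives 911 probes per frame.

```python
import math

def goertzel_power(samples, fs, freq):
    """Squared magnitude of `samples` at `freq` Hz, computed with one Goertzel pass."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / fs)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def scan_band(samples, fs, f_lo=450.0, f_hi=5000.0, step=5.0):
    """Run one Goertzel per probe frequency; returns a list of (freq, power)."""
    count = int(round((f_hi - f_lo) / step)) + 1
    return [(f_lo + k * step, goertzel_power(samples, fs, f_lo + k * step))
            for k in range(count)]

# Hypothetical test signal: 50 ms of a 1000 Hz sine sampled at 16 kHz.
fs = 16000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(800)]
bank = scan_band(tone, fs)
best = max(bank, key=lambda p: p[1])
print(len(bank), "probes, strongest at", best[0], "Hz")
```

Each probe is one multiply and two adds per sample, so the cost grows linearly with the number of probes. That is why an FFT becomes attractive once the grid gets this dense.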
@Manuel
Try this version in Octave:
close all;
clear;
path = 'wav\SKYJAGUAR\';
names = dir([path '*.wav']);
nfiles = size(names,1);

for ii = 1:nfiles
    ii
    name = [ path names(ii).name];
    [Y,FS,NBITS] = wavread(name);
    % mix stereo down to one channel
    if size(Y,2)>1
        X = Y(:,1)+Y(:,2);
    else
        X = Y;
    end
    % 5th-order band-pass; note that butter() takes Wn relative to FS/2,
    % so as written these edges land at 225 Hz and 3000 Hz
    Wn = [450/FS, 6000/FS];
    [Bbp,Abp]=butter(5,Wn);
    % one analysis block per 60 Hz interrupt
    Tntsc = 1/60;
    Nntsc = fix(Tntsc*FS);
    Nblk = fix(length(X)/Nntsc);
    X = X(1:Nblk*Nntsc);
    L = length(X);
    t = [1:Nntsc]';
    Y = zeros(Nblk*Nntsc,1);
    f = zeros(Nblk,8);
    a = zeros(Nblk,8);
    p = zeros(1,8);
    X = filter(Bbp,Abp,X);
    for i=1:Nblk
        x = X((i-1)*Nntsc+1:i*Nntsc);
        XF = abs(fft(x));
        % spectral peaks of this block, strongest first
        [pks,locs]= findpeaks(XF(1:round(Nntsc/2)));
        [y,j]=sort(pks,'descend');
        pks = pks(j);
        locs = locs(j);
        % pad to 8 entries when fewer peaks were found
        if size(pks,1)<8
            y = zeros(1,8);
            y(1:length(pks)) = pks;
            pks = y;
            y = round(Nntsc/2)-7:round(Nntsc/2);
            y(1:length(locs)) = locs;
            locs = y;
        end
        pks = pks(8:-1:1);
        locs = locs(8:-1:1);
        % resynthesize the block from the 8 strongest peaks
        y = zeros(size(x));
        freq = zeros(1,8);
        amp = zeros(1,8);
        for ti=1:8
            j = locs(ti);
            freq(ti) = (j-1)/Nntsc*FS;
            amp(ti) = abs(XF(j))/Nntsc;
            y = y + amp(ti)*(sin(2*pi*freq(ti)*t/FS+p(ti)));
            p(ti) = 2*pi*freq(ti)*t(end)/FS;
        end
        Y((i-1)*Nntsc+1:i*Nntsc) = y;
        f(i,:) = freq;
        a(i,:) = amp;
    end
    sound(X,FS)
    sound(Y,FS)
    % figure (ii)
    % subplot(2,1,1),plot(abs(fftshift(fft(x))));
    % subplot(2,1,2),plot(abs(fftshift(fft(y))));
    % plot(abs(fftshift(fft(X))));
    % hold on
    % plot(abs(fftshift(fft(Y))),'r');
    % pause(0.5)
    % wavwrite(Y,FS,NBITS,[name 'out.wav'])
    % tone periods and 4-bit volumes for the player
    TP = uint16(3579545./(32*f));
    m = max(a(:));
    nscc = uint8(a/m*15);
    npsg = 2*log2(a/m)+15;
    npsg(isinf(npsg))=0;
    npsg = uint8(ceil(npsg));
    n = npsg*16+nscc; % in the same byte psg and scc volumes
    fid = fopen([name 'frm_scc3.txt'],'w');
    for i = 1:Nblk
        fprintf(fid,' dw %d,%d,%d,%d,%d,%d,%d,%d \n',TP(i,1),TP(i,2),TP(i,3),TP(i,4),TP(i,5),TP(i,6),TP(i,7),TP(i,8));
        fprintf(fid,' db 0x%s,0x%s,0x%s,0x%s,0x%s,0x%s,0x%s,0x%s \n',dec2hex(n(i,1),2),dec2hex(n(i,2),2),dec2hex(n(i,3),2),dec2hex(n(i,4),2),dec2hex(n(i,5),2),dec2hex(n(i,6),2),dec2hex(n(i,7),2),dec2hex(n(i,8),2));
    end
    fclose(fid);
end

% index file that includes the per-sample frame data
fid = fopen('frm_scc3.txt','w');
fprintf(fid,'nfiles: equ %d \n\n',nfiles);
fprintf(fid,' page 0\n');
fprintf(fid,'frames: \n');
for ii = 1:nfiles
    fprintf(fid,' dw frame%d\n',ii-1);
    fprintf(fid,' db :frame%d\n',ii-1);
end
for ii = 1:nfiles
    fprintf(fid,' page 1..31\n');
    name = [ path names(ii).name];
    fprintf(fid,'frame%d: \n',ii-1);
    fprintf(fid,' include %s \n',[name 'frm_scc3.txt']);
    fprintf(fid,' db 080h\n');
end
fclose(fid);
!sjasm -Iasm -s sccLOFI3.asm
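For anyone who wants to check the exported numbers by hand, the last step of the script (tone period = 3579545/(32*f), plus a linear SCC volume and a logarithmic PSG volume packed into one byte) can be mirrored in a few lines of Python. This is only a sketch of my reading of the script. The function names are invented, and the rounding tries to imitate Octave's uint8/uint16 conversions.

```python
import math

CLOCK_HZ = 3579545  # clock value used in the Octave script

def tone_period(freq_hz):
    """Period value for a tone of freq_hz, rounded and capped like uint16()."""
    if freq_hz <= 0:
        return 65535  # uint16(Inf) saturates in Octave
    return min(65535, int(math.floor(CLOCK_HZ / (32.0 * freq_hz) + 0.5)))

def scc_volume(amp, amp_max):
    """Linear 4-bit volume, as in uint8(a/m*15)."""
    return int(math.floor(amp / amp_max * 15.0 + 0.5))

def psg_volume(amp, amp_max):
    """Logarithmic 4-bit volume, as in ceil(2*log2(a/m)+15), floored at 0."""
    if amp <= 0:
        return 0
    return max(0, math.ceil(2.0 * math.log2(amp / amp_max) + 15.0))

def pack_volumes(amp, amp_max):
    """PSG volume in the high nibble, SCC volume in the low nibble."""
    return psg_volume(amp, amp_max) * 16 + scc_volume(amp, amp_max)

print(tone_period(440.0), hex(pack_volumes(0.5, 1.0)))
```

The 2*log2 term fits a PSG whose volume steps are about 3 dB apart, so halving the amplitude drops the PSG volume by two steps while the SCC volume simply halves.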
Why not run the binary search for maxima 50 times, each in a different 100 Hz interval, and choose the 8 highest results?
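A rough Python sketch of that idea, with one simplification: instead of a binary search inside each interval, it just scans magnitudes that have already been computed. The function name `top_peaks`, its parameters and the sample data are hypothetical.

```python
def top_peaks(spectrum, f_lo=450.0, f_hi=5450.0, band=100.0, keep=8):
    """Keep the strongest (freq, magnitude) pair per band, then the `keep` best bands.

    spectrum: iterable of (freq_hz, magnitude) pairs on any frequency grid.
    """
    best = {}
    for freq, mag in spectrum:
        if f_lo <= freq < f_hi:
            idx = int((freq - f_lo) // band)  # which 100 Hz interval
            if idx not in best or mag > best[idx][1]:
                best[idx] = (freq, mag)
    winners = sorted(best.values(), key=lambda p: p[1], reverse=True)
    return winners[:keep]

# Made-up magnitudes: two candidates share the first interval.
spec = [(500.0, 3.0), (520.0, 9.0), (700.0, 5.0), (6000.0, 99.0)]
print(top_peaks(spec, keep=2))
```

One thing to watch: a single broad peak sitting on an interval border can win in two neighbouring intervals, so it may be worth rejecting winners that are too close to each other.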
If I leave out the sjasm line, I get output. Not sure whether it's correct, though.
I've sent you a link to my asm files.
Try them with your output included.
Totally true! This is why I was asking for snippets to detect the OPLL and play tones at variable volumes and frequencies. The OPLL is still a bit obscure to me, but it should be perfectly suitable for the purpose.
I don't have any ready snippets for you, but it's good that you are good at math. :) The frequency that is finally played is the result of the frequency number (F-number), the octave (BLOCK) and the multiplier (MUL).
See here: https://www.msx.org/wiki/MSX-Music_programming#FM-PAC
and here: http://www.smspower.org/maxim/Documents/YM2413ApplicationMan...
(Volume explained in page 17)
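To make the F-number/BLOCK relation concrete, here is a small Python sketch based on my reading of the application manual: F-number = freq * 2^18 / fsam / 2^(BLOCK-1), with fsam = clock/72. Treat the constants and the helper names (`opll_fnum_block`, `opll_freq`) as assumptions and check them against the linked documents. MUL is passed in as the effective multiplication factor.

```python
OPLL_CLOCK_HZ = 3579545          # MSX-Music clock (assumed)
FSAM = OPLL_CLOCK_HZ / 72.0      # internal sample rate, about 49716 Hz

def opll_fnum_block(freq_hz):
    """Return (fnum, block) for freq_hz with MUL = 1.

    Picks the lowest block whose 9-bit F-number still fits, which keeps
    the frequency resolution as fine as possible.
    """
    for block in range(8):
        fnum = int(round(freq_hz * 2 ** 19 / FSAM / 2 ** block))
        if fnum < 512:
            return fnum, block
    raise ValueError("frequency too high for F-number/BLOCK")

def opll_freq(fnum, block, mul=1.0):
    """Frequency in Hz produced by a given F-number, BLOCK and MUL factor."""
    return fnum * FSAM * 2 ** block / 2 ** 19 * mul

fnum, block = opll_fnum_block(440.0)
print(fnum, block, opll_freq(fnum, block))
```

The round trip shows how coarse the grid is: the chip cannot hit every frequency exactly, so a spectral peak ends up on the nearest available step.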
@ARTAG: Do you take into account that the PSG plays square waves instead of sine waves? Maybe you could, after you found the first spectral peak, subtract this peak including all the harmonics caused by the square wave (instead of sine wave) before looking for the 2nd peak. This *might* improve sound quality a bit.
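A minimal Python sketch of that subtraction, under simplifying assumptions: it works on magnitudes only (phase is ignored), it assumes an ideal square wave whose odd harmonic k has 1/k of the fundamental's magnitude, and the name `pick_square_tones` and the toy spectrum are invented.

```python
def pick_square_tones(mags, bin_hz, count=8, max_harmonic=9):
    """Greedy peak picking that removes each chosen tone's square-wave harmonics.

    mags: magnitude spectrum, index i corresponds to i * bin_hz.
    Returns a list of (freq_hz, magnitude) in the order they were picked.
    """
    residual = list(mags)
    tones = []
    for _ in range(count):
        i = max(range(1, len(residual)), key=residual.__getitem__)
        amp = residual[i]
        if amp <= 0:
            break  # nothing left to explain
        tones.append((i * bin_hz, amp))
        # Remove the fundamental (k = 1) and the odd harmonics 3, 5, ...
        for k in range(1, max_harmonic + 1, 2):
            j = i * k
            if j >= len(residual):
                break
            residual[j] = max(0.0, residual[j] - amp / k)
    return tones

# Toy spectrum: a 100 Hz square (harmonics at 300 and 500 Hz)
# plus an independent 300 Hz component of magnitude 0.5.
mags = [0.0] * 64
mags[10] = 1.0
mags[30] = 1.0 / 3 + 0.5
mags[50] = 1.0 / 5
tones = pick_square_tones(mags, 10.0)
print(tones)
```

In the toy case the 500 Hz harmonic is fully explained by the first tone and never gets its own channel, which is the point of the exercise.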
I wondered about this one, too.
I think Fourier is "multiply the wave by a probe sine; the integral is the amplitude at that probe frequency".
Now what if one uses probe squares instead of sines?
Maybe that's all there is to it, one needs to do nothing more, and the resulting peak graph is ideal for the PSG.
In that graph a 100 Hz square wave would give a peak at 100 Hz and no peak at 300 Hz.
"Only a 100 Hz instruction, the 300 Hz is implied": the diagram for square-wave machines.
The whole distribution might be slightly different, with different tones turning out to be the most important.
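The square-probe idea is easy to try numerically. The Python sketch below is only an experiment with made-up parameters (48 kHz sample rate, probes in phase with the signal). It correlates a 100 Hz square wave with square probes at 100 Hz and 300 Hz.

```python
def square(n, period):
    """Sample n of a +1/-1 square wave with the given period in samples."""
    return 1 if (n % period) < period // 2 else -1

def square_probe(signal, period):
    """Mean of signal times an in-phase square probe (a 'square Fourier' bin)."""
    return sum(x * square(n, period) for n, x in enumerate(signal)) / len(signal)

FS = 48000                          # hypothetical sample rate
P100, P300 = FS // 100, FS // 300   # periods in samples: 480 and 160
sig = [square(n, P100) for n in range(FS)]  # one second of a 100 Hz square

at_100 = square_probe(sig, P100)
at_300 = square_probe(sig, P300)
print(at_100, at_300)
```

In this experiment the 300 Hz probe does not come out at zero but at about a third of the 100 Hz value, because square waves at f and 3f are not orthogonal the way sines are. So the 300 Hz peak would shrink rather than vanish, and some subtraction step would probably still be needed.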
poke &hfd10,&h38 : x = usr(0)
Poke it before every call to usr.
This poke improves the PSG a lot: the noise channel was on.
With that and the improvements the last ROM brought, a naked PSG now does better than the SCC did before.
Noise was added on purpose to mask the higher harmonics of the square waves.
Maybe one can tune it by adding it to a weaker channel, but it improves the perceived quality (unless your audio system is severely filtering high frequencies).
Wouter
I was looking closer at the data and, allowing for some errors, one can maybe see harmonics.
The fact is that taking this aspect into account is not easy. One could design the SCC wave to match those higher harmonics.
This would give a better spectral approximation but, analysis aside (the algorithm is yet to be invented), the player would have to mix tones at different frequencies with different amplitudes and update the 4 waves at each interrupt.
In this way one could encode only the relative amplitudes of the higher harmonics in the data and encode voice without making the data explode.
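As a sketch of what the player side could look like, here is a few lines of Python that build one 32-sample signed SCC waveform from a list of relative harmonic amplitudes. The function name `scc_wave`, the zero-phase sines and the scaling to +/-127 are my assumptions, not an existing player format.

```python
import math

def scc_wave(harmonics, length=32):
    """Build a signed 8-bit waveform from relative harmonic amplitudes.

    harmonics[0] is the fundamental, harmonics[1] the 2nd harmonic, and so on.
    All partials start at phase zero (an assumption; real data would carry phase).
    """
    raw = [sum(a * math.sin(2.0 * math.pi * (k + 1) * n / length)
               for k, a in enumerate(harmonics))
           for n in range(length)]
    peak = max(abs(v) for v in raw)
    if peak == 0:
        return [0] * length
    # Normalize so the largest sample uses the full signed byte range.
    return [int(round(v / peak * 127)) for v in raw]

pure = scc_wave([1.0])                          # plain sine
rich = scc_wave([1.0, 0.0, 1.0 / 3, 0.0, 0.2])  # first terms of a square wave
print(pure)
print(rich)
```

With 32 samples per period only harmonics up to the 15th can be represented, so a handful of relative amplitudes per wave is already close to everything the hardware can reproduce.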