how to control upsample scales #169
By the way, I customized some parameters in the JSON file as:
Could you help me see what's wrong with the settings?
I think I solved it. Although I changed the upsample parameters to [2,4,4,4], train.py did not receive them, so I changed the code in build_model from
to
and now hparams.upsample_params passes the upsample scale parameters from the JSON file.
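A quick way to confirm which scales the training script will actually see is to read them from the loaded JSON before building the model. The sketch below is only an illustration: the helper name `resolve_upsample_scales` is hypothetical, and the nested layout (`upsample_scales` inside `upsample_params`, with a top-level `upsample_scales` as the older form) is an assumption based on this thread, not a confirmed description of the repository's presets.

```python
import json


def resolve_upsample_scales(config):
    """Return the upsample scales from a preset dict.

    Assumed layout (hypothetical): newer presets nest the scales under
    "upsample_params"; older ones keep "upsample_scales" at the top level.
    """
    nested = config.get("upsample_params", {})
    if "upsample_scales" in nested:
        return list(nested["upsample_scales"])
    if "upsample_scales" in config:
        return list(config["upsample_scales"])
    raise KeyError("no upsample_scales found in preset")


# Two hypothetical presets, one per assumed layout.
new_style = json.loads('{"upsample_params": {"upsample_scales": [2, 4, 4, 4]}}')
old_style = json.loads('{"upsample_scales": [4, 4, 4, 4]}')

print(resolve_upsample_scales(new_style))
print(resolve_upsample_scales(old_style))
```

Printing the resolved value right before the model is built makes it obvious when an edit to the JSON file is being ignored.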
As noted in Line 75 in c0ac05e
np.prod(upsample_scales) must be equal to hop_size. This is the reason you got the assertion error.
Looks like you are using an old JSON file. Top-level
Ah, I haven't updated https://github.com/r9y9/wavenet_vocoder/tree/c0ac05e41f9f563421172034e9398633df172b4f/presets, which may confuse you. I will simply delete them.
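The constraint above can be checked before training starts. This is a minimal sketch, not the repository's own assertion; the function name `check_upsample_scales` is hypothetical.

```python
import numpy as np


def check_upsample_scales(upsample_scales, hop_size):
    """Return True when the total upsampling factor equals hop_size."""
    total = int(np.prod(upsample_scales))
    return total == hop_size


print(check_upsample_scales([4, 4, 4, 4], 256))  # True: 4*4*4*4 = 256
print(check_upsample_scales([2, 4, 4, 4], 128))  # True: 2*4*4*4 = 128
print(check_upsample_scales([4, 4, 4, 4], 128))  # False: 256 != 128
```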
I used the JSON file you provided at the Hyper params URL under Pre-trained models. Do you mean we do not need the upsample_scales parameter anymore? Could you provide the new JSON file? I encountered a similar upsample problem when I tried to use a trained model to synthesize audio files: it seems that the upsampled c's size(-1) at line 276 in wavenet.py does not match T.
For pretrained models, please check out the specific git commit as noted in the README.
Yeah, I checked out the specific version when trying synthesis. But for training a new model on my own data, I think I mixed an older version with the specific version.
Hey, I'd like to ask again. Although the model trains smoothly with the specified upsample scales, it can't be used to synthesize audio with the same JSON file, because the upsample network does not upsample the conditioning input c by the exact factor (for me it gives 127.xxxx instead of 128). I am not sure what may cause this problem.
wavenet_vocoder/datasets/wavallin.py Lines 97 to 100 in 8cc0c2d
If you use our preprocessing script, upsampling is expected to work correctly. I'm not really sure what you are hitting. You might want to try pdb or ipdb debugging to isolate your problem.
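One possible source of a non-integer ratio such as 127.x (an assumption; nothing in this thread confirms it) is an audio length that is not an exact multiple of the number of conditioning frames. The sketch below pads or trims a waveform so that its length equals `num_frames * hop_size`. The helper name `align_audio_to_frames` is hypothetical and is not taken from the preprocessing script.

```python
import numpy as np


def align_audio_to_frames(audio, num_frames, hop_size):
    """Zero-pad or trim audio so that len(audio) == num_frames * hop_size."""
    target = num_frames * hop_size
    if len(audio) < target:
        audio = np.pad(audio, (0, target - len(audio)), mode="constant")
    return audio[:target]


hop_size = 128
num_frames = 79
audio = np.zeros(10050)            # 10050 / 79 is roughly 127.2, not 128

aligned = align_audio_to_frames(audio, num_frames, hop_size)
print(len(audio) / num_frames)     # non-integer ratio before alignment
print(len(aligned) // num_frames)  # 128 after alignment
```

Comparing `len(audio) / num_frames` for a few training and synthesis inputs should show whether the mismatch comes from the data or from the upsampling layers.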
Hey, I tried to see what happens in upsample_net. I set the scales to [2, 4, 4, 4] (which is supposed to upsample by 128). But during training when I print the
However, during synthesis, which uses the code at line 275 in wavenet.py,
upsample_net does not produce c.size(-1) == T.
I did some further debugging, and there are still some things confusing me:
I use the default setting of [4,4,4,4] in 20180510_mixture_lj_checkpoint_step000320000_ema.json for the upsample parameters, and I got an error from
in wavenet.py.
I printed the sizes of c and x: torch.Size([2, 32, 19968]), torch.Size([2, 1, 9984])
It seems c is twice the size of x. I tried changing the parameters to [2,4,4,4], but it did not work.
Or should I change other parameters?
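The printed sizes can be worked through by hand. The sketch below assumes that c was upsampled by the product of the configured scales and that x should end up with the same length. Under that assumption the numbers are consistent with a hop_size of 128. The function name `implied_hop_size` is hypothetical.

```python
import numpy as np


def implied_hop_size(c_len, x_len, upsample_scales):
    """Infer the frame count and hop size implied by the printed tensor sizes."""
    factor = int(np.prod(upsample_scales))  # total upsampling applied to c
    num_frames = c_len // factor            # frames before upsampling
    return num_frames, x_len // num_frames  # samples of x per frame


num_frames, hop = implied_hop_size(19968, 9984, [4, 4, 4, 4])
print(num_frames)                     # 78, since 19968 / 256 = 78
print(hop)                            # 128, since 9984 / 78 = 128
print(int(np.prod([2, 4, 4, 4])))     # 128, which would match that hop size
```

So [2,4,4,4] is the right product for these sizes, and the setting most likely fails only because the model does not receive the changed values.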