
[Hackathon 7th] Fix the issue with max over mixed int and Value inputs #3903

Merged
merged 1 commit into PaddlePaddle:develop on Nov 25, 2024

Conversation

@megemini (Contributor) commented Nov 22, 2024

PR types

Bug fixes

PR changes

Others

Describe

Fix the issue where max is called on mixed int and Value inputs.

The inputs at this call site can look like:

[Value(define_op_name=pd_op.slice, index=0, dtype=builtin.tensor<i32>, stop_gradient=True), 2, Value(define_op_name=pd_op.slice, index=0, dtype=builtin.tensor<i32>, stop_gradient=True), Value(define_op_name=pd_op.slice, index=0, dtype=builtin.tensor<i32>, stop_gradient=True)]
[Value(define_op_name=pd_op.slice, index=0, dtype=builtin.tensor<i32>, stop_gradient=True), 1, 1, Value(define_op_name=pd_op.slice, index=0, dtype=builtin.tensor<i32>, stop_gradient=True)]
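
Purely for illustration (this is not the patch in this PR): with inputs like the above, builtins max has to order plain Python ints against symbolic Value objects, which is what breaks. A minimal sketch of one way such a mixed list could be reduced instead; the helper name safe_max is hypothetical:

import paddle

def safe_max(items):
    # Hypothetical helper: reduce a list mixing Python ints and paddle
    # Tensors/Values by falling back to paddle.maximum whenever a
    # tensor-like element is involved, instead of using builtins max.
    result = items[0]
    for item in items[1:]:
        if isinstance(result, paddle.Tensor) or isinstance(item, paddle.Tensor):
            result = paddle.maximum(
                paddle.to_tensor(result), paddle.to_tensor(item))
        else:
            result = max(result, item)
    return result

# Example with mixed inputs, analogous to the slice shapes shown above:
print(safe_max([paddle.to_tensor(3), 2, paddle.to_tensor(5)]))  # -> Tensor(5)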

Additionally, in paddlespeech/t2s/modules/transformer/embedding.py, self.pe = pe is changed to self.pe = paddle.assign(pe); otherwise the following error is reported:

...
  File "/home/aistudio/.local/lib/python3.8/site-packages/paddle/tensor/creation.py", line 2678, in assign
    _C_ops.assign_out_(input, output)

Sorry about what's happened. In to_static mode, pd_op.assign_out_'s output variable is a viewed Tensor in dygraph. This will result in inconsistent calculation behavior between dynamic and static graphs. You must find the location of the strided ops be called, and call paddle.assign() before inplace input.If you certainly make sure it's safe, you can set env stride_in_no_check_dy2st_diff to 1.

It also runs normally after exporting stride_in_no_check_dy2st_diff=0, so a few open questions:

  • Should self.pe = paddle.assign(pe) be used here (see the sketch after this list)?
  • There are several other assignments of the form self.pe = pe in this file; should they be changed as well?
  • Or should the paddle.assign(pe, self.pe) form be used instead?
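
For reference, a minimal sketch of the kind of change being discussed. The layer below is a simplified stand-in for the positional-encoding code in paddlespeech/t2s/modules/transformer/embedding.py, not the actual file; it only illustrates copying the buffer with paddle.assign instead of keeping a bare reference:

import paddle

class SimplePositionalEncoding(paddle.nn.Layer):
    # Simplified stand-in for the positional-encoding layer pattern.
    def __init__(self, d_model: int = 384, max_len: int = 5000):
        super().__init__()
        position = paddle.arange(0, max_len, dtype="float32").unsqueeze(1)
        div_term = paddle.exp(
            paddle.arange(0, d_model, 2, dtype="float32")
            * -(paddle.log(paddle.to_tensor(10000.0)) / d_model))
        pe = paddle.zeros([max_len, d_model])
        pe[:, 0::2] = paddle.sin(position * div_term)
        pe[:, 1::2] = paddle.cos(position * div_term)
        # Before: self.pe = pe
        # After: copy the buffer explicitly so self.pe does not alias a
        # viewed tensor, which is what trips the dy2st stride check above.
        self.pe = paddle.assign(pe.unsqueeze(0))

    def forward(self, x):
        return x + self.pe[:, : x.shape[1]]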

After the change, the following command runs successfully:

$ FLAGS_allocator_strategy=naive_best_fit FLAGS_fraction_of_gpu_memory_to_use=0.01 python3 ${BIN_DIR}/../synthesize_e2e.py   --am=fastspeech2_aishell3   --am_config=fastspeech2_canton_ckpt_1.4.0/default.yaml   --am_ckpt=fastspeech2_canton_ckpt_1.4.0/snapshot_iter_140000.pdz   --am_stat=fastspeech2_canton_ckpt_1.4.0/speech_stats.npy   --voc=pwgan_aishell3   --voc_config=pwg_aishell3_ckpt_0.5/default.yaml   --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz   --voc_stat=pwg_aishell3_ckpt_0.5/feats_stats.npy   --lang=canton   --text=${BIN_DIR}/../../assets/sentences_canton.txt   --output_dir=exp/default/test_e2e   --phones_dict=fastspeech2_canton_ckpt_1.4.0/phone_id_map.txt   --speaker_dict=fastspeech2_canton_ckpt_1.4.0/speaker_id_map.txt   --spk_id=10   --inference_dir=exp/default/inference
========Args========
am: fastspeech2_aishell3
am_ckpt: fastspeech2_canton_ckpt_1.4.0/snapshot_iter_140000.pdz
am_config: fastspeech2_canton_ckpt_1.4.0/default.yaml
am_stat: fastspeech2_canton_ckpt_1.4.0/speech_stats.npy
inference_dir: exp/default/inference
lang: canton
ngpu: 1
nmlu: 0
nnpu: 0
nxpu: 0
output_dir: exp/default/test_e2e
phones_dict: fastspeech2_canton_ckpt_1.4.0/phone_id_map.txt
pinyin_phone: null
speaker_dict: fastspeech2_canton_ckpt_1.4.0/speaker_id_map.txt
speech_stretchs: null
spk_id: 10
text: /home/aistudio/PaddleSpeech/paddlespeech/t2s/exps/fastspeech2/../../assets/sentences_canton.txt
tones_dict: null
use_rhy: false
voc: pwgan_aishell3
voc_ckpt: pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz
voc_config: pwg_aishell3_ckpt_0.5/default.yaml
voc_stat: pwg_aishell3_ckpt_0.5/feats_stats.npy

========Config========
batch_size: 32
f0max: 400
f0min: 110
fmax: 7600
fmin: 80
fs: 24000
max_epoch: 1000
model:
  adim: 384
  aheads: 2
  decoder_normalize_before: True
  dlayers: 4
  dunits: 1536
  duration_predictor_chans: 256
  duration_predictor_kernel_size: 3
  duration_predictor_layers: 2
  elayers: 4
  encoder_normalize_before: True
  energy_embed_dropout: 0.0
  energy_embed_kernel_size: 1
  energy_predictor_chans: 256
  energy_predictor_dropout: 0.5
  energy_predictor_kernel_size: 3
  energy_predictor_layers: 2
  eunits: 1536
  init_dec_alpha: 1.0
  init_enc_alpha: 1.0
  init_type: xavier_uniform
  pitch_embed_dropout: 0.0
  pitch_embed_kernel_size: 1
  pitch_predictor_chans: 256
  pitch_predictor_dropout: 0.5
  pitch_predictor_kernel_size: 5
  pitch_predictor_layers: 5
  positionwise_conv_kernel_size: 3
  positionwise_layer_type: conv1d
  postnet_chans: 256
  postnet_filts: 5
  postnet_layers: 5
  reduction_factor: 1
  spk_embed_dim: 256
  spk_embed_integration_type: concat
  stop_gradient_from_energy_predictor: False
  stop_gradient_from_pitch_predictor: True
  transformer_dec_attn_dropout_rate: 0.2
  transformer_dec_dropout_rate: 0.2
  transformer_dec_positional_dropout_rate: 0.2
  transformer_enc_attn_dropout_rate: 0.2
  transformer_enc_dropout_rate: 0.2
  transformer_enc_positional_dropout_rate: 0.2
  use_scaled_pos_enc: True
n_fft: 2048
n_mels: 80
n_shift: 300
num_snapshots: 5
num_workers: 2
optimizer:
  learning_rate: 0.001
  optim: adam
seed: 10086
updater:
  use_masking: True
win_length: 1200
window: hann
allow_cache: True
batch_max_steps: 24000
batch_size: 8
discriminator_grad_norm: 1
discriminator_optimizer_params:
  epsilon: 1e-06
  weight_decay: 0.0
discriminator_params:
  bias: True
  conv_channels: 64
  in_channels: 1
  kernel_size: 3
  layers: 10
  nonlinear_activation: LeakyReLU
  nonlinear_activation_params:
    negative_slope: 0.2
  out_channels: 1
  use_weight_norm: True
discriminator_scheduler_params:
  gamma: 0.5
  learning_rate: 5e-05
  step_size: 200000
discriminator_train_start_steps: 100000
eval_interval_steps: 1000
fmax: 7600
fmin: 80
fs: 24000
generator_grad_norm: 10
generator_optimizer_params:
  epsilon: 1e-06
  weight_decay: 0.0
generator_params:
  aux_channels: 80
  aux_context_window: 2
  dropout: 0.0
  gate_channels: 128
  in_channels: 1
  kernel_size: 3
  layers: 30
  out_channels: 1
  residual_channels: 64
  skip_channels: 64
  stacks: 3
  upsample_scales: [4, 5, 3, 5]
  use_weight_norm: True
generator_scheduler_params:
  gamma: 0.5
  learning_rate: 0.0001
  step_size: 200000
lambda_adv: 4.0
n_fft: 2048
n_mels: 80
n_shift: 300
num_save_intermediate_results: 4
num_snapshots: 10
num_workers: 4
pin_memory: True
remove_short_samples: True
save_interval_steps: 5000
seed: 42
stft_loss_params:
  fft_sizes: [1024, 2048, 512]
  hop_sizes: [120, 240, 50]
  win_lengths: [600, 1200, 240]
  window: hann
train_max_steps: 1000000
win_length: 1200
window: hann
frontend done!
W1122 10:37:30.571856 22376 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.0, Runtime API Version: 11.8
W1122 10:37:30.573297 22376 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
/home/aistudio/.local/lib/python3.8/site-packages/paddle/nn/layer/layers.py:2194: UserWarning: Skip loading for encoder.embed.1.alpha. encoder.embed.1.alpha receives a shape [1], but the expected shape is [].
/home/aistudio/.local/lib/python3.8/site-packages/paddle/nn/layer/layers.py:2194: UserWarning: Skip loading for decoder.embed.0.alpha. decoder.embed.0.alpha receives a shape [1], but the expected shape is [].
acoustic model done!
voc done!
convert am and voc to static model.
/home/aistudio/.local/lib/python3.8/site-packages/paddle/jit/dy2static/program_translator.py:747: UserWarning: full_graph=False don't support input_spec arguments. It will not produce any effect.
You can set full_graph=True, then you can assign input spec.

W1122 10:37:36.563637 22376 pd_api.cc:31283] got different data type, run type promotion automatically, this may cause data type been changed.
/home/aistudio/.local/lib/python3.8/site-packages/paddle/jit/dy2static/program_translator.py:747: UserWarning: full_graph=False don't support input_spec arguments. It will not produce any effect.
You can set full_graph=True, then you can assign input spec.

001 白云山爬过一次嘅,好远啊,爬上去都成两个钟
I1122 10:37:41.684955 22376 pir_interpreter.cc:1564] New Executor is Running ...
I1122 10:37:42.039050 22376 pir_interpreter.cc:1591] pir interpreter is running by multi-thread mode ...
001, mel: [163, 80], wave: (119700, 1), time: 5376s, Hz: 22.265625, RTF: 1077.8947368421052.
001 done!
002 睇书咯,番屋企,而家好多人好少睇书噶喎
002, mel: [237, 80], wave: (113100, 1), time: 4007s, Hz: 28.225605190915896, RTF: 850.291777188329.
002 done!
003 因为如果唔考试嘅话,工资好低噶
003, mel: [117, 80], wave: (93600, 1), time: 2628s, Hz: 35.61643835616438, RTF: 673.8461538461539.
003 done!
004 冇固定噶,你中意休边日就边日噶
004, mel: [184, 80], wave: (86400, 1), time: 2738s, Hz: 31.555880204528854, RTF: 760.5555555555555.
004 done!

@zxcd @SigureMo @Liyulingyue

paddle-bot bot commented Nov 22, 2024

Thanks for your contribution!

@mergify mergify bot added the T2S label Nov 22, 2024
@zxcd (Collaborator) left a comment


LGTM

@zxcd zxcd merged commit 7fd5abd into PaddlePaddle:develop Nov 25, 2024
5 checks passed