Is there a way to extract embeddings for prosody, content and acoustic details? #7

iv41879 · 2024-09-12T14:18:06Z

@lifeiteng Is there a way to extract embeddings for prosody, content and acoustic details? Thank you

hmohebbi · 2024-11-20T15:09:27Z

I tried the following (the authors may correct me if this is wrong):

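import torch

# fa_encoder, fa_decoder and test_wav are assumed to be loaded already.
with torch.no_grad():
    enc_out = fa_encoder(test_wav)
    vq_post_emb, vq_id, _, quantized, spk_embs = fa_decoder(enc_out, eval_vq=False, vq=True)

    # Split the stacked codes by codebook index:
    # the first is prosody, the next two are content, the rest are residual.
    prosody_code = vq_id[:1]
    content_code = vq_id[1:3]
    residual_code = vq_id[3:]

    # Map each group of codes back to embeddings with its own quantizer.
    quantizer = fa_decoder.quantizer.eval()
    prosody_embedding = quantizer[0].vq2emb(prosody_code)
    content_embedding = quantizer[1].vq2emb(content_code)
    residual_embedding = quantizer[2].vq2emb(residual_code)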
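
To make the slicing in the snippet above easier to follow, here is a toy sketch that runs without the model. It only illustrates the idea of splitting stacked codes into groups and turning each group back into one embedding per frame. Everything in it is an assumption for illustration: the sizes, the random codebooks, and the helper names `split_codes` and `codes_to_embedding` are hypothetical, and summing the per-layer lookups is how residual vector quantization usually works, not a confirmed description of what `vq2emb` does in this repo.

```python
import numpy as np

# Hypothetical sizes for illustration only; not taken from the actual model.
NUM_LAYERS = 6  # total number of codebooks stacked in vq_id
GROUP_SIZES = {"prosody": 1, "content": 2, "residual": 3}
CODEBOOK_SIZE = 16
EMB_DIM = 8
BATCH, FRAMES = 2, 5

rng = np.random.default_rng(0)

# Stand-in for vq_id: integer codes with shape (layers, batch, frames).
vq_id = rng.integers(0, CODEBOOK_SIZE, size=(NUM_LAYERS, BATCH, FRAMES))

# One toy codebook per layer, each with shape (codebook_size, emb_dim).
codebooks = rng.standard_normal((NUM_LAYERS, CODEBOOK_SIZE, EMB_DIM))


def split_codes(codes, group_sizes):
    """Slice the stacked codes along the first axis, one slice per group."""
    out, start = {}, 0
    for name, size in group_sizes.items():
        out[name] = codes[start:start + size]
        start += size
    return out


def codes_to_embedding(codes, books):
    """Look up each layer's codes in its codebook and sum over the layers."""
    # book[layer_codes] has shape (batch, frames, emb_dim).
    return sum(book[layer_codes] for book, layer_codes in zip(books, codes))


groups = split_codes(vq_id, GROUP_SIZES)

embeddings = {}
start = 0
for name, codes in groups.items():
    books = codebooks[start:start + len(codes)]
    embeddings[name] = codes_to_embedding(codes, books)
    start += len(codes)

for name, emb in embeddings.items():
    print(name, groups[name].shape, emb.shape)
```

With these toy sizes, `groups["content"]` is the same slice as `vq_id[1:3]`, and every entry in `embeddings` has shape `(batch, frames, emb_dim)`. The real tensors may be laid out differently, so check the shapes of `vq_id` and of the `vq2emb` outputs before relying on them.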
