Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Jun 1, 2023

Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana

Type

Conference paper

Publication

In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing

Last updated on Jun 1, 2023

icassp tts speech
