speech | Masaya Kawamura

BitTTS: Lightweight Text-to-Speech Using 1.58-bit Quantization and Weight Indexing

Sep 11, 2025

SLASH: A Fundamental Frequency Estimation Method Combining Signal Processing and Self-Supervised Learning

Sep 10, 2025

SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch

Sep 1, 2025

Comparative Analysis of Fast and High-Fidelity Neural Vocoders for Low-Latency Streaming Synthesis in Resource-Constrained Environments

Sep 1, 2025

BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing

Sep 1, 2025

Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control

Apr 6, 2025

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

Sep 1, 2024

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Sep 15, 2023

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

Jun 1, 2023