Masaya Kawamura1, Tomohiko Nakamura1, Daichi Kitamura2, Hiroshi Saruwatari1, Yu Takahashi3, Kazunobu Kondo3
1The University of Tokyo, Tokyo, Japan
2National Institute of Technology, Kagawa College, Kagawa, Japan
3Yamaha Corporation, Shizuoka, Japan
This page accompanies the paper and provides audio examples of the signals synthesized by the proposed and conventional methods. The mixture and ground-truth signals are taken from the University of Rochester Multimodal Music Performance (URMP) dataset [2] and the PHENICX-Anechoic dataset [3, 4]. None of these audio signals are included in the training data.
| Dataset | Mixture | Instrument | Ground Truth | SISS+DDSP | SISS+Proposed | SI-Proposed |
|---|---|---|---|---|---|---|
| URMP dataset | Viola/Flute | Viola | (audio) | (audio) | (audio) | (audio) |
| | | Flute | (audio) | (audio) | (audio) | (audio) |
| | Flute 1/Flute 2 | Flute 1 | (audio) | (audio) | (audio) | (audio) |
| | | Flute 2 | (audio) | (audio) | (audio) | (audio) |
| PHENICX-Anechoic dataset | Cello/Double bass | Cello | (audio) | (audio) | (audio) | (audio) |
| | | Double bass | (audio) | (audio) | (audio) | (audio) |
| | Flute 1/Flute 2 | Flute 1 | (audio) | (audio) | (audio) | (audio) |
| | | Flute 2 | (audio) | (audio) | (audio) | (audio) |