DC Field | Value | Language |
---|---|---|
dc.contributor.author | Seongyeon Park | - |
dc.contributor.author | Bohyung Kim | - |
dc.contributor.author | Oh, Tae-Hyun | - |
dc.date.accessioned | 2023-04-27T04:50:39Z | - |
dc.date.available | 2023-04-27T04:50:39Z | - |
dc.date.created | 2023-04-27 | - |
dc.date.issued | 2023-08-20 | - |
dc.identifier.uri | https://oasis.postech.ac.kr/handle/2014.oak/117585 | - |
dc.description.abstract | Recently, zero-shot TTS and VC methods have gained attention for their practicality: they can generate voices that were unseen during training. Among these methods, zero-shot modifications of the VITS model have shown superior performance while inheriting useful properties from VITS. However, the performance of VITS and VITS-based zero-shot models varies dramatically depending on how the losses are balanced. This can be problematic, as it requires a burdensome procedure of tuning loss-balance hyper-parameters to find the optimal balance. In this work, we propose a novel framework that finds this optimum without search by inducing the decoder of VITS-based models to reach its full reconstruction ability. With our framework, we show superior performance compared to baselines in zero-shot TTS and VC, achieving state-of-the-art performance. Furthermore, we show the robustness of our framework in various settings. We provide an explanation for the results in the discussion. | - |
dc.language | English | - |
dc.publisher | INTERSPEECH | - |
dc.relation.isPartOf | INTERSPEECH Conference | - |
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.title | Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis | - |
dc.type | Conference | - |
dc.type.rims | CONF | - |
dc.identifier.bibliographicCitation | INTERSPEECH Conference | - |
dc.citation.conferenceDate | 2023-08-20 | - |
dc.citation.conferencePlace | IE | - |
dc.citation.title | INTERSPEECH Conference | - |
dc.contributor.affiliatedAuthor | Oh, Tae-Hyun | - |
dc.identifier.scopusid | 2-s2.0-85171567016 | - |
dc.description.journalClass | 1 | - |