Diffusion-based voice conversion model for one-shot speaker style transfer. Introduces a novel SDE solver for fast maximum likelihood sampling, enabling high-quality voice conversion that preserves linguistic content. Accepted as an oral presentation at ICLR 2022; the paper reports higher quality than state-of-the-art one-shot voice conversion approaches.

Outputs 2

DiffVC

model

Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme

paper

arXiv: 2109.13821

audio, generation, open-source