Samples from "A NEW GAN-BASED END-TO-END TTS TRAINING ALGORITHM"

Authors: Haohan Guo, FK Soong, Lei He, Lei Xie
Abstract: End-to-end (e2e), autoregressive model-based TTS has shown significant performance improvements over conventional TTS. However, training of the autoregressive module is affected by “exposure bias”, the mismatch between the distributions of real and predicted data. Real data is available in training, but in testing only predicted data is available to feed the autoregressive module. By introducing both real and generated data sequences in training, we can alleviate the effects of exposure bias. We propose to use a Generative Adversarial Network (GAN) along with the key idea of “Professor Forcing” in e2e training. A discriminator in the GAN is jointly trained to equalize the difference between real and predicted data. In an AB subjective listening test, the new approach is preferred over the standard transfer learning with a CMOS improvement of 0.1. Sentence-level intelligibility tests show significant improvement on a pathological test set. The GAN-trained model is also more stable than the baseline and produces better alignments for the Tacotron output.
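
The exposure bias described in the abstract can be illustrated with a toy sketch. This is not the authors' Tacotron or GAN setup: the scalar "frames", the 0.9 decay of the real sequence, and the slightly mis-estimated 0.95 model are all hypothetical values chosen for illustration. The sketch only contrasts feeding the model real previous frames (as in training) with feeding it its own predictions (as in testing).

```python
def teacher_forced(step, real):
    """Predict each frame from the previous REAL frame (training-time regime)."""
    return [step(prev) for prev in real[:-1]]


def free_running(step, first, n):
    """Predict each frame from the previous PREDICTED frame (test-time regime)."""
    out, prev = [], first
    for _ in range(n):
        prev = step(prev)
        out.append(prev)
    return out


# Toy "real" sequence (assumption): x[t+1] = 0.9 * x[t], starting at 1.0.
real = [1.0]
for _ in range(10):
    real.append(0.9 * real[-1])


def model(x):
    """Hypothetical, slightly mis-estimated one-step predictor."""
    return 0.95 * x


targets = real[1:]
tf = teacher_forced(model, real)
fr = free_running(model, real[0], len(targets))

# Absolute per-frame error under each regime.
tf_err = [abs(p - t) for p, t in zip(tf, targets)]
fr_err = [abs(p - t) for p, t in zip(fr, targets)]

for t, (a, b) in enumerate(zip(tf_err, fr_err), start=1):
    print(f"frame {t:2d}  teacher-forced err={a:.4f}  free-running err={b:.4f}")
```

In this toy setting the two regimes agree on the first frame, after which the free-running error accumulates because the model is fed inputs it never saw during teacher-forced training. Training on both real and generated sequences, as the abstract proposes, is aimed at reducing that gap.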

FROM LEFT TO RIGHT: TF (BASELINE), SS, TF-GAN, SS-GAN

CMOS TEST

“The shelters are full , and the rents keep going up .”

“And the best thing about this app is that it's free .”

“Constantly bumping heads while on the job , the sparring coworkers can't seem to find common ground .”

“David was selected to live in this tiny home by staff at the Harry Tompson Center .”

INTELLIGIBILITY TEST

“Failed zero point zero zero percent two zero zero two zero zero zero zero Internal , Exchange , Protocols , Exch fifty , BVT Protocols , Exch fifty , BVT_log , xml Error !”

“windiff backslash backslash saeedn backslash exptsp backslash transmt backslash src backslash smtpsink backslash msgfilter backslash backslash saeedn backslash exptsp backslash transmt backslash src backslash smtpsink backslash IMFPostCat Thanks - Saeed Hey Matt , Any progress on this ?”

“DUB - OWA - zero one JPN - OWA - zero one RED - OWA - zero one RED - OWA - zero two SIN - OWA - zero one SIN - OWA - zero two SYD - OWA - zero one SYD - OWA - zero two Corporate MSG Servers Server Name Exchange Version OS Version Able to Upgrade to WS two thousand and three ?”