Mimic 3 is a neural text-to-speech (TTS) engine that can run locally, even on low-end hardware like the Raspberry Pi 4. It supports over 25 languages and offers more than 100 pre-trained voices. Mimic 3 uses VITS, a “Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech”.
Source: LXer – Mimic 3 – neural Text to Speech (TTS) engine