Text to Speech and other Translation Types

Text-to-speech, speech-to-text, or translation of other formats such as images or videos could be a valuable feature to add to LibreTranslate.

Does anyone have suggestions for doing this? I think the easiest way would be to use a text-to-speech Python library that supports multiple languages.
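As a rough sketch of what that could look like, here is one way to chain a translation step into a TTS step. This assumes the third-party gTTS package as the speech library (just one option, and it needs network access). The names `speak_translation` and `gtts_synthesize` are made up for illustration and are not part of LibreTranslate or Argos Translate. The translation step is passed in as a plain function so the sketch doesn't assume any particular translation API.

```python
import io
from typing import Callable

# Hypothetical type aliases for this sketch; not part of any existing project.
Translator = Callable[[str, str, str], str]  # (text, source_lang, target_lang) -> text
Synthesizer = Callable[[str, str], bytes]    # (text, lang) -> audio bytes


def gtts_synthesize(text: str, lang: str) -> bytes:
    """Return MP3 bytes for `text` using gTTS (assumed installed; needs network)."""
    # Imported lazily so the rest of the sketch works without gTTS installed.
    from gtts import gTTS

    buf = io.BytesIO()
    gTTS(text=text, lang=lang).write_to_fp(buf)
    return buf.getvalue()


def speak_translation(
    text: str,
    source: str,
    target: str,
    translate: Translator,
    synthesize: Synthesizer = gtts_synthesize,
) -> bytes:
    """Translate `text`, then synthesize the result in the target language."""
    translated = translate(text, source, target)
    return synthesize(translated, target)
```

Keeping the synthesizer as a swappable function means a different TTS library could be dropped in later without touching the translation side. One thing to watch for: the language codes used by the translation step and by the TTS library may not match exactly, so a small mapping between them might be needed.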

Argos Translate may support TTS in the future, but I don’t have any immediate plans. I will probably only support non-text formats if CTranslate2 supports them or if it’s possible to use a single neural net for both text and speech.