Text to speech
Hello everyone.
Text to speech... what are your favorite engines/GUIs/voices/settings?
I am currently working on a project for which I will need text to speech. I have been testing Gespeaker with the following settings:
Language: english_rp
Voice: male
Pitch: 27
Volume: 89
Speed: 170
Delay: 6
It is tolerable for a short period: the speech is clear, but the sound has a lot of "hiss" and "roboticness". It is not something you want to listen to for long. Since the presentation I am trying to build will be around 10 minutes, I was wondering whether something can be done to get a "better voice". I first looked at MBROLA, but that is not free software (the voices are binary only). I also heard of Festival, but apparently you can't use it through a GUI, which makes it hard to change settings and run small tests on a single piece of text.
I don't know of any other free text-to-speech software. I would be willing to use a website that provides a good service, as long as it is free (and has no word limit, since it is a big presentation).
Or if you know how to improve Gespeaker, please let me know :)
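For quick tests without the GUI, the settings listed above could probably be passed straight to the espeak command line. Below is a minimal Python sketch that only builds the command. The mapping is an assumption, not something taken from Gespeaker itself: Pitch to `-p`, Volume to `-a`, Speed to `-s`, Delay to `-g`, and the `english_rp` voice to the `en-rp` voice file. The helper name `build_espeak_command` and the file name `slide1.wav` are made up for illustration.

```python
import shlex


def build_espeak_command(text, voice="en-rp", pitch=27, volume=89,
                         speed=170, word_gap=6, wav_path=None):
    """Build an espeak argument list from Gespeaker-style settings.

    Assumed mapping (not confirmed against Gespeaker's source):
    Pitch -> -p, Volume -> -a, Speed -> -s, Delay -> -g.
    """
    cmd = [
        "espeak",
        "-v", voice,          # voice file name (assumed equivalent of english_rp)
        "-p", str(pitch),     # pitch
        "-a", str(volume),    # amplitude (volume)
        "-s", str(speed),     # speed in words per minute
        "-g", str(word_gap),  # pause between words
    ]
    if wav_path is not None:
        cmd += ["-w", wav_path]  # write a WAV file instead of playing aloud
    cmd.append(text)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a terminal.
    print(shlex.join(build_espeak_command("Hello everyone.", wav_path="slide1.wav")))
```

Changing one number and running it again makes it easy to compare small variations on the same piece of text. To actually speak or save the audio, the list can be handed to `subprocess.run`.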
I'm currently working on a free software replacement for the non-free mbrola.
The hardest part of building a speech synthesis system is actually the creation of a voice library. I decided to use human speech recordings instead of formant synthesis. For me it began in 2011, when I was looking for singing synthesizer software. I found many nonfree programs such as Myriad Virtual Singer, OGI Flinger, Vocaloid and UTAU. As I was unable to find a free replacement, I decided to write one. In the meantime I found out that some plugins for UTAU are free software, but I would still have to replace the nonfree GUI, which is also tied to Windows. One existing GPLv3 UTAU plugin is v.Connect-STAND [1], which is based on WORLD [2]. v.Connect-STAND has a more natural sound [3] than eCantorix [4], but it is limited to the Japanese language. I was able to compile it, but I do not know how to use it.
My free program will be based on WORLD, and it will allow speech/singing synthesis by Collaborative Creation. The algorithms used in WORLD are described in [5]. I chose a design that makes it possible to be multilingual.
[1] http://hal-the-cat.music.coocan.jp/ritsu_e.html
[2] http://ml.cs.yamanashi.ac.jp/world/english/
[3] https://www.youtube.com/watch?v=to28rvoNYfY
[4] https://github.com/divVerent/ecantorix/wiki/Songs
[5] http://iwk.mdw.ac.at/lit_db_iwk/download.php?id=18114
I have not used such programs in a long time, but I remember that the following pages helped me improve the voice quality:
http://www.muflone.com/gespeaker/english/mbrola.html
www.voxforge.org/home
I'm happy with espeak as shipped in Trisquel, and use it for hours at a time, with Orca screen reader. I find the default British voice more comfortable than the included US-English. Attached, you'll find a shell script that downloads and installs a supposedly improved US English voice called Klatt4. You can edit parameters in the voice file to your liking.
Attachment | Size |
---|---|
install-espeak-voice.sh | 673 bytes |
Can you attach the shell script in a .gz file? I get a 403 (Forbidden) when I try to download shell scripts from Trisquel's web site.
Here's the requested script as a .gz file.
Attachment | Size |
---|---|
install-espeak-voice.sh.gz | 374 bytes |
First, thanks to everyone who has commented.
From what I understand so far, Festival is compatible with a wider range of voices than Gespeaker, or eSpeak in general, so I have decided to go with it. As far as I can tell, there are both free and non-free voices made to work with Festival, from MBROLA, CMU, Festvox, etc.
The Festvox voices are free and available in the repositories. I like Ked; it sounds nice enough. However, I am having trouble getting it to work on my computer, so I decided to use the online service http://www.festvox.org/voicedemos.html, which uses the same engine and voice I was trying to use locally, to produce my text-to-speech audio. Since this presentation will be made public, I don't see any problem with using an online service :)
So, thanks everyone, and I hope text-to-speech will improve soon in the GNU world. Speech recognition software is already well under development (Simon, for example, even if I am not 100% sure of the licenses used), so I think speech synthesis should keep up with it :)
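For anyone who does get Festival running locally, a rough sketch of a non-interactive run is below. It assumes Festival's `text2wave` script is installed and that the Ked voice is exposed as `voice_ked_diphone`; both are assumptions about the local setup, and the helper name and file names are hypothetical. The code only builds the command and does not run it.

```python
def build_text2wave_command(text_file, wav_file, voice="ked_diphone"):
    """Build a text2wave argument list that selects a Festival voice.

    Assumes the voice is available as the Scheme function (voice_<name>),
    e.g. (voice_ked_diphone) for Ked; adjust to whatever is installed.
    """
    return [
        "text2wave",
        text_file,                   # plain-text input file
        "-o", wav_file,              # output WAV file
        "-eval", f"(voice_{voice})", # expression evaluated before synthesis
    ]


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    print(" ".join(build_text2wave_command("presentation.txt", "presentation.wav")))
```

Keeping the text in a file and regenerating the WAV after each small edit gives roughly the "change and retest" loop that a GUI would otherwise provide.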
Where is the comment by user Angila14?
The main forum listing shows that user as having commented, but I don't see the comment :X
It was a spam message. It was reported and then removed by David.
Thanks for the clarification.
When that happens, maybe a message could appear to let people know that the comment was removed because it was spam? Or something like that, to make it less confusing.
I was really confused by not finding the new comment :P
You could file a bug; it is confusing. Just don't hold your breath waiting for a patch, though.