  #1  
Old June 19th, 2018, 11:59 PM
Timon
Seer Deluxe
 
Join Date: Mar 2017
Location: Tustin, CA
Posts: 430
Software Caching Speech routines using Amazon Polly **Released**

I've been creating speech files manually with Amazon Polly, but it's been a pain, and I knew I needed something better.

Thanks to some great initial work by DeLicious and his Polly.py program, I'm now releasing a fully caching version called PollyC. It's available for both Python2 and Python3, so use whichever version of Python you normally use. For RaspberryPi users, Python2 is loaded by default, so it's easiest to go with PollyC.py.

PollyC will cache all speech requests so it only has to go to Amazon's Polly servers when the speech isn't in the local cache.
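To illustrate the caching idea, here is a minimal Python 3 sketch, not the actual PollyC source. It assumes the cache key is a hash of the voice, format, and text; the names `cache_path`, `speak_to_file`, and the `synthesize` callable are hypothetical stand-ins for whatever PollyC really uses.

```python
import hashlib
import os
import shutil


def cache_path(cache_dir, text, voice="Joanna", fmt="mp3"):
    """Build a cache file name from everything that changes the audio."""
    key = "|".join((voice, fmt, text)).encode("utf-8")
    return os.path.join(cache_dir, hashlib.sha256(key).hexdigest() + "." + fmt)


def speak_to_file(out_file, text, cache_dir, synthesize, voice="Joanna", fmt="mp3"):
    """Copy cached audio to out_file, calling synthesize() only on a miss.

    synthesize(text, voice, fmt) is a hypothetical callable that returns
    the audio as bytes (for example, a wrapper around Amazon Polly).
    Returns True on a cache hit and False on a cache miss.
    """
    os.makedirs(cache_dir, exist_ok=True)  # create the cache directory if missing
    cached = cache_path(cache_dir, text, voice, fmt)
    hit = os.path.exists(cached)
    if not hit:
        audio = synthesize(text, voice, fmt)
        with open(cached, "wb") as fh:
            fh.write(audio)
    shutil.copyfile(cached, out_file)
    return hit
```

Hashing the voice and format together with the text means the same phrase spoken by a different voice gets its own cache entry.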

It will also take advantage of the SSML markup language so you can create much better TTS than you can with plain text.

It will work with HS speaker clients and should work with Spuds AirplaySpeak, although that has not yet been tested.

I'll be releasing this as open source, with the only requirement being to keep the credits for DeLicious and me in the files.

Since it's written in Python, it should be portable to both Windows and Linux.

If there are any feature requests or comments, please leave them below.

Code:
--------------------------------- Usage Information ---------------------------------
PollyC
This module is used to call the Amazon Polly system to convert an incoming string
to an audio file.
PollyC must be located in the HomeSeer directory
Calling sequence
  ./scripts/PollyC.py3 -o "output_file" -t "the text to speak" -c "./pollycache/" -k "key_ID" -a "Key"

arguments
  -o or --ofile           Output file name
  -t or --text            Text to speak
  -v or --voiceid         The Polly voice to use (default = Joanna) 
                              see https://docs.aws.amazon.com/polly/latest/dg/voicelist.html
  -f or --format          Output format (default = mp3)
  -c or --cache           cache directory, full, relative path or none
                              If no cache is specified then caching is disabled
  -k or --keyid           Amazon AWS Access Key ID, mandatory
  -a or --accesskey       Amazon AWS Access Key, mandatory
  -r or --region          Amazon Region (defaults to us-west-1) 
                              see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html
PollyC will auto switch to ssml if it detects the string "<speak>" in the text to be converted.

For instructions on how to encode ssml speech see https://docs.aws.amazon.com/polly/latest/dg/supported-ssml.html


Future additions to PollyC
Select voice on call
There is no provision in Amazon Polly to set the voice in the text string. However 
PollyC has the ability to do this: If you specify at the beginning of the text 
string, either plain text or ssml text, <voice-id="voice name"> then that voice will be 
used and the tag deleted from the string. This tag MUST be at the beginning of the string.
Example:  '<voice-id="Matthew">This is a test'
          '<voice-id="Matthew"><speak>This is a test</speak>'

Don't cache this call.
Currently PollyC will cache only if a caching directory is specified. This addition 
will allow each call to be cached or not.
Usage: '<no-cache/>This is a test'
       '<no-cache/><speak>This is a test</speak>'

This program is free to use and distribute as long as the credits are not removed.
---------------------------------------------------------------------------------------
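For anyone curious how the options above might map onto the Amazon API, here is a rough Python 3 sketch using boto3's Polly client. It is an assumption about the general approach, not the PollyC code itself; `build_request` and `synthesize` are hypothetical names, and the "<speak>" check mirrors the auto-switch described in the usage text.

```python
def build_request(text, voice="Joanna", fmt="mp3"):
    """Build the synthesize_speech arguments, switching to SSML when
    the text contains the string "<speak>"."""
    text_type = "ssml" if "<speak>" in text else "text"
    return {
        "Text": text,
        "TextType": text_type,
        "VoiceId": voice,
        "OutputFormat": fmt,
    }


def synthesize(text, key_id, access_key, region="us-west-1",
               voice="Joanna", fmt="mp3"):
    """Call Amazon Polly and return the audio as bytes (needs network access)."""
    import boto3  # imported here so build_request() works without boto3 installed

    client = boto3.client(
        "polly",
        region_name=region,
        aws_access_key_id=key_id,
        aws_secret_access_key=access_key,
    )
    response = client.synthesize_speech(**build_request(text, voice, fmt))
    return response["AudioStream"].read()
```

Keeping the request building separate from the network call makes the SSML detection easy to check without AWS credentials.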
Here is my speak.sh file. Since aplay can't play MP3s, you need an MP3 player such as mpg123.
Code:
#!/bin/sh
# For Python2 change ./PollyC.py3 to ./PollyC.py
./PollyC.py3 -o "temp.mp3" -t "$1" -c "./pollycache/" -k "your key_id" -a "your key"
mpg123 -q temp.mp3
Here is my speak_to_file.sh file.
Code:
#!/bin/sh
# For Python2 change ./PollyC.py3 to ./PollyC.py
./PollyC.py3 -o "$1" -t "$2" -c "./pollycache/" -k "your key_id" -a "your key"
A little note about why there are two speak routines. The speak.sh script is used only to speak through the local system's audio channel. The speak_to_file.sh script is used whenever a remote system such as HS speaker clients or Spuds AirplaySpeak is used. If both are in use, both will be called, which would mean two requests to Amazon Polly if the caching were not in place.

If the caching directory is specified but does not exist, it will be created.

PollyC.py3 and PollyC.py should also go in the HomeSeer directory.


Release Status
0.9.0 Thu 21 Jun 06:45:51 PDT 2018
Initial Release
0.9.1 Sat 23 Jun 16:49:59 PDT 2018
Updates include creation of cache directory if it does not exist.
Caching will only be performed if a caching directory is specified.
Additional error handling.
-h or --help is now included for easier use.
Attached Files
File Type: zip PollyC_0.9.1.zip (6.6 KB, 2 views)
__________________
John (N6BER), Joyce, Lucas (Golden Ret mix), Bella (Great Pyrenees) and Lance (GP).

HomeSeer Version: HS3 Standard 3.0.0.435
Linux version: hs3 4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux
Mono version: 5.12.0.226
Z-NET V2 - Version 1.0.23
Number of Devices: 257
Number of Events: 320
Available Threads: 399
HSTouch Enabled: True

3.0.0.13: AirplaySpeak
3.0.0.48: EasyTrigger
3.0.1.109: PHLocation
0.0.0.42: Pushover 3P
3.0.1.190: Z-Wave

Last edited by Timon; June 24th, 2018 at 11:25 AM. Reason: Released PollyC
  #2  
Old June 20th, 2018, 08:53 PM
Timon
Circles

Well, PollyC.py3 is working.

Currently you have to manually create the ~/HomeSeer/pollycache directory and pass its path on the PollyC command line. I'll get that fixed in the second beta.

PollyC.py3 requires Python3 and, of course, the boto3 module. I'll try to get both the Python2 version, PollyC.py, and PollyC.py3 out tomorrow.

This has been fun; my voice responses sound so much better than those created by the flite TTS.

OMT, this has NOT been tested under Windows Python, and I don't have a Windows system to test it on. If someone wants to try it, that's great, and if it doesn't work I will work with you to try to get it working.

Also, anytime you want to clear the cache all you have to do is clear out all the files in the pollycache directory and the cache will be recreated.
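If you would rather script the cleanup than delete the files by hand, a small Python 3 helper along these lines should do it. This is only a sketch; `clear_cache` is a hypothetical name and not part of PollyC.

```python
import os


def clear_cache(cache_dir):
    """Delete every regular file in cache_dir and return how many were removed.

    Subdirectories are left alone, and a missing directory counts as empty.
    """
    removed = 0
    if not os.path.isdir(cache_dir):
        return removed
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path):
            os.remove(path)
            removed += 1
    return removed
```

For example, `clear_cache("./pollycache/")` run from the HomeSeer directory would empty the cache used in the scripts from the first post.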

Last edited by Timon; June 24th, 2018 at 03:19 AM.
  #3  
Old June 21st, 2018, 10:28 AM
Timon
PollyC.py3 version 0.9.0 has been released. See the first post for more information.

The next release will allow you to specify the voice to use in the string. For now the default voice "Joanna" is used unless you change it on the command line.
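As a rough idea of how the in-string tags could be handled, here is a Python 3 sketch based on the tag syntax shown in the first post (`<voice-id="voice name">` and `<no-cache/>` at the very beginning of the text). It is not the released code; `parse_leading_tags` is a hypothetical name, and accepting the two tags in either order is my own assumption.

```python
import re

_VOICE_TAG = re.compile(r'^<voice-id="([^"]+)">')
_NO_CACHE_TAG = "<no-cache/>"


def parse_leading_tags(text, default_voice="Joanna"):
    """Return (voice, use_cache, remaining_text) after stripping leading tags.

    Only tags at the very beginning of the string are recognized; the same
    tags appearing later in the text are left untouched.
    """
    voice = default_voice
    use_cache = True
    changed = True
    while changed:
        changed = False
        match = _VOICE_TAG.match(text)
        if match:
            voice = match.group(1)
            text = text[match.end():]
            changed = True
        if text.startswith(_NO_CACHE_TAG):
            use_cache = False
            text = text[len(_NO_CACHE_TAG):]
            changed = True
    return voice, use_cache, text
```

The text that comes back has the tags removed, so it can be passed on to Polly as plain text or SSML unchanged.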

Remember you must manually create the cache directory (default name pollycache) in the HomeSeer directory. PollyC.py3 should also go in the HomeSeer directory.

This has been fun to write and fun to be able to finally get good speech from HS3. Enjoy!
  #4  
Old June 23rd, 2018, 07:48 PM
Timon
PollyC version 0.9.1 has been released. See the first post for more information.

Updates include creation of cache directory if it does not exist.
Caching will only be performed if a caching directory is specified.
Additional error handling.
-h or --help is now included for easier use.
  #5  
Old June 24th, 2018, 04:13 AM
Timon
Samples of what Amazon's Polly sounds like

I'm surprised that no one's tried this yet. For those of you that haven't heard the quality of speech that Amazon's Polly produces here are a couple of samples from my test event along with the strings sent to Polly to create them. These were done using the Joanna voice.

The first sample uses a simple text string. The second sample shows what you can do with a string formatted using Speech Synthesis Markup Language (SSML). If you want to check out the different tags, go to the Amazon Polly SSML page.

Hello World! This is a text string.

<speak><prosody volume="+9dB"><emphasis level="moderate">Attention</emphasis></prosody> <break time='300ms'/> Hello World. This is a <say-as interpret-as="spell-out">ssml</say-as> string.</speak>

The following was created using the Amazon Polly web page and the Matthew voice. It could just as well be done using PollyC via HomeSeer.

<speak>
This is my original voice, without any modifications. <amazon:effect vocal-tract-length="+15%"> Now, imagine that I am much bigger. </amazon:effect> <amazon:effect vocal-tract-length="-15%">
Or, perhaps you prefer my voice when I'm very small? </amazon:effect> You can also control the
timbre of my voice by making more minor adjustments. <amazon:effect vocal-tract-length="+10%"> For example, by making me sound just a little bigger. </amazon:effect> <amazon:effect vocal-tract-length="-10%"> Or instead, making me sound only somewhat smaller. </amazon:effect>
</speak>


I think you can hear not only how much better the speech is compared to what flite produces, but also how well you can control how it sounds.
  #6  
Old June 24th, 2018, 08:42 AM
Tillsy
Seer
 
Join Date: May 2018
Location: Australia
Posts: 74
What you've done is amazing!

The Mac lets you pipe Siri's voice to an audio file, so I simply pre-recorded (i.e. cached) what I needed and I play the files when announcements are required. They're prefixed with a Star Trek TNG computer sound (alert tone for warnings, paging tone for notices).
  #7  
Old June 24th, 2018, 09:04 AM
838Joel
Seer
 
Join Date: Jan 2018
Location: Montreal
Posts: 74
I'm impressed, but now, as a newbie, I need to integrate voice into my HS... Since I don't have a speaker on my server (using a VM), I need to run a program on a computer that has a speaker...


Unless I can broadcast this to my Sonos system...

Anyway, I'm still on a big learning curve with so much to learn: integrating devices, then figuring out how to use events, and now adding some speech to my stuff.

I hope to catch up fast and play with all this fantastic stuff.

Good work!
Joel

  #8  
Old June 24th, 2018, 11:24 AM
Timon
Quote:
Originally Posted by Tillsy
What you've done is amazing!

The Mac lets you pipe Siri's voice to an audio file, so I simply pre-recorded (e.g. cached) what I needed and I play them when announcements are required. They're prefixed with a Star Trek TNG computer sound (alert tone for warnings, paging tone for notices).
I was doing somewhat the same, using Amazon's Polly to create static audio for prompts. It was just not worth the effort to keep downloading sounds to my HS system. That's why I had been looking at doing a caching Polly handler.

Quote:
Originally Posted by 838Joel
I'm impressed, but now, as a newbie, I need to integrate voice into my HS... Since I don't have a speaker on my server (using a VM), I need to run a program on a computer that has a speaker...


Unless I can broadcast this to my Sonos system...

Anyway, I'm still on a big learning curve with so much to learn: integrating devices, then figuring out how to use events, and now adding some speech to my stuff.

I hope to catch up fast and play with all this fantastic stuff.

Good work!
Joel

You should be able to use Spuds AirplaySpeak, then use one of several Airplay packages that run on the RaspberryPi. I'm going to be using one which will run on the RaspberryPi Zero W along with a speaker pHAT that's the same size.

Here are some packages you can look at. All of them will do Airplay with the right software loaded.
Pimoroni SpeakerHat
Pimoroni Pirate Radio
Adafruit Speaker Bonnet

There are other protocols that run on the Pi and have code for HS that should also work. Basically, any remote speaker that you can send mp3 files to will work with Caching Polly.

BTW, I would like to again thank DeLicious for finding out how to access Amazon Polly. I've been thinking about making a Caching Polly process for a while, and those few lines of code got me to speed up the project.
  #9  
Old June 24th, 2018, 12:33 PM
838Joel
Thanks for the input Timon, actually I have a raspberry Pi somewhere I can use...
I'll look into it
