Alexa TTS that works well, same solution as for Home Assistant.

    Just as an FYI, I have got this working directly from a Windows 10 machine, without the need for other machines or apps, by installing the Windows feature "Windows Subsystem for Linux" and then rebooting.

    After that I installed the Ubuntu Linux app from the Windows Store; see the link for further info.

    Then, once Linux was set up, all I did was run the following commands (prefix them with sudo if you are not running as root):

    apt-get update
    apt-get install jq
    apt-get install unzip
    wget https://github.com/walthowd/ha-alexa...ive/master.zip
    unzip master.zip


    Next I made the scripts executable (chmod +x on the .sh files).

    Then I created a secrets.yaml file in the same ~/<ExtractedFolder>

    which contains only the following info:

    alexa_email: my@amazonemail.com
    alexa_password: myamazonpassword


    I then followed the setup instructions in the GitHub repo for getting the cookie. A word of advice: I used Firefox, as Chrome etc. would not work. Make sure you comment/uncomment the lines for your correct region. I also replaced the BROWSER line in the script with the following, as I had better success with it:

    BROWSER='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'

    Once all that was up and running, I just ran

    ~/<ExtractedFolder>/alexa_remote_control.sh -a

    which returned all my devices.

    As a test, I then ran

    ~/<ExtractedFolder>/alexa_remote_control.sh -d "Alexa Device Name" -e speak:"This is a test message"

    Once that produced output, I created an AlexaSpeak.bat in a directory on the Windows partition with the following:

    @echo off
    set arg1=%1
    set arg2=%2
    shift
    shift
    %SystemRoot%\sysnative\wsl.exe /home/<wsl user>/<ExtractedFolder>/alexa_remote_control.sh -d %arg1% -e speak:%arg2%


    Make sure you set the script name and path to match your install.

    After that you can run the batch file with your Alexa device name and message. You can use "ALL" instead of a device name, but if, like me, you have more than 6 devices it won't send to all of them, and I cannot get a speaker group to work either. For individual notifications to a single device it works very well so far, but I am still testing.



    This should also work similarly on Linux HS3 installs; you just need a different wrapper script.
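For a Linux HS3 install, the wrapper could be a small shell function in place of AlexaSpeak.bat. This is only a sketch: the name alexa_speak, the ALEXA_SCRIPT variable and the ~/ha-alexa-tts folder are made up for illustration, so point them at wherever you extracted the repo.

```shell
# Hypothetical Linux counterpart of AlexaSpeak.bat.
# ALEXA_SCRIPT is an assumed location; change it to match your install.
ALEXA_SCRIPT="${ALEXA_SCRIPT:-$HOME/ha-alexa-tts/alexa_remote_control.sh}"

alexa_speak() {
    # $1 = Alexa device name, $2 = message to speak
    if [ "$#" -ne 2 ]; then
        echo "usage: alexa_speak <device name> <message>" >&2
        return 2
    fi
    # Quoting keeps device names and messages with spaces as single arguments.
    "$ALEXA_SCRIPT" -d "$1" -e "speak:$2"
}
```

Saved to a file with `alexa_speak "$@"` added as the last line, it would be called the same way as the batch file, e.g. `./alexa_speak.sh "Alexa Device Name" "This is a test message"`.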

    Hope this helps, as this has been my utopia for a long time: I have an Alexa in every room, including my toilets.

    I have now done a number of tests, including reboots, over 2 weeks. I can see that the authentication is working even when the cookie is expired, and it does not prompt for

    #2
    This is awesome, thank you!



      #3
      I just installed this and tested it and can confirm it works. There are a few caveats:

      1. The script assumes some of the files are in a homeassistant directory (like secrets.yaml). You'll need to edit the script with nano and give it the correct path, or create the path it expects and put the secrets.yaml in there.

      2. I had to use Firefox with the cookies.txt extension; it wouldn't work in Chrome.

      3. This is not for the security-minded individual. Your amazon.com account and password are stored in clear text in a file. Pay attention to file permissions at the very least. Also, the cookies.txt file is effectively the same as your account and password, as anyone who has this file can bypass the login of your alexa.amazon.com account (at a minimum).
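On the file-permission point, something like the following would restrict both files to the owning user. It is a sketch under assumptions: ALEXA_DIR and the exact file names are placeholders, and lock_down is just a helper name I picked.

```shell
# Restrict the credential files so only the owning user can read them.
# ALEXA_DIR is a hypothetical location; use the folder holding your files.
ALEXA_DIR="${ALEXA_DIR:-$HOME/ha-alexa-tts}"

lock_down() {
    # chmod 600 = read/write for the owner, nothing for group or others.
    for f in "$@"; do
        if [ -f "$f" ]; then
            chmod 600 "$f"
        fi
    done
}

# Files that do not exist are silently skipped.
lock_down "$ALEXA_DIR/secrets.yaml" "$ALEXA_DIR/cookies.txt"
```

This only stops other local accounts from reading the files; anyone who can log in as the owning user still sees the password and cookie in plain text.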

      I do wish HomeSeer rjh would consider adding direct HS3 support of TTS with Alexa. Seems like it should be possible now with some of the new hooks Amazon has provided, especially since they added device push messages.



        #4
        FWIW, the speak command shows up in the Alexa app history as "Simon Says" commands.



          #5
          I was going to give this a try, but unfortunately the Linux Subsystem feature is not available for Server 2016 (unless you're running a Windows Insider or a Cloud Customer version of Server 2016). It is however an available feature in Server 2019, which was recently released.



            #6
            Originally posted by logman
            I was going to give this a try, but unfortunately the Linux Subsystem feature is not available for Server 2016 (unless you're running a Windows Insider or a Cloud Customer version of Server 2016). It is however an available feature in Server 2019, which was recently released.
            Server 2019 and Windows Server v1809 are still kind of a dumpster fire. I wouldn't recommend spinning one up just yet. Since the soft re-release on Nov 13 there have been more issues reported.

            Windows Server v1803 I think has the Linux subsystem? But you'd need to be able to call this script remotely, which might be possible?

            Edit:

            Correction, it has been available since 1709: https://docs.microsoft.com/en-us/win...tall-on-server



              #7
              I'm running 1607. I think 1709 was exclusive to or part of the Semi-Annual Channel rollout, where they push upgrades to you twice a year rather than the normal 3-year upgrade cycle. (I'd hate to fight that battle twice a year.)

              What we really need is for HS to step up their Alexa API integration. And Zigbee.



                #8
                Originally posted by logman
                I'm running 1607. I think 1709 was exclusive to or part of the Semi-Annual Channel rollout, where they push upgrades to you twice a year rather than the normal 3-year upgrade cycle. (I'd hate to fight that battle twice a year.)

                What we really need is for HS to step up their Alexa API integration. And Zigbee.
                Yes 1709 is Semi-Annual Channel, it is also "core" only, no GUI, because SAC is only available as core. For this reason you can't run HomeSeer on a SAC Windows Server, I've tried. My first attempt at installing HS3 I went straight for a core build as I wanted low resource consumption and less frequent maintenance. The installer threw an error very early on when inputting the license. I probably could have tried copying over the program files or program data directories from another install, but at that point my thought was "if this doesn't work, what else is going to not work?".

                Hence my point that you'd need to call this script remotely.

                HS3 running on Server 2016 -> Runs a local script -> That calls another remote script on a 1709 or later SAC server with the Linux subsystem -> Passes variables to be used in the text string etc.



                  #9
                  I'm reading through the alexa_remote_control.sh at the moment, trying to get an idea of it. It's huge, no doubt, over 900 lines. But it's capable of a lot more than we're interested in, example:

                  Code:
                  usage()
                  {
                      echo "$0 [-d <device>|ALL] -e <pause|play|next|prev|fwd|rwd|shuffle|vol:<0-100>> |"
                      echo "          -b[list|<\"AA:BB:CC:DD:EE:FF\">] | -q | -r <\"station name\"|stationid> |"
                      echo "          -s <trackID|'Artist' 'Album'> | -t <ASIN> | -u <seedID> | -v <queueID> | -w <playlistId> |"
                      echo "          -i | -p | -P | -S | -a | -m <multiroom_device> [device_1 .. device_X] | -lastalexa | -l | -h"
                      echo
                      echo "   -e : run command, additional SEQUENCECMDs:"
                      echo "        weather,traffic,flashbriefing,goodmorning,singasong,tellstory,speak:'<text>',automation:'<routine name>'"
                      echo "   -b : connect/disconnect/list bluetooth device"
                      echo "   -q : query queue"
                      echo "   -r : play tunein radio"
                      echo "   -s : play library track/library album"
                      echo "   -t : play Prime playlist"
                      echo "   -u : play Prime station"
                      echo "   -v : play Prime historical queue"
                      echo "   -w : play library playlist"
                      echo "   -i : list imported library tracks"
                      echo "   -p : list purchased library tracks"
                      echo "   -P : list Prime playlists"
                      echo "   -S : list Prime stations"
                      echo "   -a : list available devices"
                      echo "   -m : delete multiroom and/or create new multiroom containing devices"
                      echo "   -lastalexa : print device that received the last voice command"
                      echo "   -l : logoff"
                      echo "   -h : help"
                  }
                  You can see the script can be used to play radio stations, playlists, create device groups. There's also some stuff in there for performing bluetooth pairing.

                  All we want to do is send a TTS to a particular echo device, so the parts of this that are relevant to our objective are considerably smaller than the whole script.

                  It seems to use CURL to issue the commands. CURL can be called just as easily from a powershell script or you could possibly use the powershell native commands Invoke-WebRequest / Invoke-RestMethod.

                  Regarding the storage of plain text passwords, PowerShell allows for storing passwords and other text as a secure string, then exporting that secure string to a file so that it can be imported again for future use. Secure string uses reversible encryption, but only the principal that performed the encryption can decrypt it. By default (no key supplied) this relies on the Windows Data Protection API, so "principal" here means the same user account on the same machine.

                  What this means is if someone had access to the server and files where you run this script from, and was able to authenticate as the account that runs the script, they could theoretically decrypt your password from the stored object. It's still a significant improvement over a plain text file.

                  You might also be able to use the secure string method to store the cookie.



                  What I'm leading to here is that with a little reading and powershell translation, it should be possible to take the TTS segments of this to produce a script that can be called by HS3 running on a Win10 / Server 2016 platform.



                    #10
                    Below is what I've extracted, believing it to be the components relevant for text-to-speech:

                    Code:
                    # Set Amazon/Alexa regional options, US vs UK.
                    
                    
                    AMAZON='amazon.com'
                    #      'amazon.co.uk'
                    ALEXA='pitangui.amazon.com'
                    #     'layla.amazon.co.uk'
                    LANGUAGE='en-US'
                    #        'en-GB'
                    
                    
                    # Cookie for authentication.
                    
                    
                    COOKIE='YOUR COOKIE DATA HERE'
                    # I have nfi what awk is, below line seems to concatenate and sanitize data
                    CSRF="csrf: $(awk "\$0 ~/.${AMAZON}.*csrf[ \\s\\t]+/ {print \$7}" ${COOKIE})"
                    
                    
                    # Specify which device to execute command against.
                    
                    
                    DEVICESERIALNUMBER=''
                    DEVICETYPE=''
                    MEDIAOWNERCUSTOMERID=''
                    
                    
                    # Specify CURL installation location
                    
                    
                    CURL='C:\Program Files\cURL\curl.exe'
                    
                    
                    # Set text-to-speech string from passed in argument.
                    
                    
                    TEXT='YOUR TEXT HERE'
                    TTS=",\\\"textToSpeak\\\":\\\"${TEXT}\\\""
                    
                    
                    # Concatenate the actual command that will be sent as HTTP POST data
                    
                    
                    COMMAND="{\"behaviorId\":\"PREVIEW\",\"sequenceJson\":\"{\\\"@type\\\":\\\"com.amazon.alexa.behaviors.model.Sequence\\\",\\\"startNode\\\":{\\\"@type\\\":\\\"com.amazon.alexa.behaviors.model.OpaquePayloadOperationNode\\\",\\\"type\\\":\\\"Alexa.Speak\\\",\\\"operationPayload\\\":{\\\"deviceType\\\":\\\"${DEVICETYPE}\\\",\\\"deviceSerialNumber\\\":\\\"${DEVICESERIALNUMBER}\\\",\\\"locale\\\":\\\"${LANGUAGE}\\\",\\\"customerId\\\":\\\"${MEDIAOWNERCUSTOMERID}\\\"${TTS}}}}\",\"status\":\"ENABLED\"}"
                    
                    
                    # Execute the actual HTTP request
                    
                    
                    # To skip TLS certificate checks, add --insecure after "${CURL}" (not recommended).
                         "${CURL}" \
                         --compressed \
                         --http1.1 \
                         --silent \
                         --cookie "${COOKIE}" \
                         --user-agent "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" \
                         --header "DNT: 1" \
                         --header "Connection: keep-alive" \
                         --location \
                         --header "Content-Type: application/json; charset=UTF-8" \
                         --header "Referer: https://alexa.${AMAZON}/spa/index.html" \
                         --header "Origin: https://alexa.${AMAZON}" \
                         --header "${CSRF}" \
                         --request POST \
                         --data "${COMMAND}" \
                         "https://${ALEXA}/api/behaviors/preview"
                    Hopefully looking at the above, you can all begin to see how there isn't really a need for the Linux subsystem. Curl binaries are available for windows, it's a matter of collecting and combining some variables, then firing them off via CURL.
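The awkward part of the extract is the COMMAND string: it is JSON whose sequenceJson field is itself a JSON document stored as an escaped string. Building it in two steps is easier to read than the triple backslashes. A rough sketch, where build_tts_command is a name I made up and the text is assumed to contain no double quotes or backslashes:

```shell
build_tts_command() {
    # $1 = device type, $2 = device serial number, $3 = locale,
    # $4 = customer id, $5 = text to speak (no quotes or backslashes)
    # Step 1: the inner sequence document as ordinary JSON.
    inner=$(printf '{"@type":"com.amazon.alexa.behaviors.model.Sequence","startNode":{"@type":"com.amazon.alexa.behaviors.model.OpaquePayloadOperationNode","type":"Alexa.Speak","operationPayload":{"deviceType":"%s","deviceSerialNumber":"%s","locale":"%s","customerId":"%s","textToSpeak":"%s"}}}' "$1" "$2" "$3" "$4" "$5")
    # Step 2: escape backslashes and quotes so it can sit inside a JSON string.
    escaped=$(printf '%s' "$inner" | sed 's/\\/\\\\/g; s/"/\\"/g')
    # Step 3: wrap it in the outer request body.
    printf '{"behaviorId":"PREVIEW","sequenceJson":"%s","status":"ENABLED"}' "$escaped"
}

# Example with placeholder values; real ones come from your own account.
COMMAND=$(build_tts_command "DEVICETYPE" "SERIAL" "en-US" "CUSTOMERID" "This is a test message")
echo "$COMMAND"
```

The result should be the same shape as the hand-escaped COMMAND above and could be passed to curl with --data "$COMMAND".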

                    I've replaced the BROWSER variable with just the static string recommended by fuzzy in the original post.

                    DEVICESERIALNUMBER , I suspect you could probably find this on the bottom/sticker of an Echo device
                    DEVICETYPE , I've seen references to ECHO , KNIGHT and ROOK , suspect these probably relate to Echo, Echo Plus and Echo dot in no particular order. Again you could probably determine them with some research and write them in statically.
                    MEDIAOWNERCUSTOMERID , don't really understand what this is, your user reference I guess? Which would indicate it's not implied by the cookie/session.

                    CSRF , someone care to explain this and 'awk' to me?
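On the CSRF question, as far as I can tell: CSRF stands for cross-site request forgery. The site hands the browser a csrf cookie and expects the same value back in a request header, which is why the curl call sends a "csrf: ..." header. awk is a small text-processing tool that splits each line of a file into whitespace-separated fields; the line in the script finds the csrf row of the cookie file for the Amazon domain and prints its seventh field, the cookie value. The sketch below shows the idea on a throwaway file in cookies.txt layout. The cookie values are invented, and extract_csrf is a simplified stand-in for the original pattern, not a copy of it.

```shell
# Build a throwaway cookie file in Netscape cookies.txt layout:
# domain, subdomain flag, path, secure, expiry, name, value (tab-separated).
# The values are made up for the demonstration.
COOKIE_FILE=$(mktemp)
printf '.amazon.com\tTRUE\t/\tFALSE\t2082787201\tsession-id\tabc-def\n' > "$COOKIE_FILE"
printf '.amazon.com\tTRUE\t/\tFALSE\t2082787201\tcsrf\t1234567890\n' >> "$COOKIE_FILE"

extract_csrf() {
    # $1 = Amazon domain, $2 = cookie file.
    # Print field 7 (the value) of the line whose domain field contains $1
    # and whose cookie name (field 6) is exactly "csrf".
    awk -v dom="$1" 'index($1, dom) && $6 == "csrf" { print $7 }' "$2"
}

CSRF="csrf: $(extract_csrf 'amazon.com' "$COOKIE_FILE")"
echo "$CSRF"    # prints: csrf: 1234567890
rm -f "$COOKIE_FILE"
```

Note that this means COOKIE in the extract has to be the path to a cookies.txt file, not the cookie text itself, since awk opens it as a file.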


                    This isn't a usable script and isn't meant to be. It's just extracting and highlighting that sending the TTS probably isn't very hard once you know a few variables, i.e. your device serial numbers and user ID.

                    I've been going over this at work (employee of the year...) and thought I'd throw it up here. I'm about to head home, someone else might be able to pick it back up before I do and produce something directly usable on Windows.



                      #11
                      Very interesting, I will see how hard it will be to add this to HS3 in some way so you don't have to go through all this configuration. We already have access to the API.


                        #12
                        Originally posted by rjh
                        Very interesting, I will see how hard it will be to add this to HS3 in some way so you don't have to go through all this configuration. We already have access to the API.
                        That would be an awesome add to HS!



                          #13
                          Originally posted by rjh
                          Very interesting, I will see how hard it will be to add this to HS3 in some way so you don't have to go through all this configuration. We already have access to the API.
                          This excites me way more than it really should. This would be the ideal way for me to get whole-house speech from HomeSeer.



                            #14
                            We are seeing if we can create a plugin for this. The open issue is the login, as right now that script uses cookie info to connect. We think we can use a token and just have the user log in with a dialog we will present. Will keep you updated.


                              #15
                              Wow that is awesome to read. I am also unduly excited, thanks
