Speech Recognition, Translation, and Text-to-Speech on iOS
This sample is a simple way to demonstrate machine learning on mobile using managed cloud services. The app performs speech recognition with the Apple Speech API, translates the recognized text with Amazon Translate, and uses Amazon Polly to read the translated text aloud.
Of all the AWS services, Amazon Translate is by far the easiest to integrate into your app, and Amazon Polly is a close second. So, if you have never used AWS before and want to try adding some machine learning to your mobile app, now is the time! Backend and client configuration together take less than 5 minutes.
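To make that concrete, here is a minimal sketch of the two calls chained together: translate a string with Amazon Translate, then have Amazon Polly read the result aloud. It assumes the AWS Mobile SDK for iOS pods (AWSTranslate and AWSPolly) are installed and that the SDK's default service configuration already carries Cognito credentials. The class name TranslateAndSpeak and its method names are hypothetical and are not taken from the sample app.

```swift
import AVFoundation
import AWSPolly
import AWSTranslate

// Hypothetical helper, not part of the sample app. Assumes the AWS SDK's
// default service configuration (Cognito credentials) is already in place.
final class TranslateAndSpeak {
    private let player = AVPlayer()

    /// Translates `text` and plays the translation with the given Polly voice.
    func run(text: String, from source: String, to target: String, voice: AWSPollyVoiceId) {
        // The explicit Optional type keeps this compiling whether or not the
        // SDK version in use imports the initializer as failable.
        let candidate: AWSTranslateTranslateTextRequest? = AWSTranslateTranslateTextRequest()
        guard let request = candidate else { return }
        request.sourceLanguageCode = source
        request.targetLanguageCode = target
        request.text = text

        AWSTranslate.default().translateText(request) { [weak self] response, error in
            guard let translated = response?.translatedText else {
                print("Translate failed: \(String(describing: error))")
                return
            }
            self?.speak(translated, voice: voice)
        }
    }

    /// Asks Polly for a pre-signed MP3 URL and streams it with AVPlayer.
    private func speak(_ text: String, voice: AWSPollyVoiceId) {
        let input = AWSPollySynthesizeSpeechURLBuilderRequest()
        input.text = text
        input.outputFormat = .mp3
        input.voiceId = voice

        AWSPollySynthesizeSpeechURLBuilder.default()
            .getPreSignedURL(input)
            .continueOnSuccessWith { [weak self] (task: AWSTask<NSURL>) -> Any? in
                guard let url = task.result else { return nil }
                // Start playback on the main queue.
                DispatchQueue.main.async {
                    self?.player.replaceCurrentItem(with: AVPlayerItem(url: url as URL))
                    self?.player.play()
                }
                return nil
            }
    }
}
```

A call might look like `speaker.run(text: "Hello", from: "en", to: "es", voice: .penelope)`, where the voice is chosen to match the target language. Keep a strong reference to the instance (for example as a view controller property), since the callbacks capture it weakly.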
There are two easy parts to building this solution:
Part 1. Configure the backend by creating an Amazon Cognito Identity Pool and IAM Role(s), and adding permissions to those roles for accessing Amazon Translate and Amazon Polly directly from a mobile app.
Part 2. Create a mobile app to showcase natural language processing by cloning my sample app from GitHub and configuring it to use the values created in Part 1.
PART 1: Configure Backend (1 minute)
I created a CloudFormation template to automate the creation of a Cognito Identity Pool, IAM Role(s), and permissions. The other services (Translate & Polly) do not require any backend configuration and will be called directly from our mobile app. Note: Creating a CloudFormation Stack to provision the above AWS resources is FREE.
Click on the Launch Stack button
This launches the AWS CloudFormation console with the template passed in, creates a new stack, and automates the creation of a Cognito Identity Pool and the associated unauthenticated and authenticated IAM Roles, along with policies for accessing Amazon Translate and Amazon Polly directly from a mobile app.
Click Next on the Select Template page
On the Options page, leave all the defaults and click Next
On the Review page, check the box to acknowledge that CloudFormation will create IAM resources and click Create.
Wait for the speechtranslator-stack stack to reach a status of CREATE_COMPLETE
With the speechtranslator-stack selected, click on the Outputs tab and you should see three rows. We only need the identityPoolId for now.
Copy the Value for just the identityPoolId, as we'll be pasting this value into the AWSConfiguration.json file in our Xcode project.
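The identityPoolId value normally has the form of a region name, a colon, and a GUID (for example us-east-1:12345678-1234-1234-1234-123456789abc). Since a truncated paste is an easy mistake, a small check like the following can catch it before you build. This is a sketch using Foundation only; the IdentityPoolID type is hypothetical and is not part of the sample app or the AWS SDK.

```swift
import Foundation

/// Region and GUID parts of a Cognito identity pool ID.
/// Hypothetical helper for sanity-checking a pasted value.
struct IdentityPoolID {
    let region: String
    let guid: UUID

    /// Returns nil unless `raw` looks like "<region>:<GUID>".
    init?(_ raw: String) {
        // Tolerate stray whitespace or a trailing newline from copy and paste.
        let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
        let parts = trimmed.split(separator: ":", omittingEmptySubsequences: false)
        guard parts.count == 2,
              !parts[0].isEmpty,
              let guid = UUID(uuidString: String(parts[1])) else {
            return nil
        }
        self.region = String(parts[0])
        self.guid = guid
    }
}

// Example with a made-up pool ID.
if let pool = IdentityPoolID("us-east-1:12345678-1234-1234-1234-123456789abc") {
    print("Pool ID looks valid, region: \(pool.region)")
} else {
    print("Pool ID is malformed; copy it again from the Outputs tab")
}
```

The region part is worth checking too, since it should match the region in which the CloudFormation stack was created.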
PART 2: Mobile Client Setup (3 1/2 minutes)
In this part, we'll clone the repo, install the CocoaPods dependencies, and update the AWSConfiguration.json file with your own identityPoolId generated in PART 1.
Download or clone this project
$ git clone https://github.com/mobilequickie/AmazonSpeechTranslator.git
$ cd AmazonSpeechTranslator
Install CocoaPods and the project dependencies
$ sudo gem install cocoapods
$ pod install --repo-update
Launch project in Xcode
$ open SpeechRec.xcworkspace
Update the AWSConfiguration.json file by pasting in your own identityPoolId from the Outputs tab of the CloudFormation stack that you created in Part 1, step #7.
Build and run the app
Requirements
- CocoaPods 1.5.0+
- iOS 10.2+ / macOS 10.13+
- Xcode 9.0+
License
- See THIRD-PARTY-LICENSES.txt
Acknowledgments
- Pulsator - Used for animating the live listener
- DropDown - Used as a dropdown for selecting the different languages
Author
Dennis Hills (Mobile Quickie) - Initial work