Speech recognition using Salesforce® Mobile SDK and the AT&T API toolkit
By Guest Blogger Sandeep Bhanot, Principal Developer Evangelist at Salesforce.com
Let’s talk about how you can use our new toolkit to easily integrate AT&T APIs, like Speech, into apps built on the Salesforce® platform.
First, let’s set the stage with some background information.
You might already know that Salesforce offers the Salesforce Touch Platform. With it, you can write custom mobile apps for your enterprise while all the backend data and application logic is hosted on the Salesforce cloud platform. You can develop native, web, or hybrid applications on the Touch Platform and rely on the security, reliability, and scale of the Salesforce cloud on the backend. The platform provides a Mobile SDK for native and hybrid iOS and Android development. In addition, you get enterprise-grade backend features like REST APIs, Geolocation, and Identity that let you build enterprise mobile apps quickly and securely. You use the Apex programming language to add your backend business logic on the platform and the Visualforce web framework to develop the HTML5 frontend (for web and hybrid apps). In MVC terms, Visualforce forms the View and Apex forms the Controller.
AT&T has an extensive library of public APIs that developers can use to build enterprise apps and solutions. Developers can now access those APIs natively from the Salesforce platform with the AT&T API Toolkit. The toolkit provides strongly-typed Apex wrappers for RESTful AT&T APIs like speech, SMS, location, payment and more.
Now, I’ll show you a simple mobile application, built using Visualforce and the Salesforce Mobile SDK, that uses the AT&T Speech API via the toolkit to search for customer support issues (Case records) in Salesforce based on a user’s voice input. The full codebase for the application is available on GitHub, but let’s break down its architecture and code here.
The figure below describes the high-level architecture for the application.
The app is built using Visualforce and jQuery Mobile and displays all Cases assigned to the currently logged-in user. I then used the Salesforce Hybrid Mobile SDK to create a hybrid version of the app that can be installed on an iPhone or Android device. When the user taps the voice search button, the app starts capturing the microphone input from the device. The binary recording is then sent to the Apex controller for the page. In the controller, we use the AT&T Toolkit to invoke the AT&T Speech API. AT&T translates the voice input into text and returns the result to the controller. Lastly, we run a SOQL query (SOQL is a SQL-like language for querying Salesforce data) based on the translated text and return any matching Case records to the mobile app, where they are displayed to the user.
Will this Speech-to-text App only work for AT&T subscribers?
The short answer: no. Here’s the longer version. The AT&T Speech API is carrier agnostic. An app does NOT have to run on an AT&T device in order to invoke the API. In that sense, the AT&T Speech API is no different from competing offerings: it can be invoked from any mobile device, regardless of the underlying OS or carrier.
Developing the app
Let’s now review the key components of the app and the step-by-step process of creating it.
Installing the AT&T toolkit
The first step is to install and configure the AT&T API Toolkit in your Force.com Developer Edition environment. A Developer Edition (DE) environment is a free instance of Salesforce that lets developers build and test their applications in a controlled sandbox environment. You can sign up for free here to get your own DE environment.
Once registered, you can install the toolkit in your DE environment by clicking this link. Next, create a free AT&T Developer Program account and configure a couple of settings on the AT&T and Salesforce sides.
Building the Visualforce app
The next step is building the Visualforce page that forms the heart of the application. In addition to the voice search feature, the page provides a simple list-detail view of all Cases that are assigned to the logged-in user. The CaseDemo.page is HTML5 compliant and uses jQuery Mobile to provide the general look-and-feel and navigation for the application. For a more detailed look at building a mobile-friendly, list-detail HTML5 web view using Visualforce and jQuery Mobile, check out my blog series on the Cloud Hunter mobile application.
Building a Hybrid mobile app
Recording audio using PhoneGap/Cordova
A simple call to the startRecord function of the Cordova Media object (line 17) starts recording the voice input. Once the user is done speaking, they press the ‘Stop Recording’ button on the page and the following JS function is invoked.
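As a rough illustration (a sketch, not the app's actual code), the start and stop handlers might be wired up as shown here. `startRecord`, `stopRecord`, and `release` are methods of the Cordova Media object; the `createRecorder` helper, the `voicesearch.wav` file name, and the callback names are assumptions made for this example. The Media constructor is passed in as a parameter so the sketch can also run outside a device.

```javascript
// Hypothetical helper (not from the original app): wraps a Cordova Media
// object so two page buttons can start and stop a voice recording.
// MediaCtor is the Cordova `Media` constructor, injected so the sketch
// can also be exercised with a stand-in object.
function createRecorder(MediaCtor, fileName, onDone, onError) {
  var media = null; // non-null only while a recording is in progress

  return {
    // Wire this to the voice search button.
    start: function () {
      if (media) { return false; } // already recording
      media = new MediaCtor(
        fileName,
        function () {
          // Cordova calls this once the record action has completed.
          media.release(); // free the OS audio resources
          media = null;
          onDone(fileName);
        },
        function (err) {
          media = null;
          onError(err);
        }
      );
      media.startRecord();
      return true;
    },

    // Wire this to the 'Stop Recording' button.
    stop: function () {
      if (!media) { return false; } // nothing to stop
      media.stopRecord();
      return true;
    }
  };
}

// On a device (assumed usage):
//   var recorder = createRecorder(Media, 'voicesearch.wav', onDone, onError);
//   recorder.start();  ...  recorder.stop();
```

Passing the constructor in keeps the recording logic independent of the device; on a phone you would hand it Cordova's global `Media`.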
Speech-to-text using the AT&T toolkit
Finally, let’s review what happens when the user invokes the ‘Search’ button to perform a search for matching Case records in Salesforce.
Next, take a look at that method of the CaseDemoController class.
As you can see, Apex is a strongly typed, object-oriented programming language very similar in syntax to Java or C#. Line 9 shows the first use of the AT&T API toolkit. As mentioned earlier, the toolkit provides wrapper Apex classes for invoking AT&T APIs like Speech, SMS, and more. Developers don’t have to worry about the underlying plumbing of creating and parsing JSON messages, invoking the RESTful AT&T APIs, handling authentication, and so on; the toolkit abstracts all that away.
The AttSpeech class, for example, is the wrapper class for invoking the AT&T Speech API. Lines 12-14 set the various inputs required to invoke the API, not least of which is the binary recording received from the mobile device.
We then call the convert() method of the AttSpeech class (line 16) to invoke the API, and the translated text is returned as an AttSpeechResult object. Finally, we perform a simple SOQL query to find any Case records that match the translated text and return the result to the Visualforce page for display to the user.
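To make the final step more concrete, here is a minimal sketch of how a SOQL query could be assembled from the translated text. In the app this happens in the Apex controller, where you would normally use a bind variable rather than string concatenation; the sketch is in JavaScript only to show the shape of the query. The function names, the choice to match on the Case Subject field, and the LIMIT value are all assumptions for illustration.

```javascript
// Hypothetical helper: escape text for use inside a single-quoted SOQL
// string literal in a LIKE clause. Backslashes, quotes, and the LIKE
// wildcards (% and _) each get a leading backslash.
function escapeSoqlLike(text) {
  return text.replace(/[\\'"%_]/g, '\\$&');
}

// Hypothetical helper: build a query for Cases whose Subject contains
// the text returned by the speech service. Matching on Subject is an
// assumption; the real app may search different fields.
function buildCaseSearchQuery(translatedText) {
  var term = escapeSoqlLike(translatedText.trim());
  return "SELECT Id, CaseNumber, Subject FROM Case " +
         "WHERE Subject LIKE '%" + term + "%' " +
         "LIMIT 20";
}

// Example: buildCaseSearchQuery('printer') yields
//   SELECT Id, CaseNumber, Subject FROM Case WHERE Subject LIKE '%printer%' LIMIT 20
```

Escaping matters here because speech output is user-controlled input: an apostrophe in the translated text would otherwise end the string literal early and break (or alter) the query.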
Hopefully this post has you thinking about building new apps on the Salesforce platform. If so, you’ll want to watch the full training webinar, presented by me and Giri Bhaskara, Principal Technical Architect with the AT&T Developer Program. Check it out here. You can also ping me directly with any questions or comments.