Speech v3 - MSSDK - Speech to Text Custom


by nickyoung68

4 Posts

4 Comments

Joined Sep 23, 2012

Speech v3 - MSSDK - Speech to Text Custom Apr 18, 2013 01:34 PM

Hi,

Has anyone else tried using the new custom S2T functionality of the version 3 SDK? I just downloaded the new .dll file and included it in my project, but I don't see a SpeechToTextCustom method anywhere in the RequestFactory class. Has the SDK been updated to include this new method? If so, where can I find it?

I also find that AT&T's documentation lacks quite a bit of information. It doesn't go into very much detail about what the properties actually mean or do. Does anyone know of better documentation somewhere?

Thanks,
Nick

Replied by bs595r

0 Posts

177 Comments

Joined Jan 12, 2012

'Re: Speech v3 - MSSDK - Speech to Text Custom' Apr 19, 2013 08:49 AM

Hi Nick,

Currently the SpeechToTextCustom method has not been implemented in the MSSDK. This is planned for the next iteration.

We are also working to update the documentation, including tutorials that will help you get up to speed faster.

Apologies for the inconvenience,

~brett

Replied by nickyoung68

'Re: Speech v3 - MSSDK - Speech to Text Custom' Apr 19, 2013 08:59 AM

Hi Brett,

Thanks for your reply. Is there an open source version of the SDK? I think this would make it much easier for community contributions and creating custom call methods to the API.

Thank you,

Nick

Replied by bs595r

'Re: Speech v3 - MSSDK - Speech to Text Custom' Apr 19, 2013 09:02 AM

Hi Nick,

Currently there is not an open source version. Good suggestion though. I'll submit that internally.

Cheers,

~brett

Replied by dattapi

0 Posts

56 Comments

Joined Jan 17, 2013

'Re: Speech v3 - MSSDK - Speech to Text Custom' Apr 19, 2013 11:33 AM

Some additional thoughts on the documentation portion of this thread. Remember that the AT&T Developer API is, at its core, a publicly exposed set of RESTful web services. The SDKs are primarily platform-specific wrappers around these platform-neutral web services. Therefore, you can always find fairly detailed information on any API in the base documentation (not necessarily specific to any particular platform or SDK). For Speech v3, it's here: http://developer.att.com/developer/basicTemplate.jsp?passedItemId=13100102&api=Speech&version=3&method=&provider=.

Since this discussion is about the MS SDK, which is built on top of the .NET Framework, you can use an IL disassembler (whichever one you prefer) to look into the disassembled source code. This is an open window into the existing SDK code. Furthermore, the RequestFactory class, which I'm just holding up as an example for this particular case, is a public class. Therefore, if you ever find yourself not wanting to wait for the next SDK release that promises a particular feature, you can implement your own subclass and/or .NET extension methods (http://msdn.microsoft.com/en-us/library/vstudio/bb383977.aspx).

-David
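
Because the SDKs wrap plain REST calls, one way to avoid waiting for SDK support is to build the HTTP request yourself. What follows is a minimal sketch in Python (not MSSDK code), assuming the endpoint URL and header values quoted later in this thread. The function name and its parameters are hypothetical, and the request is only constructed here, not sent.

```python
import urllib.request

# Endpoint as quoted later in this thread; treat it as an assumption and
# verify it against the current Speech API documentation.
CUSTOM_SPEECH_URL = "https://api.att.com/speech/v3/speechToTextCustom"


def build_custom_speech_request(access_token: str, body: bytes, boundary: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the custom speech endpoint."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
        # Header values below mirror the request dump posted in this thread.
        "Content-Type": f"multipart/x-srgs-audio; boundary={boundary}",
        "X-SpeechContext": "GrammarList",
        "Content-Language": "en-us",
    }
    return urllib.request.Request(CUSTOM_SPEECH_URL, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_custom_speech_request("YOUR_TOKEN", b"...multipart body...", "example-boundary")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

The same idea carries over to a .NET subclass or extension method: the wrapper only needs to set these headers and post the body to the endpoint.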

Replied by epaulo

2 Posts

6 Comments

Joined May 28, 2013

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 01:32 PM

Hi David,

I too have tried to use speechToTextCustom (Speech v3), using a simple HTTPS POST via AJAX, and I get an HTTP 502 error. My request headers seem OK. They look something like this:

CONNECT api.att.com:443 HTTP/1.1
Host	api.att.com
User-Agent	Appcelerator Titanium/0.0.0 (iPhone Simulator/6.1; iPhone OS; en_US;)
Connection	keep-alive
Proxy-Connection	keep-alive
Accept: application/json
Content-Type: multipart/x-srgs-audio;  boundary=----12345568790
X-SpeechContext: GrammarList
Content-Language: en-us
Authorization: Bearer cea92d36ce3bed321378826b24ed5b5a
Content-Length: 132446
Accept-Encoding: gzip

------12345568790
Content-Disposition: form-data; name="x-grammar"
Content-Type: application/srgs+xml

<grammar root="top" xml:lang="en-US">
  <rule id="app">
    <one-of>
      <item><ruleref uri="#main" /></item>
    </one-of>
  </rule>
  <rule id="main">
    <one-of>
      <item>hello world</item>
      <item>hello</item>
    </one-of>
  </rule>
</grammar>

------12345568790
Content-Disposition: form-data; name="x-voice"; filename="myfile.wav"
Content-Type: audio/x-wav

UklGRiR4AQBXQVZFZm10IBAAAAABAAEARKwAAIhYAQACABAAZGF0YQB4AQCjAJEAiwCHAIAA
aABIADcAIgACAOb/yv+2/7v/uP+U/3f/ef+G/5b/nv+Z/5r/rP/H/+3/FgA7AGQAhgCRAJcA
owCqALgAzgDJALgAtwCsAJgAnACsAKwAnAB9AGQAXABJADIAIwAIAPX/+//q/8f/y//X/9L/
5/8PABwAGgAsAFEAbgBtAFoAUABaAHUAhgCPAKQArwCqALIAtgCeAIkAbAA9ACUAGgD5/97/
z/+0/6X/pv+c/4z/ff9x/2//Z/9X/1z/aP9i/17/af91/4H/mP++/+L/9f8AAAYACAAcADUA
NAAuAC0AHAAGAP7/6v/M/8T/vv+R/1r/SP8+/x7/C/8K/wD/+/4F/xb/Lf8+/0L/Uf9i/2z/
iP+i/6z/v//J/8H/z//g/+L/+f8KAPP/7f/+//f/3f/O/8f/vP+r/6z/tf+i/5L/mP+I/3n/

... etc the rest of audio file ...

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAA

------12345568790

Any ideas?

Thanks,

Eduardo
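
For comparison with the dump above, here is a minimal Python sketch of how a two-part body of this shape can be assembled. In MIME multipart (RFC 2046), each delimiter line is `--` followed by the boundary, lines end with CRLF, and the final delimiter carries an extra trailing `--`. The dump as posted does not show that trailing `--`, which may simply be how it was pasted. The part names and content types are copied from the request above; the function name is hypothetical, and whether the service needs exactly this layout is an assumption.

```python
CRLF = b"\r\n"


def build_srgs_audio_body(boundary: str, grammar_xml: str, audio: bytes,
                          filename: str = "myfile.wav") -> bytes:
    """Assemble a multipart body: an SRGS grammar part, then a raw audio part."""
    delimiter = b"--" + boundary.encode("ascii")
    grammar_part = CRLF.join([
        delimiter,
        b'Content-Disposition: form-data; name="x-grammar"',
        b"Content-Type: application/srgs+xml",
        b"",  # blank line separates part headers from part content
        grammar_xml.encode("utf-8"),
    ])
    audio_part = CRLF.join([
        delimiter,
        b'Content-Disposition: form-data; name="x-voice"; filename="'
        + filename.encode("ascii") + b'"',
        b"Content-Type: audio/x-wav",
        b"",
        audio,  # raw bytes, inserted unchanged
    ])
    closing = delimiter + b"--"  # final delimiter has a trailing "--"
    return CRLF.join([grammar_part, audio_part, closing, b""])
```

The Content-Length header should then be the length of the returned bytes, and the boundary in the Content-Type header must match the one used here.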

Replied by dattapi

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 02:06 PM

Hello Eduardo,

When I look at your request, I don't see the actual endpoint for custom speech being called - https://api.att.com/speech/v3/speechToTextCustom

If that's not the issue, please try a simple call to speechToText with a good audio clip to see if you get the same response from that.

If that doesn't work, please raise a ticket. Attach the files you are using and your app name.

Thank you,

-David

Replied by epaulo

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 02:22 PM

Hi David,

Thanks -- I'll check whether the Titanium httpClient has problems posting over HTTPS. I was assuming the request line showing the endpoint was hidden from the network sniffer because of HTTPS. When I switch to plain HTTP I do see the endpoint (but get a 503 response). Note the request line:

POST /speech/v3/speechToTextCustom HTTP/1.1
Host: api.att.com
User-Agent: Appcelerator Titanium/0.0.0 (iPhone Simulator/6.1; iPhone OS; en_US;)
X-Requested-With: XMLHttpRequest
Accept: application/json
Content-Type: multipart/x-srgs-audio;  boundary=----12345568790
X-SpeechContext: GrammarList
Content-Language: en-us
Connection: close
Authorization: Bearer cea92d36ce3bed321378826b24ed5b5a
Content-Length: 132446
Accept-Encoding: gzip

...etc...

Anyway, yes I've raised a ticket. I've been using the plain SpeechToText method with no issue using the Titanium module (it's just that now I want multi language support and grammarFile support).

Thank you!

Replied by bs595r

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 02:24 PM

Hi Eduardo,

Another thing to check is whether you are truly handling the audio as binary data. It could be a character-encoding issue, so make sure you read and write the audio data without any character encoding applied (i.e. as raw binary). I believe you would want the readAsBinaryString and sendAsBinary methods.

Cheers,

--brett
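
To illustrate the kind of character-encoding damage described above, here is a small Python sketch. It is not Titanium code; the helper names are hypothetical, and the Latin-1 to UTF-8 round trip is just one assumed example of how bytes get altered when audio passes through a text layer.

```python
import os
import tempfile


def read_audio_bytes(path: str) -> bytes:
    """Read a file in binary mode so no character decoding is applied."""
    with open(path, "rb") as f:
        return f.read()


def mangle_via_text(raw: bytes) -> bytes:
    """Show the failure mode: treat bytes as Latin-1 text, then re-encode as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    # ASCII tag followed by four bytes >= 0x80, as commonly found in audio samples.
    sample = b"RIFF" + bytes([0xE6, 0xFF, 0xCA, 0xFF])
    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
        tmp.write(sample)
    try:
        assert read_audio_bytes(tmp.name) == sample  # binary read is lossless
        print(len(sample), len(mangle_via_text(sample)))  # mangled copy is longer
    finally:
        os.remove(tmp.name)
```

Every byte at or above 0x80 becomes two bytes after the text round trip, so both the payload and its Content-Length end up wrong.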

Replied by epaulo

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 02:54 PM

Thanks for the tip Brett!

Also, re: the service endpoint: after turning on the SSL Proxying feature in the network sniffer (Charles), the endpoint does indeed show up.

Will look into encoding. I believe I do encode the binary data to base64 prior to sending. I have also tried adding a request header "Content-Transfer-Encoding: binary" -- to no avail...

Will look further into encoding possibilities.

Thanks!

Eduardo

Replied by bs595r

'Re: Speech v3 - MSSDK - Speech to Text Custom' Jun 12, 2013 03:03 PM

Hi Eduardo,

For the Speech payload (audio data), it is expecting binary...not base64. So, just send the data "as-is".

Cheers,

--brett
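
As a quick check on the base64 point, the first characters of the audio payload posted earlier in this thread decode to a RIFF/WAVE header, which is consistent with the file having been base64-encoded before sending. A minimal Python sketch of that check follows; the helper names are hypothetical, and only the 16-character prefix is taken from the thread.

```python
import base64
import struct

# First 16 characters of the audio payload as posted earlier in this thread.
POSTED_PREFIX = "UklGRiR4AQBXQVZF"


def looks_like_base64_wav(text_prefix: str) -> bool:
    """Return True if the text decodes as base64 to a RIFF/WAVE header."""
    try:
        header = base64.b64decode(text_prefix, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def riff_chunk_size(header: bytes) -> int:
    """Read the little-endian 32-bit size field that follows the 'RIFF' tag."""
    return struct.unpack("<I", header[4:8])[0]


if __name__ == "__main__":
    print(looks_like_base64_wav(POSTED_PREFIX))
    print(riff_chunk_size(base64.b64decode(POSTED_PREFIX)))
```

A raw WAV file starts with the ASCII bytes `RIFF`, whereas its base64 form starts with `UklG`, so a body whose audio part begins with `UklG` is being sent as base64 text. Base64 also makes the payload roughly a third larger than the raw bytes.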