
Monday, September 16, 2013

Automatic Speech Recognition with Google

I recently found out that Google Chrome has a speech recognition component.



(Yes, this video is already two years old!) The component relies on a Web service to do the actual speech recognition. The Web service isn't officially available, but it can be accessed without Chrome to perform some quick speech recognition. Here's a script that does the work for you:

#!/bin/sh
#
# Adapted from: http://sunilkumarkopparapu.wordpress.com/2012/09/11/using-google-asr/
#
# Usage: ./google_asr.sh NAME   (reads NAME.wav, writes NAME.txt; NAME defaults to 1)
LANGUAGE="en-US"
NAME="${1:-1}"
echo "1 SoX Sound Exchange -- Convert ${NAME}.wav to ${NAME}.flac at 16000 Hz"
sox "${NAME}.wav" "${NAME}.flac" rate 16k
echo "2 Submit to Google Voice Recognition ${NAME}.flac"
wget -q -U "Mozilla/5.0" --no-proxy --post-file "${NAME}.flac" --header="Content-Type: audio/x-flac; rate=16000" -O "${NAME}.ret" "http://www.google.com/speech-api/v1/recognize?lang=${LANGUAGE}&client=chromium"
echo "3 SED Extract recognized text"
sed -e 's/.*utterance":"//' -e 's/","confidence.*//' "${NAME}.ret" > "${NAME}.txt"
echo "4 Remove Temporary Files"
# Uncomment the next two lines to delete the intermediate files.
#rm "${NAME}.flac"
#rm "${NAME}.ret"
echo "5 Show Text "
cat "${NAME}.txt"
The script is mostly based on existing work by Sunil. I had to modify a few things to get it working:

  • Fix the fancy quotation marks
  • Fix the wget line (it wasn't outputting to the correct file)
Other than that, full credit should go to Sunil.  

It only seems to work well for short files (no more than a couple of seconds of audio). For longer files, the response from the Web service appears to be empty.
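Because the script's sed step passes a line through unchanged when it contains no utterance, a response without a transcript is easy to mistake for one. Here is a minimal sketch of a stricter extraction, assuming the response is a single line of JSON containing "utterance":"..." as the script's patterns imply. The extract_utterance helper and both sample responses are hypothetical illustrations, not captured output from the service.

```shell
#!/bin/sh
# Hypothetical helper: read a response on stdin and print the utterance found
# on each line (the last one, if a line holds several). Lines without an
# "utterance" field produce no output, because of sed -n plus the p flag.
extract_utterance() {
    sed -n 's/.*"utterance":"\([^"]*\)".*/\1/p'
}

# Hypothetical sample responses, shaped after the patterns the script matches.
good='{"hypotheses":[{"utterance":"hello world","confidence":0.9}]}'
none='{"hypotheses":[]}'

text=$(printf '%s\n' "$good" | extract_utterance)
missing=$(printf '%s\n' "$none" | extract_utterance)

echo "recognized: ${text:-<nothing>}"
echo "recognized: ${missing:-<nothing>}"
```

With this variant, an empty result means no utterance was found, so the calling script can test the variable with -z and report the failure instead of printing the raw response.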

You can find more details about the API in this gist by alotaiba.
