(Yes, this video is already 2 years old!) This component uses a Web service to do the actual speech recognition. The Web service isn't officially available, but it can be accessed without Chrome to perform some quick speech recognition. Here's a script that does the work for you:
#!/bin/sh
#
# Adapted from: http://sunilkumarkopparapu.wordpress.com/2012/09/11/using-google-asr/
#
# Base name of the audio file, without extension, taken from the first
# argument. Defaults to "1" (i.e. 1.wav) when no argument is given.
NAME="${1:-1}"
LANGUAGE="en-US"
echo "1 SoX Sound Exchange -- Convert ${NAME}.wav to ${NAME}.flac at 16000 Hz"
sox "${NAME}.wav" "${NAME}.flac" rate 16k
echo "2 Submit to Google Voice Recognition ${NAME}.flac"
wget -q -U "Mozilla/5.0" --no-proxy --post-file "${NAME}.flac" --header="Content-Type: audio/x-flac; rate=16000" -O "${NAME}.ret" "http://www.google.com/speech-api/v1/recognize?lang=${LANGUAGE}&client=chromium"
echo "3 SED Extract recognized text"
# Strip everything before the utterance text and everything after it.
sed -e 's/.*utterance":"//' -e 's/","confidence.*//' "${NAME}.ret" > "${NAME}.txt"
echo "4 Remove Temporary Files"
#rm "${NAME}.flac"
#rm "${NAME}.ret"
echo "5 Show Text "
cat "${NAME}.txt"
I made two changes to the original script:
- Fixed the fancy quotation marks
- Fixed the wget line (it wasn't outputting to the correct file)
Other than that, full credit should go to Sunil.
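To see what the sed step in the script does, here is a minimal sketch that runs the same two substitutions on a sample response. The function name `extract_utterance` and the sample JSON are made up for illustration. The sample only assumes what the sed patterns themselves imply: a single-line response in which the recognized text sits between `utterance":"` and `","confidence`.

```shell
#!/bin/sh
# Hypothetical helper: pull the recognized text out of a response string.
# Assumes a single-line response containing
#   "utterance":"<text>","confidence":...
extract_utterance() {
    printf '%s\n' "$1" | sed -e 's/.*utterance":"//' -e 's/","confidence.*//'
}

# Made-up sample response, for illustration only (not captured from the service).
sample='{"hypotheses":[{"utterance":"hello world","confidence":0.9}]}'
extract_utterance "$sample"   # prints: hello world
```

Because the first pattern is greedy, a response containing several utterances would yield only the last one.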
It only seems to work well for short files (no more than a couple of seconds of audio). For larger files, the response from the Web service appears to be empty.
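Since an empty response is the failure mode I ran into, it may be worth checking for it before running sed. The following is only a sketch: `has_response` is a hypothetical helper, and a temporary file stands in for the `.ret` file that wget writes.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the response file exists and is non-empty.
has_response() {
    [ -s "$1" ]
}

# Demonstration: an empty temporary file stands in for the wget output.
ret_file=$(mktemp)
if has_response "$ret_file"; then
    echo "response received"
else
    echo "empty response: the audio clip may be too long" >&2
fi
rm -f "$ret_file"
```

In the script itself, the same check could go right after the wget line, using the `.ret` file name instead of the temporary file.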
You can find more details about the API in this gist by alotaiba.