Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
Chatterbox - STT / TTS / TTA project. Part 2
Moderators: Flash, JohnMurga
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 17:36    Post_subject:  Chatterbox - STT / TTS / TTA project. Part 2
Sub_title: Make Puppy listen.
 

Part 2 of my "chatterbox" project is aimed at getting a Puppy to monitor the microphone and listen to my response and create a text file which accurately reflects what I have spoken.

Part 1 and chatterbox project description here:
http://www.murga-linux.com/puppy/viewtopic.php?t=89258

Part 3 (Making puppy act on decoded commands) here:
http://murga-linux.com/puppy/viewtopic.php?t=89260

Progress so far is based on the "pocketsphinx_continuous" .pet offered by technosaurus here:
http://www.murga-linux.com/puppy/viewtopic.php?t=88095&start=27

greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 17:36    Post_subject:  

reserved
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 17:36    Post_subject:  

reserved
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 17:37    Post_subject:  

reserved
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 17:37    Post_subject:  

reserved
H4LF82


Joined: 02 Oct 2012
Posts: 124

PostPosted: Fri 11 Oct 2013, 21:04    Post_subject:  

This is going to be the tough bit. Getting your computer to understand even one single word is tough enough, never mind the entire English language.

For these purposes, however, even the ability to discern between 2 words like "yes" and "no" would be extremely helpful.

I've been told to try sphinx, verbio, ubuntu, and all manner of other things, but I have not had any luck with any of it. I can tell you this much: I know when I am beaten, and there is a 6 month chunk of my life that I won't ever get back, spent banging my head against this very wall (hindsight being 20/20, I'd avoid Sphinx if I were you). So by all means, please have a go at it...

I look forward to seeing what comes of it! :D

_________________
"The wise know their weakness too well to assume infallibility; and he who knows most, knows best how little he knows." - Thomas Jefferson
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Fri 11 Oct 2013, 22:05    Post_subject:  

Sorry to hear of your experience with Sphinx - I was getting my hopes up this morning when technosaurus posted a pet of pocketsphinx
http://www.murga-linux.com/puppy/viewtopic.php?t=88095&start=27

Never mind, I will give it a go. As you say, teaching it the difference between "yes" and "no" is all that is required to make a start. To be honest I've read a few posts that suggest it is a mistake to use short words with STT - better to try to teach it the difference between "affirmative" and "not bloody likely" - apparently the longer phrases are easier to decode reliably.
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Sat 12 Oct 2013, 04:32    Post_subject:  

Well, I've been playing with pocketsphinx and it seems to be pretty good at decoding what I'm saying. I can certainly get it to distinguish yes and no with excellent reliability. Surprisingly it also seems very good (sometimes) at assembling entire sentences - although the accuracy does vary if the room has background noise.

I found that the program itself was extremely sensitive to mic volume: I had to turn the capture volume right DOWN to almost nothing, and turn OFF the 20 dB mic boost, which is usually a necessity with other audio programs like mhwaveedit. Quite surprising.

The problem is what to do with the output of the recognition program? I can see the decoded speech in the terminal but how to feed it to a text file in real time??

Technosaurus mentioned the following tutorial:
http://hackaday.com/2010/07/11/adding-speach-recognition-to-your-embedded-platform/
and one of the comments was as follows:

Quote:
I have a robot and I want to use Pocketsphinx so I can talk to the robot thing like…where is this room and it will tell me where it is or move foward and it should move forward. Right now I have install pockectsphinx.07 and sphinxbase and when I run using ubuntu 10.04LTS: pocketsphinx_continuous -lm 1998.lm -dict .dict 1998.dic it say READY then listening the when I say something like Good morning it write back Goodmorning….But how do I go from here…how do I use pocketsphinx to allow me to just talk and have what I just said be recorded and send to my robot to move…PLEASE HELP
To which the author replied:
Quote:
Hello Steve
The way to connect recognizer library output to an action is a standard task every programmer could solve. I suppose you need to learn how to write programs. I’m sure you could find quite some references on the web. If you learn Python for example you can do it in a minute. For futher questions please use CMUSphinx forums
http://cmusphinx.sourceforge.net/wiki/communicate
So - not being a programmer, I'm stuck.

Technosaurus makes the following comment:
Quote:
One way to handle the output from speech recognition is to use /dev/stdout as the output and pipe it through a while-read-case block like:
Code:
   
pocketsphinx_continuous <params> | while read -r LINE; do
  case "$LINE" in
    *) ... ;;  # use a different pattern here for each action (case uses shell glob patterns, not regex)
  esac
done

I will need to scavenge the CMUSphinx forums, learn what all this means, and see if there are any examples that give me some clues on how to fine-tune this for Puppy.
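For anyone following along, here is a minimal sketch of how the while-read-case idea quoted above might be filled in for the yes/no case. The function names classify_line and classify_stream are made up for illustration, and the exact text that pocketsphinx_continuous prints for each utterance is an assumption, so the patterns will probably need adjusting. Each line is padded with spaces so that a pattern only matches a whole word.

```shell
#!/bin/sh
# Classify one line of recognizer output as yes, no, or unknown.
# The patterns are shell glob patterns (what "case" uses), not regular expressions.
classify_line() {
  # Pad with spaces so " no " does not match inside words such as "know".
  case " $1 " in
    *" affirmative "*|*" yes "*) echo "yes" ;;
    *" negative "*|*" no "*)     echo "no" ;;
    *)                           echo "unknown" ;;
  esac
}

# Read recognizer output line by line from stdin and print one answer per line.
classify_stream() {
  while read -r LINE; do
    classify_line "$LINE"
  done
}

# Hypothetical usage (needs a working pocketsphinx install and a microphone):
#   pocketsphinx_continuous 2>/dev/null | classify_stream
```

In a real script the echo commands would be replaced by whatever action each answer should trigger.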
H4LF82


Joined: 02 Oct 2012
Posts: 124

PostPosted: Sat 12 Oct 2013, 13:04    Post_subject: sphinx  

If we can practice on Lucid I'll give it a go...

Gimme a few to get caffeine and I'm on it...

greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Sat 12 Oct 2013, 13:28    Post_subject:  

What version of Lucid are you using? (Could you post the bottom few lines of your /etc/DISTRO_SPECS file?)
H4LF82


Joined: 02 Oct 2012
Posts: 124

PostPosted: Sat 12 Oct 2013, 13:39    Post_subject:  

lucid 5.2.8
Code:

#One or more words that identify this distribution:
DISTRO_NAME='Lucid '
#A three-digit numeric value, version number of this distribution:
DISTRO_VERSION=528
#A two-digit numeric value, minor-version number of this distribution:
DISTRO_MINOR_VERSION=00
#The distro whose binary packages were used to build this distribution:
DISTRO_BINARY_COMPAT='ubuntu'
#Prefix for some filenames: exs: lupusave.2fs, lupu-528.sfs
DISTRO_FILE_PREFIX='lupu'
#The version of the distro whose binary packages were used to build this distro:
DISTRO_COMPAT_VERSION='lucid'
#the kernel pet package used:
DISTRO_KERNEL_PET='linux_kernel-2.6.33.2-tickless_smp_patched-L3.pet'
#16-byte alpha-numeric ID-string appended to vmlinuz, lupu_528.sfs, zl528332.sfs and devx.sfs:
DISTRO_IDSTRING='l528120404231153'
#Puppy default filenames...
#Note, the 'SFS' files below are what the 'init' script in initrd.gz searches for,
#for the partition, path and actual files loaded, see PUPSFS and ZDRV in /etc/rc.d/PUPSTATE
DISTRO_PUPPYSFS='lupu_528.sfs'
DISTRO_ZDRVSFS='zl528332.sfs'

H4LF82


Joined: 02 Oct 2012
Posts: 124

PostPosted: Sat 12 Oct 2013, 13:46    Post_subject:  

If you would confirm that sphinx plays well with Lucid (i.e. no smoking HDDs) then I will give it another try. I may have had an old version last time...
greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Sat 12 Oct 2013, 14:01    Post_subject:  

When the decoded speech is extracted I thought it would be useful to have it placed into a text file called something like chatdump or voicedump or something like that - the stream of text would flow in as the user spoke, and maybe the file would need to be cleared every 3 seconds or so.

When Puppy was ready to assess the user's answer to a question it would go looking at the chatdump and view the last word (or words if appropriate).

If the user was busy chatting to other people in the room this chatter would be discarded after 3 seconds, and then when it came time to answer a Puppy question the user would reach a natural break in their conversation and the chatdump would just contain their answer to that question.

Just tossing ideas into the mix....
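A rough sketch of the chatdump idea, under some assumptions: instead of clearing the file every 3 seconds, each line is stored with a timestamp and anything older than 3 seconds is simply ignored when Puppy looks for the answer, which has a similar effect without a separate cleanup job. The file name /tmp/chatdump and the function names record_speech and last_word are made up for illustration, and "date +%s" is assumed to be available (it is in GNU and busybox date).

```shell
#!/bin/sh
CHATDUMP="${CHATDUMP:-/tmp/chatdump}"   # hypothetical file name
MAX_AGE="${MAX_AGE:-3}"                 # seconds before chatter is ignored

# Append each line from stdin to the dump, prefixed with the current epoch second.
record_speech() {
  while read -r LINE; do
    printf '%s %s\n' "$(date +%s)" "$LINE" >> "$CHATDUMP"
  done
}

# Print the last word spoken within the last MAX_AGE seconds (prints nothing if all is stale).
last_word() {
  now=$(date +%s)
  awk -v now="$now" -v max="$MAX_AGE" \
    '(now - $1) <= max && NF > 1 { word = $NF } END { if (word != "") print word }' "$CHATDUMP"
}

# Hypothetical usage:
#   pocketsphinx_continuous 2>/dev/null | record_speech &
#   ...ask the question, wait for the user to finish speaking...
#   answer=$(last_word)
```

The dump file grows over time with this approach, so a long-running session would still want to truncate it now and then.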
H4LF82


Joined: 02 Oct 2012
Posts: 124

PostPosted: Sat 12 Oct 2013, 14:12    Post_subject:  

We are of one mind here. While I can see the merit of piping the stdout using Python and then continuing in Python, I would prefer to stay in the shallow end with my water wings and just write the stdout to a txt file which can then be bash-ed into submission. I can write a monitor script to check the text file for changes every few seconds and, when they are detected, act on them appropriately.

Arguably not as elegant as a single Python script, but I think it will do the job. Luckily there are many ways to skin a cat programmatically :)
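One possible shape for such a monitor script, as a sketch only: it remembers how many lines of the text file it has already seen and prints just the new ones on each check. The function name new_lines and the file names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# new_lines FILE STATE: print lines added to FILE since the last call.
# STATE is a small file that stores how many lines were seen last time.
new_lines() {
  file=$1; state=$2
  seen=0
  [ -f "$state" ] && seen=$(cat "$state")
  total=$(wc -l < "$file")
  total=$((total + 0))   # strip the padding some wc versions add
  if [ "$total" -gt "$seen" ]; then
    tail -n +"$((seen + 1))" "$file"
  fi
  # If the file was truncated, nothing is printed and the count is reset.
  echo "$total" > "$state"
}

# Hypothetical polling loop, checking every 2 seconds:
#   while sleep 2; do
#     new_lines /tmp/voicedump.txt /tmp/voicedump.pos | while read -r LINE; do
#       echo "heard: $LINE"    # act on the new text here
#     done
#   done
```

Polling like this is crude compared with reading the pipe directly, but it keeps the recognizer and the action script completely separate, which makes each easier to test on its own.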

greengeek

Joined: 20 Jul 2010
Posts: 2509
Location: New Zealand

PostPosted: Sat 12 Oct 2013, 15:53    Post_subject:  

I've just booted into a live session of Lupu 528 and can confirm that pocketsphinx works fine.
(Interestingly I did not need to wind down the mic volume in Lupu the way I did on Upup. It worked fine in Lupu without any changes).

Steps as follows:
1) Download technosaurus's pocketsphinx pet from here:
http://murga-linux.com/puppy/viewtopic.php?t=88095&start=27
2) Install the pet
3) Create a new directory, /usr/share/pocketsphinx (we will be using this later...)
4) Download the other source files referred to by technosaurus from this link:
http://hivelocity.dl.sourceforge.net/project/cmusphinx/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
5) Extract these files in your download directory and copy the "model" directory from the source into the /usr/share/pocketsphinx directory created above (i.e. it becomes /usr/share/pocketsphinx/model)
6) Go into /usr/bin, right-click in the open space and choose "window, terminal here"
7) At the # prompt, type: ./pocketsphinx_continuous
You should see sphinx set itself up and eventually show a "Ready" prompt. At that point you can speak into your microphone and you should see it say "listening..." and then once you stop speaking it will try to decode what you said.
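For those who prefer the terminal, steps 3 to 5 can be sketched roughly as below. This is only an illustration: the function name install_model is made up, and it assumes the archive unpacks to a single top-level directory that contains the "model" directory. Installing the pet (steps 1 and 2) still has to be done separately.

```shell
#!/bin/sh
# install_model TARBALL DEST: extract TARBALL and copy its "model" directory into DEST.
install_model() {
  tarball=$1; dest=$2
  work=$(mktemp -d) || return 1
  tar -xzf "$tarball" -C "$work" || { rm -rf "$work"; return 1; }
  # Assumption: the archive has one top-level directory with "model" inside it.
  src=$(find "$work" -maxdepth 2 -type d -name model | head -n 1)
  if [ -z "$src" ]; then
    echo "no model directory found in $tarball" >&2
    rm -rf "$work"
    return 1
  fi
  mkdir -p "$dest" && cp -r "$src" "$dest/"
  status=$?
  rm -rf "$work"
  return "$status"
}

# Usage on a real system (run as root; the URL is the one given in step 4):
#   wget http://hivelocity.dl.sourceforge.net/project/cmusphinx/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
#   install_model pocketsphinx-0.8.tar.gz /usr/share/pocketsphinx
#   cd /usr/bin && ./pocketsphinx_continuous
```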

Try saying "negative" or "affirmative" - I found the detection of those words to be 100% accurate if I used an American accent (ie: roll the r slightly in affirmative, just like Mr Spock would have.)

(The biggest problem is I keep spelling "shpinx" wrong a million times).