Chatterbox - STT / TTS / TTA project. Part 4

greengeek
Posts: 5789
Joined: Tue 20 Jul 2010, 09:34
Location: Republic of Novo Zelande

#1 Post by greengeek »

My chatterbox project is about getting Puppy to respond to the user's verbal commands. Parts 1, 2 and 3 were focussed on getting the basics functioning correctly. (The test was to get music playing in response to the user saying the word "music".)

Part 4 is about extending this functionality to give the user more choices and preferably wide-ranging control over their PC.

Please see the next couple of posts, where I will add updated versions of my prototypes.
Last edited by greengeek on Fri 01 Nov 2013, 19:34, edited 3 times in total.

#2 Post by greengeek »

UPDATED November 03 2013

voxmain002

(update from version 1 to correct a couple of incorrect codes and add some extra commands)

I am trialling a prototype voice menu which has two vocabulary modes:
"simple" for use in a quiet room where commands can be easily heard
"complex" for use in a noisy room (this requires an extra prefix to be spoken before each command)

The .pet is self-contained and is designed to function correctly on a freshly booted liveCD session, i.e. without even having a savefile available. (I've done it this way so that people can trial it without feeling they are risking their savefiles or needing to go looking for prerequisites...)

How to get started:
Just plug in your headphones, install the .pet, and RESTART X SERVER. Then voxmain002 will autostart in complex mode (on the assumption of a noisy room). You switch between modes by speaking one of the following commands:

Code:

voice-control-activate-complex
voice-control-activate-simple

The .pet is available here:
http://www.mediafire.com/download/e4p9d ... ain002.pet

The simple mode command set is as follows:

Code:

this-is-a-test-of-voice-detection
start-browser
start-word-processor
start-music-random
start-paint
start-gimp
start-genie
start-H-top
start-file-manager
menu-file
menu-file-open
menu-file-new
menu-file-save
menu-file-save-as
menu-file-quit
menu-file-print
menu-edit
menu-edit-undo
menu-edit-find
cursor-left-one
cursor-right-one
cursor-up-one
cursor-down-one
tab-key
enter-key
backspace-key
delete-key
speak-menu
window-alt-tab
speak-window-title
maximise-window
minimise-window
close-window
select-all
copy-to-clipboard
paste-from-clipboard
take-screenshot
volume-max
volume-three-quarters
volume-half
volume-one-quarter
volume-low
volume-mute
click
rightclick
centerclick
doubleclick
clickhold
release
shutdown-computer
reboot-computer
restart-x-server
voice-control-activate-complex
voice-control-activate-simple
voice-control-branch
voice-control-return-to-main

The complex mode command set is the same as above, but with a "voice-control-" prefix, e.g.:

Code:

"voice-control-start-browser"

The .pet also loads a couple of .txt files into /root. These contain a list of the command set as a reference document to help the user get started. Handy to have them open onscreen during testing.
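
The relationship between the two vocabulary modes can be sketched roughly as follows. This is only an illustration in Python, not the code shipped in the .pet: the names `resolve_command` and `next_mode` are hypothetical, only a handful of the commands are included, and the rule that phrases already starting with "voice-control-" (such as the mode switches) are spoken the same way in both modes is an assumption.

```python
PREFIX = "voice-control-"

# A small subset of the simple-mode command set listed above.
SIMPLE_COMMANDS = {
    "start-browser",
    "start-music-random",
    "volume-mute",
    "voice-control-activate-complex",
    "voice-control-activate-simple",
}


def resolve_command(heard, mode):
    """Return the simple-mode command for a heard phrase, or None to ignore it.

    mode is "simple" or "complex". In complex mode the phrase must carry
    the "voice-control-" prefix. Assumption: commands that already start
    with the prefix (the mode switches) are accepted as-is in both modes.
    """
    if heard.startswith(PREFIX) and heard in SIMPLE_COMMANDS:
        return heard  # already-prefixed commands such as the mode switches
    if mode == "complex":
        if not heard.startswith(PREFIX):
            return None  # no prefix: treat it as background noise
        heard = heard[len(PREFIX):]
    return heard if heard in SIMPLE_COMMANDS else None


def next_mode(command, mode):
    """Return the vocabulary mode to use after a recognised command."""
    if command == "voice-control-activate-complex":
        return "complex"
    if command == "voice-control-activate-simple":
        return "simple"
    return mode
```

In this sketch a bare "start-browser" is ignored in complex mode, which is the point of the extra prefix in a noisy room.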
Attachments
voxmain002_pet_HowTo.txt.falsegz.gz
Remove false gz's. It is just a txt file
(1.54 KiB) Downloaded 400 times
voxmain_pet_HowTo.txt.falsegz.gz
Remove false gz's. It is just a txt file
(1.54 KiB) Downloaded 411 times
Last edited by greengeek on Sat 02 Nov 2013, 20:17, edited 8 times in total.

#3 Post by greengeek »

reserved

#4 Post by greengeek »

reserved

#5 Post by greengeek »

One of the most important things to decide is what format the post-boot menu would take.

I feel it would be best for the menu to offer limited choices (so that the user can get straight into their preferred activity) while also opening the door to other choices as well.

eg:
"Choose from Browser, File manager, or Main Puppy menu"

(If they choose Main Puppy Menu, that then opens up a special menu that steps up/down through the normal Puppy menu and reads out each step. This function might need a thread all of its own...)

I was thinking that three choices per menu step would be an easy number for the user to remember.
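
A three-choice menu step of this kind could be sketched as below. This is a rough Python illustration only; `menu_prompt`, `pick` and `STEP` are hypothetical names, and the matching rule (an exact, case-insensitive match on the whole choice) is an assumption.

```python
def menu_prompt(choices):
    """Build the sentence to be read out for one menu step."""
    if len(choices) != 3:
        raise ValueError("each menu step offers exactly three choices")
    first, second, third = choices
    return f"Choose from {first}, {second}, or {third}"


def pick(choices, heard):
    """Return the choice matching the heard phrase (case-insensitive), else None."""
    wanted = heard.strip().lower()
    for choice in choices:
        if choice.lower() == wanted:
            return choice
    return None


# The example menu step from this post.
STEP = ["Browser", "File manager", "Main Puppy menu"]
```

Here `menu_prompt(STEP)` gives the sentence quoted above, and `pick(STEP, "file manager")` returns "File manager".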

#6 Post by greengeek »

I have modified the original chatterbox files to try getting a choice of two options after booting. One is "browser" and the other is "music".

This is enough to test the basic functionality of the "if / then / else" scripts (I'm still learning here...). I'm keen for someone to test this tar.gz on a live session if possible (I've tried it on Lupu 528 via live CD). Instructions are included in the howto in the extracted tar.gz.

download here:
http://www.mediafire.com/download/78za6 ... cab.tar.gz

EDIT : I forgot to mention that this version contains a very reduced vocab list in order to enhance word recognition of "browser" and "music" (as well as a very few other words). This should improve accuracy.

If there are no problems with this, I'm keen to add a third choice of "shutdown" or "puppy menu" and start extending the functionality.
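
The two-option "if / then / else" idea might look something like this. It is a minimal Python sketch and not the script in the tar.gz; `choose_action` and the action labels it returns are hypothetical.

```python
def choose_action(heard):
    """Map one recognised word to an action label using if / elif / else."""
    word = heard.strip().lower()
    if word == "browser":
        return "start-browser"
    elif word == "music":
        return "start-music"
    else:
        # Any other word in the reduced vocabulary: ask the user again.
        return "ask-again"
```

A third choice such as "shutdown" would simply be one more `elif` branch.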

Ted Dog
Posts: 3965
Joined: Wed 14 Sep 2005, 02:35
Location: Heart of Texas

#7 Post by Ted Dog »

I've given this some thought, as a starting point for reordering OneSwitch.

What would happen if we rearranged the radar and keyboard so that the radar and onscreen keyboard fit in the space on the lower left, above the drive icons and below the trashcan? The radar and keyboard would swap sides, so the radar would now hug the leftmost edge.
But instead of opening in radar mode, it would start with the text action sweep menu, like the one that appears after the radar stops following a click. As a start mode, the number of menu choices would be the same as the number of keyboard rows. If it cycles all the way through without any action, it switches into radar mode. In this new location the radar would point up to the desktop icons, removing the LONG wait for the pointer to line up with anything as it sweeps clockwise. The keyboard would be directly to the right, only a quarter turn away, with a half turn to reach the drives and start menu.

Well, that would be the way I'd do it within OneSwitch methods. It makes more sense to start in a menu mode for OneSwitch and drop into RADAR mode if none of the buttons are pressed.
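
The "menu first, then radar" start-up could be sketched like this. It is only a Python illustration of the idea and has nothing to do with the actual OneSwitch code; `scan_menu`, the `click_at` parameter and the row names are hypothetical.

```python
def scan_menu(rows, click_at=None):
    """Highlight each menu row once, in order.

    click_at is the index of the row that was highlighted when the switch
    was pressed, or None if the user never pressed it. Returns
    ("menu", row) for a selection, or ("radar", None) when the whole
    cycle ends without a press.
    """
    for index, row in enumerate(rows):
        if index == click_at:
            return ("menu", row)
    return ("radar", None)


# Hypothetical rows, for illustration only.
ROWS = ["row 1", "row 2", "row 3", "row 4"]
```

With no press during the sweep, `scan_menu(ROWS)` falls through to radar mode; a press on the third highlight, `scan_menu(ROWS, click_at=2)`, selects "row 3".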

#8 Post by Ted Dog »

[ SAY ]
[ HELP ]
[ MSG ]
[ MAIL ]
[ WEB ]

This is a multipurpose starting menu that uses the ambiguous word SAY as a direction in this case, or as an action for OneSwitch users to switch to talking mode.

So OneSwitch users would click when SAY is highlighted, then, if they needed help, click again when HELP was highlighted (asking the computer to SAY help). But if the user SAID "help" to the computer, it would be an action to bring up the help file selection.

It's a little menu setup that works for both. The [SAY] button as the lead option is key to future uses.

#9 Post by greengeek »

Updated post 2 above to offer a .pet of a self-contained voice-activated menu prototype. It has what I call 'simple' and 'complex' modes to accommodate the user either having to cope with background noise, or being in a quiet room and wanting speed of operation via simpler commands.

H4LF82
Posts: 123
Joined: Tue 02 Oct 2012, 04:22

I get nothing...

#10 Post by H4LF82 »

Let me preface this with the fact that I am probably doing something wrong here.

Having said that, I installed as per the instructions, said the magic words... then said them again... and then again... and so on.

I might as well be talking to a box, as it resolutely stands firm in its refusal to "listen" to me say voice-control-do-anything.

I'll keep trying, in the extremely likely event that I'm plugging the mic into the headphone jack, or I don't have the volumes set correctly, or am perhaps... mistaking a misshapen gourd for the mic, or some other such stupidness.

I'll let you know if I get it worked out...

Cheers!
"The wise know their weakness too well to assume infallibility; and he who knows most, knows best how little he knows." - Thomas Jefferson

#11 Post by greengeek »

I have put together a puppy that integrates my new version of the voice control functionality. I call it voxpup. Thread here:
http://murga-linux.com/puppy/viewtopic.php?t=90391

I'm still also focusing on improving and extending my original scripts but just thought a full puppy was a good way of testing the functionality so far and maybe getting a wider audience for testing/feedback.
