Chatterbox - STT / TTS / TTA project. Part 2
Also, I found some words worked really well and others were unreliable (this probably depends on the microphone, the sound card, the user's voice, etc.).
Here is a list of the words I found that work pretty consistently so far:
negative (pronounce the t clearly)
affirmative (pronounce the t clearly and roll the r slightly as Americans do)
yes
no
right
down
north (roll the r slightly as Americans do)
program
clear
again (pronounce "agen" not "agayn")
welcome
beginning
screen
return (roll the r slightly as Americans do)
absolutely
music
internet (pronounced as "innnternet" as Americans would. Roll the r slightly)
one
four (roll the r slightly as Americans do)
six
self
finish
fiction
america
Out house
Avoid start and stop as they are too easily confused.
Last edited by greengeek on Sun 13 Oct 2013, 00:39, edited 1 time in total.
Code: Select all
sh-4.1# ./pocketsphinx_continuous
INFO: cmd_ln.c(691): Parsing command line:
./pocketsphinx_continuous
Current configuration:
[NAME] [DEFLT] [VALUE]
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm
-infile
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-time no no
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-svspec 0-12/13-25/26-38 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-cmninit 56,-3,1 \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 137543 * 20 bytes (2686 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /usr/share/pocketsphinx/model/lm/en_US/cmu07a.dic
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(335): 133436 words read
INFO: dict.c(341): Reading filler dictionary: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
INFO: ngram_model_dmp.c(242): 5001 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288): 436879 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314): 418286 = LM.trigrams read
INFO: ngram_model_dmp.c(339): 37293 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359): 14370 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379): 36094 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407): 854 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463): 5001 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(371): ./pocketsphinx_continuous COMPILED ON: Oct 11 2013, AT: 11:34:56
Warning: Could not find Mic element
FATAL_ERROR: "continuous.c", line 254: Failed to calibrate voice activity detection
I have 2 mics. They work and are recognized... any thoughts?
"The wise know their weakness too well to assume infallibility; and he who knows most, knows best how little he knows." - Thomas Jefferson
When I switch between mics, I get this...
I assume it cannot find the mic? I dunno... I'll keep picking at it. No smoking HDDs though, so it's progress...
Code: Select all
sh-4.1# ./pocketsphinx_continuous
INFO: cmd_ln.c(691): Parsing command line:
./pocketsphinx_continuous
Current configuration:
[NAME] [DEFLT] [VALUE]
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm
-infile
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-time no no
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-svspec 0-12/13-25/26-38 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-cmninit 56,-3,1 \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 137543 * 20 bytes (2686 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /usr/share/pocketsphinx/model/lm/en_US/cmu07a.dic
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(335): 133436 words read
INFO: dict.c(341): Reading filler dictionary: /usr/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=5001, 2=436879, 3=418286
INFO: ngram_model_dmp.c(242): 5001 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288): 436879 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314): 418286 = LM.trigrams read
INFO: ngram_model_dmp.c(339): 37293 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359): 14370 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379): 36094 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407): 854 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463): 5001 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 60 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 13428
INFO: ngram_search_fwdtree.c(338): after: 457 root, 13300 non-root channels, 26 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(371): ./pocketsphinx_continuous COMPILED ON: Oct 11 2013, AT: 11:34:56
Warning: Could not find Mic element
READY....
despite the error message, it DOES seem to be listening!
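The configuration dump above shows `-adcdev` with no value, so the recogniser is using whatever the default capture device is. With two mics it may help to name the device explicitly. The sketch below only builds the command line; it assumes an ALSA build and that `plughw:1,0` is one of the devices reported by `arecord -l` (both are assumptions, not something confirmed in this thread). I can't say for sure what triggers the "Could not find Mic element" warning; it reads like the audio layer looking for a mixer control named "Mic", and the second run shows recognition can still start despite it.

```shell
#!/bin/sh
# Build the pocketsphinx_continuous command line for a chosen capture device.
# ASSUMPTION: ALSA device names such as plughw:1,0 (card 1, device 0);
# list the real ones on your machine with: arecord -l
sphinx_cmd() {
    dev=$1
    if [ -n "$dev" ]; then
        printf 'pocketsphinx_continuous -adcdev %s\n' "$dev"
    else
        # No device given: let pocketsphinx use its default input
        printf 'pocketsphinx_continuous\n'
    fi
}

# Example: print the command for the second sound card
sphinx_cmd plughw:1,0
```

Run the printed command by hand for each mic in turn to see which one calibrates.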
NICE JOB!
Give me a few to play with this and see what I can make of it. Looks like part 2 may be close to done.
I'll be back....
Code: Select all
#!/bin/sh
file="inputtxt"
pocketsphinx_continuous | while read LINE; do
case "$LINE" in
echo "$LINE" >> "$file"
done
I have created a script in the /usr/bin folder and given it the above code to chew on, but I'm getting no joy as yet. I'll figure it out though... might take me a minute to nail down but I'll get it.
If any other code monkey wants to jump in and tell me my syntax error I would not complain... feel free! But this is not so tough and I'll untangle it sooner or later.
This isn't doing it either, but I think I'm getting closer...
Code: Select all
#!/bin/sh
cd /usr/bin
./pocketsphinx_continuous | while read LINE; do
case "$LINE" in
$LINE > test.txt
done
this...
Code: Select all
#!/bin/sh
pocketsphinx_continuous | while read LINE; do
$LINE > test.txt
done
...is a whole lot closer. Still not right, but it does get sphinx running and makes some change to the txt file (every time sphinx registers a word the file's contents change; I can tell because Geany asks me to reload the more recent page), but it is still empty each time... so I still have a syntax error somewhere.
I will keep pecking away at it until I get it right.
H4LF82 wrote:
Code: Select all
#!/bin/sh
cd /usr/bin
./pocketsphinx_continuous | while read LINE; do
case "$LINE" in
$LINE > test.txt
done
Just looking back at this code, is there an "esac" missing? (hope you don't mind my uneducated guesses.....)
'case' must close with 'esac':
Code: Select all
case $SOME in
*) : ;; #if more than one entry, then each should end with double';'
esac
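Besides the missing `esac`, there is a second likely culprit in the earlier attempts: a line such as `$LINE > test.txt` runs the contents of LINE as a command and redirects that command's (empty) output, so the file is truncated on every word but never filled. A minimal sketch of the loop with both points addressed, assuming the goal is simply to append each line to a file (`capture_lines` is an illustrative name, not part of pocketsphinx):

```shell
#!/bin/sh
# Append every line read from stdin to a file.
# capture_lines is an illustrative helper name, not part of pocketsphinx.
capture_lines() {
    out=$1
    while IFS= read -r LINE; do
        case "$LINE" in
            "") ;;                                  # skip empty lines
            *)  printf '%s\n' "$LINE" >> "$out" ;;  # write the text, do not execute it
        esac
    done
}

# Intended use (not run here):
#   pocketsphinx_continuous | capture_lines /root/test.txt
```

Using `>>` rather than `>` keeps earlier lines instead of overwriting the file on each pass.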
I had a look at these resources:
http://unix.stackexchange.com/questions ... nto-a-file
and
http://unix.stackexchange.com/questions ... -to-stdout
and they seem to be trying to do something similar to what we are wanting. I tried following the program > /path/to/file syntax and got the following:
Code: Select all
#!/bin/sh
pocketsphinx_continuous > /root/test.txt &
which DOES seem to create my /root/test.txt file and it does contain the text I want.
Couple of things to note:
1) The "READY" no longer displays in the pocketsphinx terminal (although that doesn't seem to stop it running...) and the rest of the text in the terminal does not display in the text file (which is handy because it is only the decoded text just before the READY prompt that we want anyway...)
2) If I open the test.txt file with Geany I can keep selecting "File, Reload" to see the updated decoded words.
Attachment: pocketsphinx_text_trap.jpg (169.03 KiB)
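A likely explanation for note 1: `>` redirects only standard output, so "READY" and the decoded text land in the file, while the INFO chatter stays on the terminal, which suggests it is written to standard error. The sketch below shows the two streams being split into separate files. `fake_sphinx` is a hypothetical stand-in so the example runs without a microphone, and the file names are illustrative.

```shell
#!/bin/sh
# fake_sphinx is a stand-in for pocketsphinx_continuous (hypothetical helper):
# decoded text goes to stdout, diagnostics go to stderr.
fake_sphinx() {
    echo "INFO: loading model" >&2
    echo "READY...."
    echo "000000000: hello"
}

# Send the two streams to separate files (names are illustrative).
run_split() {
    fake_sphinx > "$1" 2> "$2"
}

# With the real program this would look like:
#   pocketsphinx_continuous > /root/test.txt 2> /root/sphinx.log &
```

Keeping the diagnostics in their own log leaves the terminal quiet and still gives something to read if recognition stops.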
Quote: Just looking back at this code, is there an "esac" missing? (hope you don't mind my uneducated guesses.....)
Quote: 'case' must close with 'esac':
How right you both are. I'm making uneducated guesses here too, you understand...
It seems to me that after seeing how this...
Code: Select all
#!/bin/sh
pocketsphinx_continuous > /root/test.txt &
part.
Code: Select all
READY....
Listening...
Stopped listening, please wait....
I will start working on that next.
It seems like we need to use grep to search the test.txt file for our "spoken command"; something like this....
Code: Select all
!/bin/sh
pocketsphinx_continuous > /root/test.txt &
if grep -q "yes" /root/test.txt; then
peasymp3autoplay /path/to/Music
fi
done
..but of course, that code does not do it. Evidently I am FANTASTIC at writing code that does not work!
Any thoughts?
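A guess at why the grep attempt does nothing: grep runs exactly once, straight after the recogniser is started, before anything has been spoken. The stray `done` (there is no loop to close) and the missing `#` on the first line would also trip the shell. A minimal polling sketch, assuming a grep that supports `-w` (GNU and busybox both do); `wait_for_word` is an illustrative helper name:

```shell
#!/bin/sh
# Poll a file until a whole word appears, or give up after a number of tries.
# wait_for_word is an illustrative helper; returns 0 if found, 1 on timeout.
wait_for_word() {
    word=$1 file=$2 tries=$3
    while [ "$tries" -gt 0 ]; do
        # -q: quiet, -w: match whole words only, -s: no error if file is missing
        if grep -qws -- "$word" "$file"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Intended use (not run here):
#   pocketsphinx_continuous > /root/test.txt &
#   wait_for_word yes /root/test.txt 60 && peasymp3autoplay /path/to/Music
```

Polling once a second is crude but easy to follow; reading the recogniser's output through a pipe avoids the delay altogether.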
Probably hundreds of better ways of doing this, but it does have some effect:
Code: Select all
awk '/^0/ { print $2 }' ~/test.txt
This extracts the spoken command (assuming a single word).
It just finds a line beginning with a '0' eg 000000001:, then prints out the next field.
Contents of test.txt:
Code: Select all
READY....
Listening...
Stopped listening, please wait...
000000000: hello
READY....
Listening...
Stopped listening, please wait...
000000001: bye
READY....
Just outputting to test.txt, there will be several words, as the file gets constantly written to, so it probably needs to be done 'on the fly'.
(that's not genuine output BTW, not had chance to test recognition properly)
The word then needs passing to something like:
Code: Select all
case $wot_u_said in
browser)
defaultbrowser
;;
shutdown)
shutdown
;;
esac
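One way the "on the fly" idea could look: read the recogniser output through a pipe and hand the first word of each result line to the `case` block. This is only a sketch. `dispatch` and `listen_loop` are made-up names, the `echo` lines are placeholders for the real actions, and result lines are assumed to look like `000000001: word`.

```shell
#!/bin/sh
# Map one recognised word to an action. The echo lines are placeholders;
# swap in real commands (e.g. defaultbrowser) once the matching works.
dispatch() {
    case "$1" in
        browser)  echo "would start: defaultbrowser" ;;
        shutdown) echo "would start: shutdown" ;;
        *)        echo "ignored: $1" ;;
    esac
}

# Read recogniser output line by line and act on result lines only.
# Result lines are assumed to look like "000000001: word".
listen_loop() {
    while IFS= read -r line; do
        case "$line" in
            0*:*) set -- ${line#*:}          # split the text after the colon into words
                  [ $# -gt 0 ] && dispatch "$1" ;;
        esac
    done
}

# Intended use (not run here):
#   pocketsphinx_continuous | listen_loop
```

Because nothing is written to disk, there is no file to clear and no question of reading a half-written line.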
I think what I would be wanting to do is this:
1) Identify the most recent instance of "READY" in the file (assuming the file may at times have pages of chatter in it...)
2) "Grab" the text that is found between that "READY" and the colon that precedes it.
3) Strip out any leading and trailing spaces.
4) Use the remaining text as our command keyword.
5) Clear the file.
I've got a bit of research to do...
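Steps 1 to 4 might be sketched with a single awk pass: remember each result line, and only accept it once a READY line follows, so a line that is still being written is never picked up. This assumes result lines of the form `000000003: beginning`; `last_complete_command` is an illustrative name, and step 5 (clearing the file) is left out.

```shell
#!/bin/sh
# Print the most recent utterance that was followed by a READY line.
# Assumes result lines look like "000000003: beginning".
last_complete_command() {
    awk '
        /^[0-9]+:/ {
            sub(/^[0-9]+:[ \t]*/, "")   # drop the utterance number and colon
            sub(/[ \t]+$/, "")          # drop trailing spaces
            pending = $0                # remember it until READY confirms it
            next
        }
        /^READY/ { if (pending != "") { done = pending; pending = "" } }
        END { if (done != "") print done }
    ' "$1"
}
```

Calling `last_complete_command /root/test.txt` prints one line, or nothing if no utterance has been completed yet.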
Thanks Keef, I didn't spot that before I posted. I will have a tinker with your code tonight.
EDIT : Could your awk script be changed to allow it to detect the "READY" string and then grab the data field BEFORE it? (I'm thinking that would ensure we were not trying to grab the data field before it was finished being written)
EDIT2 : Is there a risk involved in trying to have two programs accessing the same file? ie: what if sphinx is trying to write new data to the file while I am trying to use another program to clear the data. Do I need to handle the potential conflict resolution or does the system code handle that somehow by making one program wait politely? (and if so - how does it determine which program gets the priority?)
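On EDIT2: as far as I know the system does not make either program wait; without explicit locking both simply get on with it. One known pitfall is that a writer started with `>` keeps its own position in the file, so if another program empties the file, the next write lands at the old position and leaves a gap of NUL bytes in front. A writer started with `>>` (append mode) always writes at the current end of the file, so clearing it from outside is harmless. A small self-contained demonstration (the function name is made up):

```shell
#!/bin/sh
# Demonstrate that a writer opened in append mode (>>) copes with the file
# being emptied by someone else between two writes.
demo_append_truncate() {
    f=$1
    : > "$f"                              # start with an empty file
    {
        echo "000000001: program"
        : > "$f"                          # "reader" clears the file mid-stream
        echo "000000002: beginning"
    } >> "$f"
    cat "$f"
}

# Intended use with the recogniser (not run here):
#   pocketsphinx_continuous >> /root/test.txt &
#   : > /root/test.txt        # clear once the command has been read
```

There is still a small window where a word written between reading and clearing would be lost, so this reduces the risk rather than removing it.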
I gave your awk code a try and it works well on a live file - I decided to change it to:
Code: Select all
awk '/^0000/ { print $2 }' ~/test.txt
just so there was less chance of it picking up any other "0" that happens to get thrown in there (probably unnecessary but I feel it's more selective to look for the "0000" string)
So now the problem is to clear the file at the right time so that only a single instance is captured (last response only). (Or how to discard everything except the last line captured...?)
All I can do is have a good look at Grep, Awk and Sed and try to figure which is the best way to force discarding of everything except the most recent command word.
All suggestions truly welcome.
Well, I've got some steps that seem to work, but I haven't combined them into an elegant form yet. Here are the steps I have been trying just as single scripts:
This first step gets pocketsphinx_continuous to run and build a chatdump file:
Code: Select all
#!/bin/sh
pocketsphinx_continuous > /root/chatdump.txt &
This creates a raw "chatdump" file containing things like this:
Code: Select all
Listening...
Stopped listening, please wait...
000000001: Out house
READY....
Listening...
Stopped listening, please wait...
000000002: program
READY....
Listening...
Stopped listening, please wait...
000000003: beginning
READY....
Then I use Keef's awk to extract the recognised command word from each valid line as follows:
Code: Select all
#!/bin/bash
awk '/^0000/ { print $2 }' /root/chatdump.txt > /root/chat_extract.txt &
This creates a file called chat_extract.txt containing the following:
Code: Select all
Out house
program
beginning
Then I extract just the final word from this file as follows:
Code: Select all
#!/bin/bash
sed '$!d' /root/chat_extract.txt > /root/chat_command.txt &
Which extracts the final spoken command (in this case "beginning") and lists that single word in a file called chat_command.txt
Of course this is limiting commands to a single word, but that is where I want to start. (Then the TTA protocol/menu can easily reject any meaningless command and only permit a small range of actions that it is programmed for)
At least by picking out the final word in the file I don't need to panic about clearing the chatdump file regularly yet.
Don't laugh folks - at least I feel like I'm making progress. Baby steps!
Last edited by greengeek on Tue 15 Oct 2013, 01:02, edited 1 time in total.
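The extract and last-line steps could probably be folded into one pipeline with no intermediate files. One detail worth knowing: `print $2` keeps only the first word after the colon, so a two-word result such as "Out house" would come through as "Out". The sketch below strips the numeric prefix with sed instead, which keeps the whole phrase; `last_phrase` is an illustrative name and the dump file path is passed in as an argument.

```shell
#!/bin/sh
# Print the text of the last result line in a dump file, keeping every word.
# last_phrase is an illustrative name; the path is passed in as $1.
last_phrase() {
    sed -n 's/^0000[0-9]*:[[:space:]]*//p' "$1" | tail -n 1
}

# Intended use (not run here):
#   last_phrase /root/chatdump.txt > /root/chat_command.txt
```

If single-word commands are all that is wanted, the original `awk` followed by `sed '$!d'` gives the same result for one-word lines.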