Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
Speech Enabled Puppy
gcmartin

Joined: 14 Oct 2005
Posts: 4297
Location: Earth

Posted: Thu 24 May 2012, 18:41    Subject: Speech Enabled Puppy

In desktop technology, two speech methods are used: desktop navigation and office information insertion (i.e. word processing and other packages).

Can we do this in Puppy? How?

Thanks in advance.

_________________
Get ACTIVE Create Circles; Do those good things which benefit people's needs!
We are all related ... It's time to show that we know this!
3 Different Puppy Search Engine or use DogPile
darkcity


Joined: 23 May 2010
Posts: 2455
Location: near here

Posted: Sun 03 Jun 2012, 05:25

Are you wanting something to allow partially sighted people to use Puppy?

There was speakpup

speakpup thread
http://www.murga-linux.com/puppy/viewtopic.php?t=24571

speakpup blog
http://speakpup.blogspot.co.uk/

Maybe Emacspeak is what you're looking for?
http://emacspeak.sourceforge.net/

_________________
helping Wiki for help | IF SendSpace link = "dead" THEN PM me ("up file to http://meownplanet.net/")
disciple

Joined: 20 May 2006
Posts: 6439
Location: Auckland, New Zealand

Posted: Sun 03 Jun 2012, 10:27

I think he's asking about speech recognition, whereas I think speakpup was about the opposite (text to speech).

I believe there are a few options for text to speech, especially whatever you call things like jwmspeech.
But for speech recognition there aren't so many options. There is CMU Sphinx, but there don't seem to be many end-user applications based on it (check out pocketsphinx though). Then there is Simon, which is KDE based. And there's another engine called Julius, but I can't see any end-user applications based on it (although I haven't looked very hard).

_________________
DEATH TO SPREADSHEETS
- - -
Classic Puppy quotes
- - -
Beware the demented serfers!
amigo

Joined: 02 Apr 2007
Posts: 2252

Posted: Sun 03 Jun 2012, 12:19

AFAIK, Simon uses Julius. I have done some work on TTS and STT. Festival is the top of the line for text-to-speech, but there are several lighter alternatives which are nearly as good or just as good. Yes, speech-to-text is much more complicated. Julius seems to do a better job than Sphinx, though. I use Julius for STT and either flite or svox-pico for TTS.
gcmartin

Joined: 14 Oct 2005
Posts: 4297
Location: Earth

Posted: Mon 04 Jun 2012, 16:26

amigo wrote:
AFAIK, Simon uses Julius. I have done some work on TTS and STT. Festival is the top of the line for text-to-speech, but there are several lighter alternatives which are nearly as good or just as good. Yes, speech-to-text is much more complicated. Julius seems to do a better job than Sphinx, though. I use Julius for STT and either flite or svox-pico for TTS.
Thanks everyone. Yes, I am referring to speech-to-text (STT) and also to "speech to control desktop actions". (Yes, I recognize that these, though similar, interact with the system/subsystems differently.)

@amigo, would you do two things for us?
  • Share how you are doing STT in Puppy/Linux.
  • Tell us whether your method is available as a PET for Puppy.
And does anyone know of any other open-source efforts in the area of speech in Linux?

Thanks in advance

ETP


Joined: 19 Oct 2010
Posts: 542
Location: UK

Posted: Tue 20 Aug 2013, 07:20    Subject: Speech Enabled Puppy

gcmartin wrote:
amigo wrote:
AFAIK, Simon uses Julius. I have done some work on TTS and STT. Festival is the top of the line for text-to-speech, but there are several lighter alternatives which are nearly as good or just as good. Yes, speech-to-text is much more complicated. Julius seems to do a better job than Sphinx, though. I use Julius for STT and either flite or svox-pico for TTS.
Thanks everyone. Yes, I am referring to speech-to-text (STT) and also to "speech to control desktop actions". (Yes, I recognize that these, though similar, interact with the system/subsystems differently.)

@amigo, would you do two things for us?
  • Share how you are doing STT in Puppy/Linux.
  • Tell us whether your method is available as a PET for Puppy.
And does anyone know of any other open-source efforts in the area of speech in Linux?

Thanks in advance


You may be interested in the following. It is primitive, but at least it's a start.
www.murga-linux.com/puppy/viewtopic.php?p=719680#719680

_________________
Regards ETP
Accessibility Pups: -- Magoo -- The Pup With No Name -- MouseCam -- Obedient
amigo

Joined: 02 Apr 2007
Posts: 2252

Posted: Tue 20 Aug 2013, 11:38

Sorry, I hadn't noticed new posts in this thread. Text-to-speech is really easy using flite, but the male voices aren't very nice. svox-pico has nicer voices. I don't create PETs myself, but you can use my src2pkg tool to create them.
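
As a rough illustration of how little is needed to drive flite from a program, here is a minimal C sketch that builds a shell command around `flite -t` and runs it with `system()`. The helper names `build_flite_command` and `speak` are made up for this example, and it assumes the `flite` binary is on the PATH. The quoting step matters because the text goes through the shell.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: builds "flite -t '<text>'" in out, escaping any single
 * quote in text as '\'' so the shell sees the text as one literal argument.
 * Returns 0 on success, -1 if out is too small.
 */
int build_flite_command(const char *text, char *out, size_t outsz)
{
    static const char prefix[] = "flite -t '";
    size_t used = sizeof prefix - 1;
    const char *p;

    assert(text != NULL && out != NULL);
    if (outsz < used + 2)            /* prefix + closing quote + NUL */
        return -1;
    memcpy(out, prefix, used);

    for (p = text; *p != '\0'; p++) {
        const char *piece = (*p == '\'') ? "'\\''" : NULL;
        size_t len = piece ? 4 : 1;
        if (used + len + 2 > outsz)  /* keep room for closing quote + NUL */
            return -1;
        if (piece)
            memcpy(out + used, piece, len);
        else
            out[used] = *p;
        used += len;
    }
    out[used++] = '\'';
    out[used] = '\0';
    return 0;
}

/* Speaks text through flite; returns system()'s status, or -1 on overflow. */
int speak(const char *text)
{
    char cmd[512];

    if (build_flite_command(text, cmd, sizeof cmd) != 0)
        return -1;
    return system(cmd);
}
```

Calling `speak("okay")` would then run `flite -t 'okay'`. Swapping in another engine such as svox-pico would mean changing the command prefix and, most likely, adding a playback step.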
[Post edited]
gcmartin

Joined: 14 Oct 2005
Posts: 4297
Location: Earth

Posted: Tue 20 Aug 2013, 19:22    Subject: A major subsystem service is being made available to PUPPY

It looks like the post provided by @ETP is a major contribution toward directing Puppy to do work for us by voice. THIS IS ENORMOUSLY IMPORTANT, as any of us can use this technology to get useful work done.

If you have a moment, review the link he provides in his earlier post.

disciple

Joined: 20 May 2006
Posts: 6439
Location: Auckland, New Zealand

Posted: Wed 21 Aug 2013, 02:56

amigo wrote:
Sorry, I hadn't noticed new posts in this thread. Speech-to-text is really easy using flite, but the male voices aren't very nice. svox-pico has nicer voices. I don't create PETs myself, but you can use my src2pkg tool to create them.

Flite is text-to-speech. How do you do speech to text? You said you use Julius - is it an end-user application, not just a library? I think maybe I just didn't build it when I was looking into this stuff previously because it sounded like the English acoustic models weren't complete enough.

disciple

Joined: 20 May 2006
Posts: 6439
Location: Auckland, New Zealand

Posted: Wed 21 Aug 2013, 03:14

Quote:
And does anyone know of any other open-source efforts in the area of speech in Linux?

I forget the name, but there was a really old project that was used in some of the tablet distros that turned into things like MeeGo. But when I looked, it seemed like all the copies of the website and source had disappeared :(

Also, FWIW:
- Simon now supports three backends: CMU Sphinx, HTK (proprietary) and Julius.
- The Simon developer has recently been working on "a demo of open source speech recognition", including improving the acoustic and language models. See http://grasch.net/blog

amigo

Joined: 02 Apr 2007
Posts: 2252

Posted: Wed 21 Aug 2013, 06:36

Sorry disciple - I must have just gotten out of bed or was needing to get in bed...

For speech-to-text I use Julius, as it does a better job than Sphinx.
I'll have to find my source tree for you because I created a small patch to the sources to quieten the Julius output for easier parsing...

OK, found them. There is a julius program that can be built under the /julius directory. But /julius-simple contains a more concise demo application, which I modified with the following patch:
Code:
--- ./julius-simple.c.00   2009-04-11 11:51:20.000000000 +0000
+++ ./julius-simple.c   2010-11-08 09:00:15.000000000 +0000
@@ -29,6 +29,7 @@
 
 /* include top Julius library header */
 #include <julius/juliuslib.h>
+#define ACCEPTANCE_LEVEL .700
 
 /**
  * Callback to be called when start waiting speech input.
@@ -39,6 +40,7 @@
 {
   if (recog->jconf->input.speech_input == SP_MIC || recog->jconf->input.speech_input == SP_NETAUDIO) {
     fprintf(stderr, "<<< please speak >>>");
+   system("flite 'okay '");
   }
 }
 
@@ -91,6 +93,9 @@
   WORD_ID *seq;
   int seqnum;
   int n;
+  double total;
+  /* double alevel = 0.700; */
+  double alevel = ACCEPTANCE_LEVEL;
   Sentence *s;
   RecogProcess *r;
   HMM_Logical *p;
@@ -134,98 +139,41 @@
 
     /* output results for all the obtained sentences */
     winfo = r->lm->winfo;
-
+   
     for(n = 0; n < r->result.sentnum; n++) { /* for all sentences */
 
       s = &(r->result.sent[n]);
       seq = s->word;
       seqnum = s->word_num;
-
-      /* output word sequence like Julius */
-      printf("sentence%d:", n+1);
-      for(i=0;i<seqnum;i++) printf(" %s", winfo->woutput[seq[i]]);
-      printf("\n");
-      /* LM entry sequence */
-      printf("wseq%d:", n+1);
-      for(i=0;i<seqnum;i++) printf(" %s", winfo->wname[seq[i]]);
-      printf("\n");
-      /* phoneme sequence */
-      printf("phseq%d:", n+1);
-      put_hypo_phoneme(seq, seqnum, winfo);
-      printf("\n");
-      /* confidence scores */
-      printf("cmscore%d:", n+1);
-      for (i=0;i<seqnum; i++) printf(" %5.3f", s->confidence[i]);
-      printf("\n");
-      /* AM and LM scores */
-      printf("score%d: %f", n+1, s->score);
-      if (r->lmtype == LM_PROB) { /* if this process uses N-gram */
-   printf(" (AM: %f  LM: %f)", s->score_am, s->score_lm);
-      }
-      printf("\n");
-      if (r->lmtype == LM_DFA) { /* if this process uses DFA grammar */
-   /* output which grammar the hypothesis belongs to
-      when using multiple grammars */
-   if (multigram_get_all_num(r->lm) > 1) {
-     printf("grammar%d: %d\n", n+1, s->gram_id);
-   }
-      }
       
-      /* output alignment result if exist */
-      for (align = s->align; align; align = align->next) {
-   printf("=== begin forced alignment ===\n");
-   switch(align->unittype) {
-   case PER_WORD:
-     printf("-- word alignment --\n"); break;
-   case PER_PHONEME:
-     printf("-- phoneme alignment --\n"); break;
-   case PER_STATE:
-     printf("-- state alignment --\n"); break;
-   }
-   printf(" id: from  to    n_score    unit\n");
-   printf(" ----------------------------------------\n");
-   for(i=0;i<align->num;i++) {
-     printf("[%4d %4d]  %f  ", align->begin_frame[i], align->end_frame[i], align->avgscore[i]);
-     switch(align->unittype) {
-     case PER_WORD:
-       printf("%s\t[%s]\n", winfo->wname[align->w[i]], winfo->woutput[align->w[i]]);
-       break;
-     case PER_PHONEME:
-       p = align->ph[i];
-       if (p->is_pseudo) {
-         printf("{%s}\n", p->name);
-       } else if (strmatch(p->name, p->body.defined->name)) {
-         printf("%s\n", p->name);
-       } else {
-         printf("%s[%s]\n", p->name, p->body.defined->name);
-       }
-       break;
-     case PER_STATE:
-       p = align->ph[i];
-       if (p->is_pseudo) {
-         printf("{%s}", p->name);
-       } else if (strmatch(p->name, p->body.defined->name)) {
-         printf("%s", p->name);
-       } else {
-         printf("%s[%s]", p->name, p->body.defined->name);
-       }
-       if (r->am->hmminfo->multipath) {
-         if (align->is_iwsp[i]) {
-      printf(" #%d (sp)\n", align->loc[i]);
-         } else {
-      printf(" #%d\n", align->loc[i]);
-         }
-       } else {
-         printf(" #%d\n", align->loc[i]);
-       }
-       break;
+      /* sum the confidence scores; total is reset to zero for each sentence */
+     printf("cmscore%d:", n+1);
+      for (i=0, total=0.0; i<seqnum; i++) {
+        printf(" %5.3f", s->confidence[i]);
+        total = (total + s->confidence[i]);
      }
-   }
-   
-   printf("re-computed AM score: %f\n", align->allscore);
-
-   printf("=== end forced alignment ===\n");
-      }
+     printf("\n");
+    
+     printf("Sentence%d(%5.3f)(%i)(%5.3f)(%5.3f):", n+1, (total-2), (seqnum-2), (total-2) / (seqnum-2), alevel);
+     printf("\n");
+     /* ignore first and last (always perfect) scores and subtract them from seqnum */
+     if ((total-2) / (seqnum-2) > alevel) {
+        /* for(i=0;i<seqnum;i++) printf(" %s", winfo->woutput[seq[i]]); */
+        /* ignore the first and last outputs (<s> and </s>) */
+        for(i=1;i<seqnum-1;i++) {
+           if (s->confidence[i] > alevel) {
+            printf(" %s", winfo->woutput[seq[i]]);
+           } else {
+             /* if the score for this word is lower than alevel, output a missed-word indicator*/
+            printf(" !!??");
+           }
+        }
+     } else {
+      /* if average of all scores is less than alevel, output a warning text */
+      printf("UNRECOGNIZED UTTERANCE");
+      system("flite 'UNRECOGNIZED UTTERANCE'");
+     }
+     printf("\n");
     }
   }
 
@@ -263,7 +211,7 @@
 
   /* by default, all messages will be output to standard out */
   /* to disable output, uncomment below */
-  //jlog_set_output(NULL);
+  jlog_set_output(NULL);
 
   /* output log to a file */
   //FILE *fp; fp = fopen("log.txt", "w"); jlog_set_output(fp);


Those changes are still tentative, but they make the output a lot more concise.
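
For anyone who wants to try the acceptance logic from the patch without building Julius, here is a minimal standalone C sketch. It assumes, as the patch does, that the first and last entries of each sentence are the `<s>` and `</s>` markers, so it averages only the inner words instead of subtracting 2 from the total. The function name `filter_sentence` and the plain arrays are hypothetical stand-ins for the Julius sentence and word-info structures; nothing here calls the Julius library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ACCEPTANCE_LEVEL 0.700

/*
 * Hypothetical helper, not part of Julius: writes the accepted words of one
 * recognized sentence into out (size outsz). words[0] and words[n-1] are
 * assumed to be the <s> and </s> markers and are skipped.
 * Returns 1 if the sentence was accepted, 0 if it was rejected.
 */
int filter_sentence(const char *const words[], const double conf[], int n,
                    double alevel, char *out, size_t outsz)
{
    double total = 0.0;   /* must start at zero for every sentence */
    size_t used = 0;
    int i;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    /* Sum the confidences of the inner words only. */
    for (i = 1; i < n - 1; i++)
        total += conf[i];

    /* Reject empty sentences and those whose average is not above alevel. */
    if (n <= 2 || total / (n - 2) <= alevel) {
        snprintf(out, outsz, "UNRECOGNIZED UTTERANCE");
        return 0;
    }

    for (i = 1; i < n - 1; i++) {
        /* Low-confidence words are replaced by a missed-word indicator. */
        const char *w = (conf[i] > alevel) ? words[i] : "!!??";
        size_t wlen = strlen(w);

        if (used + wlen + 2 > outsz)
            break;        /* no room for " word" plus NUL: stop appending */
        out[used++] = ' ';
        memcpy(out + used, w, wlen);
        used += wlen;
        out[used] = '\0';
    }
    return 1;
}
```

A caller would pass the word strings and confidence scores for one sentence along with a buffer, then either act on the filtered text or, on rejection, speak the warning the way the patch does with flite.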