Posts Tagged ‘Speech’

HowTo: Pause and Resume Speech Recognition with Microsoft engines

In the SpeechTurtle application, I’ve just added speech feedback (the command is spoken aloud) when an available command is executed by clicking its name.

This can also help the user learn the expected English pronunciation when the speech recognition engine doesn’t understand some commands as the user voices them. One can assume that most of what the speech synthesis engine outputs is recognizable by the speech recognition engine.

An issue with this approach, though, is that the synthesized speech can accidentally trigger speech recognition, unless the recognition engine handles this case automatically by ignoring speech that the synthesis engine is generating in parallel.

In fact, this can also be a security issue: a malicious agent could deliver voice commands to your system via an audio or video file or stream they lure you into playing, or via a web page they lure you into visiting (even a non-malicious web page might have been served a malicious ad by an ad network).

So we need some way to pause speech recognition while speaking, to keep recognition from misfiring. In my experience, the speech synthesis and recognition engines from .NET’s System.Speech namespace on recent Windows versions (I tried Windows 10) do have this issue.

In SpeechLib (which SpeechTurtle uses via the SpeechLib NuGet package), I’ve added Pause and Resume methods to the ISpeechRecognition interface. The interface is defined in the SpeechLib.Models project (and its NuGet package) and implemented in the SpeechLib.Recognition and SpeechLib.Recognition.KinectV1 projects (and their NuGet packages).

So, in SpeechTurtle, I can do:

public void SpeakCommand(string command)
{   
  speechRecognition.Pause(); //pause the speech recognizer
  speechSynthesis.Speak(command);   
  speechRecognition.Resume();
}
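One caveat with the snippet above: if Speak throws, Resume is never reached and recognition stays paused. A minimal sketch of a more defensive variant follows. It assumes Speak blocks until the utterance has finished; the SpeechFeedback class and its SpeakWhilePaused method are hypothetical helpers for illustration, not part of SpeechLib.

```csharp
using System;

// Hypothetical helper (not part of SpeechLib): runs a speak action with
// recognition paused, and always resumes recognition afterwards.
public static class SpeechFeedback
{
    public static void SpeakWhilePaused(Action pause, Action resume,
                                        Action<string> speak, string command)
    {
        pause(); // stop listening before we start talking
        try
        {
            speak(command); // assumed to block until speaking has finished
        }
        finally
        {
            resume(); // runs even if speak throws
        }
    }
}
```

In SpeechTurtle this could be called as SpeechFeedback.SpeakWhilePaused(() => speechRecognition.Pause(), () => speechRecognition.Resume(), text => speechSynthesis.Speak(text), command).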

Note the pattern used in SpeechRecognition.cs: Pause retries up to 10 times, since errors are thrown if one tries to stop the engine or set its audio input to none while it is performing recognition.

public void Pause()
{
  //(re)try up to 10 times; since we wait 100 ms after each failure, max wait is 1000 ms = 1 sec
  for (int i = 0; i < 10; i++)
  {
    try
    {
      SetInputToNone();
      return; //exit retry loop if succeeded
    }
    catch //catch and ignore any error saying that recognition is currently running
    {
      Thread.Sleep(100); //retry in 100 ms
    }
  }
}

Update 1:

After more testing, it seems the above approach with the loop and try/catch won’t work if one uses the async versions of the speech recognition methods, since the exceptions are thrown on another thread. In that case one needs to add a global exception handler.

Update 2:

After lots of trial and error, I ended up with this working pattern for Pause and Resume in SpeechLib’s SpeechRecognition.cs. Note that paused is a bool field of that class, defaulting to false, and PAUSE_LOOP_SLEEP is a const int set to 10 (msec):

public void Pause()
{   
  paused = true;
  speechRecognitionEngine.RequestRecognizerUpdate();
}
 
public void Resume()
{   
  paused = false;
}

In the constructor of that SpeechRecognition class I do:

speechRecognitionEngine.RecognizerUpdateReached +=
  (s, e) => { while (paused) Thread.Sleep(PAUSE_LOOP_SLEEP); };

The loop in the RecognizerUpdateReached event handler keeps the speech recognition thread waiting until the paused field changes back to false. That event occurs after the call to RequestRecognizerUpdate in the Pause method (which is made after first setting paused = true there).
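Since paused is written by one thread and read by another, it is worth declaring it volatile if the polling loop is kept. As an alternative, here is a minimal sketch of the same gate idea that swaps the Thread.Sleep polling for a ManualResetEventSlim wait handle. This is a substituted technique, not what SpeechLib does, and the PauseGate class is a hypothetical helper.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of SpeechLib): blocks callers while paused,
// using a wait handle instead of a Thread.Sleep polling loop.
public sealed class PauseGate : IDisposable
{
    // Signaled (set) means "not paused": waiters pass straight through.
    private readonly ManualResetEventSlim open = new ManualResetEventSlim(true);

    public bool IsPaused { get { return !open.IsSet; } }

    public void Pause() { open.Reset(); }

    public void Resume() { open.Set(); }

    // Returns immediately when not paused; otherwise blocks until Resume is called.
    public void WaitWhilePaused() { open.Wait(); }

    public void Dispose() { open.Dispose(); }
}
```

With such a gate, the RecognizerUpdateReached handler would just call gate.WaitWhilePaused(), Pause would call gate.Pause() before RequestRecognizerUpdate(), and Resume would call gate.Resume().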

HowTo: List all known color names and find the name of a given color in WPF

This is my answer at
http://stackoverflow.com/questions/4475391/wpf-silverlight-find-the-name-of-a-color

I modified Thomas Levesque’s answer to populate the Dictionary only when first needed, instead of paying the cost at startup. I’m going to use this in speech recognition-driven turtle graphics, so that the user can say known color names to change the turtle’s pen color.

//Project: SpeechTurtle (http://SpeechTurtle.codeplex.com)
//Filename: ColorUtils.cs
//Version: 20150901

using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Windows.Media;

namespace SpeechTurtle.Utils
{
  /// <summary>
  /// Color-related utility methods
  /// </summary>
  public static class ColorUtils //based on http://stackoverflow.com/questions/4475391/wpf-silverlight-find-the-name-of-a-color
  {
    #region --- Fields ---

    private static Dictionary<string, Color> knownColors; //=null

    #endregion

    #region --- Methods ---

    #region Extension methods

    public static string GetKnownColorName(this Color color)
    {
      return GetKnownColors()
          .Where(kvp => kvp.Value.Equals(color))
          .Select(kvp => kvp.Key)
          .FirstOrDefault();
    }

    public static Color GetKnownColor(this string name)
    {
      Color color;
      return GetKnownColors().TryGetValue(name, out color) ? color : Colors.Black; //if color for name is not found, return black
    }

    #endregion

    public static Dictionary<string, Color> GetKnownColors()
    {
      if (knownColors == null)
      {
        var colorProperties = typeof(Colors).GetProperties(BindingFlags.Static | BindingFlags.Public);
        knownColors = colorProperties.ToDictionary(
          p => p.Name,
          p => (Color)p.GetValue(null, null));
      }
      return knownColors;
    }

    public static string[] GetKnownColorNames()
    {
      return GetKnownColors().Keys.ToArray();
    }

    #endregion
  }
}
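A short usage sketch of the extension methods above may help. It assumes a project that references WPF’s PresentationCore assembly and includes the ColorUtils class as listed; the ColorUtilsDemo class is a hypothetical name used only for illustration.

```csharp
using System;
using System.Windows.Media; // WPF: requires a reference to PresentationCore
using SpeechTurtle.Utils;   // the ColorUtils class listed above

public static class ColorUtilsDemo
{
    public static void Run()
    {
        // Color -> name: "Red" is the only named color with this value
        string name = Colors.Red.GetKnownColorName();

        // No named color has the value (1, 2, 3), so the lookup yields null
        string none = Color.FromRgb(1, 2, 3).GetKnownColorName();

        // Name -> color: keys are the property names of Colors, matched case-sensitively
        Color blue = "Blue".GetKnownColor();

        // A lowercase name is not found, so the fallback (black) is returned
        Color miss = "red".GetKnownColor();

        Console.WriteLine(name);
        Console.WriteLine(none == null);
        Console.WriteLine(blue == Colors.Blue);
        Console.WriteLine(miss == Colors.Black);
    }
}
```

Two things to keep in mind when feeding recognized words into GetKnownColor: the lookup is case-sensitive, and an unknown name silently becomes black. In the other direction, some names share a value (Aqua and Cyan, Fuchsia and Magenta), so GetKnownColorName returns whichever of the two is enumerated first.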

Managed .NET Speech API links

(this is my answer at http://stackoverflow.com/questions/14771474/voice-recognition-in-windows)

I’m looking into adding speech recognition to my fork of the Kinect-based Hotspotizer app (http://github.com/birbilis/hotspotizer)

After some searching, I see you can’t mark up the actionable UI elements with related speech commands in order to simulate user actions on them, as one would expect if speech input were integrated into WPF. I’m thinking of making a XAML markup extension to do that, unless someone can point to pre-existing work on this that I could reuse…

The following links should be useful:

http://www.wpf-tutorial.com/audio-video/speech-recognition-making-wpf-listen/

http://www.c-sharpcorner.com/uploadfile/mahesh/programming-speech-in-wpf-speech-recognition/

http://blogs.msdn.com/b/rlucero/archive/2012/01/17/speech-recognition-exploring-grammar-based-recognition.aspx

https://msdn.microsoft.com/en-us/library/hh855387.aspx (make use of Kinect mic array audio input)

http://kin-educate.blogspot.gr/2012/06/speech-recognition-for-kinect-easy-way.html

https://channel9.msdn.com/Series/KinectQuickstart/Audio-Fundamentals

https://msdn.microsoft.com/en-us/library/hh855359.aspx?f=255&MSPPError=-2147217396#Software_Requirements

https://www.microsoft.com/en-us/download/details.aspx?id=27225

https://www.microsoft.com/en-us/download/details.aspx?id=27226

http://www.redmondpie.com/speech-recognition-in-a-c-wpf-application/

http://www.codeproject.com/Articles/55383/A-WPF-Voice-Commanded-Database-Management-Applicat

http://www.codeproject.com/Articles/483347/Speech-recognition-speech-to-text-text-to-speech-a

http://www.c-sharpcorner.com/uploadfile/nipuntomar/speech-to-text-in-wpf/

http://www.w3.org/TR/speech-grammar/

https://msdn.microsoft.com/en-us/library/hh361625(v=office.14).aspx

https://msdn.microsoft.com/en-us/library/hh323806.aspx

https://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognitionengine.requestrecognizerupdate.aspx

http://blogs.msdn.com/b/rlucero/archive/2012/02/03/speech-recognition-using-multiple-grammars-to-improve-recognition.aspx
