I am using an engine based on TellMe. I have seen examples of grammars where the user can say one of several different things that are treated as the same input. However, all the examples I've seen are for inline grammars, which don't work with the VXML engine I'm using. I want to know how I can change my .grxml file to do this. This is the file:

<?xml version="1.0"?>
<!-- created by Matthew Murdock. Grammars for speech rec menus -->
<grammar xmlns="http://www.w3.org/2001/06/grammar" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/06/grammar      http://www.w3.org/TR/speech-grammar/grammar.xsd" xml:lang="en" version="1.0" mode="voice" scope="dialog" tag-format="semantics/1.0.2006">
   <rule id="keep">
      <one-of>
         <item>exit</item>
         <item>exit the system</item>
         <item>another</item>
         <item>another mailbox</item>
         <item>play</item>
         <item>play back</item>                      
      </one-of>
   </rule>
</grammar>

Instead of having 6 items, I want to have 3 items, each with two possible utterances. Any ideas on how I can do this?

A: 

The answers you want are in the SISR specification, which provides a mechanism for attaching meaning to input paths. Rewriting your example:

<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/06/grammar      http://www.w3.org/TR/speech-grammar/grammar.xsd" xml:lang="en" version="1.0" mode="voice" scope="dialog" tag-format="semantics/1.0-literals">
   <rule id="keep">
      <one-of>
       <item>
        <one-of>
         <item>exit</item>
         <item>exit the system</item>
        </one-of>
        <tag>exit</tag>
       </item>

       <item>
        <one-of>
         <item>another</item>
         <item>another mailbox</item>
        </one-of>
        <tag>another</tag>
       </item>

       <item>
        <one-of>
         <item>play</item>
         <item>play back</item>                      
        </one-of>
        <tag>play</tag>
       </item>
      </one-of>
   </rule>
</grammar>

Several things to know:

  • I chose the literal tag format (notice the tag-format attribute of the grammar element). It could also have been implemented using "semantics/1.0", in which case the contents of the tag would look like: out="exit";
  • TellMe tag-format values may need to be different, but their development guide implies they follow the standards.
  • Once you have it working, don't hesitate to create filler grammars (in SRGS terms, rules). Filler rules are rules without any semantic interpretation (no tag elements) that contain common phrases people add to responses. For example, a trailing rule could be added at the end of your grammar:
      </one-of>
      <item repeat="0-1"><ruleref uri="#trailing"/></item>
   </rule>

   <rule id="trailing">
      <one-of>
         <item>please</item>
         <item>thank you</item>

      </one-of>
   </rule>

</grammar>

This would support more natural types of responses. This may or may not be important depending on your calling base. Filler grammars can be very large, but tend to be highly reusable. You can also add filler at the beginning of input. In rich speech applications, the most significant gain in the tuning process comes from updating the grammar to contain the actual phrases spoken by callers rather than what the developer or VUI designer thought would be spoken.
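To tie the points above together, here is a minimal sketch of a complete grammar that uses the "semantics/1.0" script format with both leading and trailing filler. The `leading` rule, its phrases, and the `root="keep"` attribute are assumptions added for illustration, not something taken from the TellMe or LumenVox documentation, so check what your engine accepts.

```xml
<?xml version="1.0"?>
<!-- Sketch only: script-style tags plus optional filler before and after the command. -->
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en" version="1.0"
         mode="voice" root="keep" tag-format="semantics/1.0">
   <rule id="keep">
      <!-- Optional leading filler (hypothetical rule), e.g. "please exit" -->
      <item repeat="0-1"><ruleref uri="#leading"/></item>
      <one-of>
         <item>
            <one-of>
               <item>exit</item>
               <item>exit the system</item>
            </one-of>
            <tag>out="exit";</tag>
         </item>
         <item>
            <one-of>
               <item>another</item>
               <item>another mailbox</item>
            </one-of>
            <tag>out="another";</tag>
         </item>
         <item>
            <one-of>
               <item>play</item>
               <item>play back</item>
            </one-of>
            <tag>out="play";</tag>
         </item>
      </one-of>
      <!-- Optional trailing filler, e.g. "exit please" -->
      <item repeat="0-1"><ruleref uri="#trailing"/></item>
   </rule>

   <!-- Filler rules carry no tag elements, so they add no meaning of their own. -->
   <rule id="leading">
      <one-of>
         <item>please</item>
         <item>i want to</item>
      </one-of>
   </rule>

   <rule id="trailing">
      <one-of>
         <item>please</item>
         <item>thank you</item>
      </one-of>
   </rule>
</grammar>
```

Because the only tags are inside the command items, an utterance such as "please play back thank you" should still come back as "play", with the filler words ignored.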

Jim Rush
I'm sorry, but your suggestion doesn't work. `<one-of>` cannot be the child of `<one-of>` in TellMe or in my engine. I am using LumenVox for speech rec, if that helps.
mtmurdock
You are correct. The inner <one-of> elements must be wrapped in <item> elements.
Jim Rush
A: 

I figured it out. I changed my grammar to look like this:

<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/06/grammar      http://www.w3.org/TR/speech-grammar/grammar.xsd" xml:lang="en" version="1.0" mode="voice" scope="dialog" tag-format="semantics/1.0-literals">
   <rule id="keep">
      <one-of>
         <item><ruleref uri="#exit"/></item>
         <item><ruleref uri="#play"/></item>
      </one-of>
   </rule>
   <rule id="exit">
      <one-of>
         <item>exit</item>
         <item>exit the system</item>
      </one-of>
      <tag>out.result = "exit"</tag>
   </rule>
   <rule id="play">
      <one-of>
         <item>play</item>
         <item>play back</item>
      </one-of>
      <tag>out.result = "play"</tag>
   </rule>
</grammar>

Then, back in my script, instead of basing my actions on callerInput (the variable specified in the <field> tag), I based them on callerInput$.interpretation, which holds XML containing whatever I assigned to out.result in the <tag> element of the grammar.

I guess it makes sense to base your actions on the "interpretation" and not the caller's literal input.

NOTE: Because we are working with our own VXML engine, we were able to create a method for extracting the interpretation value from the XML.
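For readers on a different platform, here is a rough sketch of what the branching might look like in a VoiceXML 2.0 document. It assumes the interpretation is exposed as an ECMAScript object with a result property, which is not how our custom engine returns it (ours returns XML). The file name menu.grxml, the form ids, and the prompts are hypothetical.

```xml
<?xml version="1.0"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="mainMenu">
    <field name="callerInput">
      <!-- menu.grxml is a hypothetical file name for the external grammar -->
      <grammar src="menu.grxml" type="application/srgs+xml"/>
      <prompt>Say exit or play.</prompt>
      <filled>
        <!-- Assumption: the engine exposes out.result as interpretation.result -->
        <var name="choice" expr="callerInput$.interpretation.result"/>
        <if cond="choice == 'exit'">
          <exit/>
        <elseif cond="choice == 'play'"/>
          <goto next="#playback"/>
        </if>
      </filled>
    </field>
  </form>

  <!-- Placeholder destination for the "play" choice -->
  <form id="playback">
    <block>
      <prompt>Playing back.</prompt>
    </block>
  </form>
</vxml>
```

The point is only that the branch tests the tagged value ("exit", "play") rather than the words the caller actually said, so "exit" and "exit the system" take the same path.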

mtmurdock
A: 

A more compact form:

  <rule id="exit">
    exit <item repeat="0-1">the system</item>
    <tag>out.result = "exit"</tag>
  </rule>
  <rule id="play">
    play <item repeat="0-1">back</item>
    <tag>out.result = "play"</tag>
  </rule>
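For context, here is a sketch of how these compact rules might sit inside a complete grammar file. The `another` rule is filled in by following the same pattern, and the `root="keep"` attribute and the "semantics/1.0" tag-format are assumptions chosen because the tags use script syntax. Your engine may expect different values.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en" version="1.0"
         mode="voice" root="keep" tag-format="semantics/1.0">
  <!-- Top-level rule: exactly one of the three commands -->
  <rule id="keep">
    <one-of>
      <item><ruleref uri="#exit"/></item>
      <item><ruleref uri="#another"/></item>
      <item><ruleref uri="#play"/></item>
    </one-of>
  </rule>
  <!-- Each rule matches a keyword plus an optional suffix (repeat="0-1") -->
  <rule id="exit">
    exit <item repeat="0-1">the system</item>
    <tag>out.result = "exit";</tag>
  </rule>
  <rule id="another">
    another <item repeat="0-1">mailbox</item>
    <tag>out.result = "another";</tag>
  </rule>
  <rule id="play">
    play <item repeat="0-1">back</item>
    <tag>out.result = "play";</tag>
  </rule>
</grammar>
```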
gawi