Software Development Kits

DynaSpeak

DynaSpeak is a small-footprint, high-accuracy, speaker-independent speech recognition engine that scales from embedded to large-scale systems in industrial, consumer, and military products. The DynaSpeak engine can be ported to a variety of processor and operating system configurations, giving flexibility in product design. DynaSpeak supports both finite-state grammars, used in more traditional command-and-control applications, and statistical language models, used in more advanced natural-language dialog applications. Because DynaSpeak has been developed for field-oriented embedded applications, it incorporates SRI-developed patented techniques that increase recognition performance through speaker adaptation, microphone adaptation, end-of-speech detection, distributed speech recognition, and noise robustness.
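To make the difference between the two grammar styles concrete, here is a minimal Python sketch. It does not use the DynaSpeak API. The command set, the function names, and the add-one smoothing are all assumptions chosen for illustration. The finite-state grammar accepts only word sequences that follow a predefined path, while the bigram model assigns some probability to any word sequence.

```python
from collections import Counter

# Finite-state grammar as a transition table: state -> {word: next_state}.
# Hypothetical command set, not taken from any DynaSpeak grammar.
FSG = {
    0: {"turn": 1},
    1: {"on": 2, "off": 2},
    2: {"the": 3},
    3: {"radio": 4, "lights": 4},
}
FINAL_STATES = {4}


def fsg_accepts(words):
    """Return True only if the words follow a path that ends in a final state."""
    state = 0
    for word in words:
        next_state = FSG.get(state, {}).get(word)
        if next_state is None:
            return False  # word not allowed at this point in the grammar
        state = next_state
    return state in FINAL_STATES


def train_bigram(sentences):
    """Count word pairs and their left contexts, with sentence boundary markers."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for left, right in zip(tokens, tokens[1:]):
            bigrams[(left, right)] += 1
            contexts[left] += 1
    return bigrams, contexts, len(vocab)


def bigram_prob(words, model):
    """Add-one smoothed probability: unseen word pairs get a small nonzero score."""
    bigrams, contexts, vocab_size = model
    tokens = ["<s>"] + list(words) + ["</s>"]
    prob = 1.0
    for left, right in zip(tokens, tokens[1:]):
        prob *= (bigrams[(left, right)] + 1) / (contexts[left] + vocab_size)
    return prob


if __name__ == "__main__":
    model = train_bigram(["turn on the radio", "turn off the lights"])
    for text in ("turn on the radio", "please turn on the radio"):
        words = text.split()
        print(text, "| grammar accepts:", fsg_accepts(words),
              "| bigram probability:", bigram_prob(words, model))
```

The out-of-grammar request is rejected outright by the finite-state grammar but still receives a lower, nonzero score from the statistical model, which is why the latter suits free-form dialog.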

Features & Benefits 

  • Hidden Markov Model (HMM)-based speech recognizer - State-of-the-art accuracy
  • Continuous speech - No need for pauses, user speaks naturally
  • Dynamic grammar compilation - Enables complex application workflows in a small-footprint speech recognizer
  • Speaker independent - No tedious user training session required
  • Speaker adaptation - Automatically adjusts to different speakers and accents
  • C++ implementation - Portable to a range of hardware/software configurations
  • Noise-filtering design tools - Rapid tuning for noisy acoustic environments without time-consuming and expensive acoustic model development
  • Dynamic noise compensation - Real-time differentiation between background noise and speaker
  • Floating-point or integer versions - Wide choice of hardware options
  • Supports finite-state (command-and-control) or statistical (free-form) grammars - More flexible, natural application designs
  • Supports push-to-talk, hold-to-talk, and open mic recording - Multiple user interface options
  • Distributed speech recognition over low-bandwidth networks - Low-cost, high-accuracy deployment option for speech recognition on mobile devices

Technical Specifications 

CPU Requirements:
200 MHz StrongARM, 66 MHz Intel x86 (support for other processors on request)

Memory Requirements:
Total: 750KB-2.25MB
Executable (ROM): 350-750KB
Acoustic models (ROM): 100-500KB
Active search (RAM): 300KB-1MB (more for complex grammars)
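The total range is the sum of the three component ranges. A short Python sketch of that arithmetic follows. The dictionary keys and function name are hypothetical, and the sketch assumes 1 MB = 1000 KB, which is what the stated 2.25 MB upper bound implies.

```python
# Component ranges in KB as (minimum, maximum), taken from the figures above.
COMPONENTS_KB = {
    "executable_rom": (350, 750),
    "acoustic_models_rom": (100, 500),
    "active_search_ram": (300, 1000),  # assumes 1 MB = 1000 KB
}


def total_footprint_kb(components):
    """Sum the per-component minimums and maximums into one overall range."""
    low = sum(minimum for minimum, _ in components.values())
    high = sum(maximum for _, maximum in components.values())
    return low, high


if __name__ == "__main__":
    low, high = total_footprint_kb(COMPONENTS_KB)
    print(f"Total: {low} KB to {high} KB")  # prints "Total: 750 KB to 2250 KB"
```

The upper bound covers typical grammars only, since the active search figure grows for complex grammars.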

Supported Languages:
Adults: American and British English, Latin American Spanish, Iraqi Arabic, Pashto and Dari (others on request)
Children: American English

Grammars:
Statistical or JSGF forms; static or dynamic; dictation option
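As an illustration of a dynamic JSGF grammar, the Python sketch below builds grammar text at run time from a list that is only known while the application is running, such as contact names. The function name and the example grammar are hypothetical. How the resulting text would be handed to the recognizer is not shown, because that depends on the DynaSpeak API, which is not described here.

```python
def build_jsgf(grammar_name, rule_name, prefix, choices):
    """Return JSGF text for a rule matching `prefix` followed by one of `choices`."""
    if not choices:
        raise ValueError("a grammar needs at least one choice")
    alternatives = " | ".join(choices)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{rule_name}> = {prefix} ( {alternatives} );\n"
    )


if __name__ == "__main__":
    # Hypothetical run-time data, e.g. loaded from the user's address book.
    contacts = ["alice", "bob", "carol"]
    print(build_jsgf("dialer", "call", "call", contacts))
```

A static grammar would ship the same text as a fixed file, whereas the dynamic form regenerates it whenever the underlying list changes.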

Operating Systems:
Windows, Mac OS X, Linux and Android (others on request)

Development Environment:
C/C++, Java (via JNI), client/server versions available
