Gamified Speech Therapy System and Methods

Tech ID: 34358 / UC Case 2020-274-0

Background

Historically, speech therapy apps have relied primarily on cloud-based speech recognition systems like those used in digital assistants (Cortana, Siri, Google Assistant), which are designed to make a best guess at intended speech rather than critically evaluate articulation errors. For children with cleft palate, a condition affecting 1 in 700 babies globally, speech therapy is essential follow-up care after reconstructive surgery. Approximately 25% of children with clefts use compensatory articulation errors, and once these patterns become habituated during ages 3-5, they are particularly resistant to change in therapy.

Traditional approaches to mobile speech therapy apps have included storybook-style narratives, which proved expensive to produce while offering low replayability and engagement, and fast-paced arcade-style games, which failed to maintain user interest. Common speech therapy applications require a facilitator to evaluate speech performance and typically depend on continuous internet connectivity, creating barriers for users in areas with poor network coverage or those concerned about data privacy and roaming costs. The shift toward gamified therapy solutions has shown that game elements can be powerful motivators for otherwise tedious activities.

Offline, on-device speech recognition faces inherent accuracy limitations compared to cloud-based solutions and requires substantial processing power and memory, which can degrade device performance and battery life, particularly on older mobile devices. Automatic speech recognition (ASR) models also struggle with children's speech because of non-fluent pronunciation and high variability in speech patterns, with phoneme error rates reaching almost 12% and consonant recognition errors undermining the reliability of speech disorder detection. The challenge is even more pronounced for populations with speech impairments, since conventional ASR systems are optimized for typical adult speech rather than the atypical articulation patterns of cleft palate speech or developmental disabilities. Moreover, maintaining user engagement over extended therapy periods is difficult, and many apps fail to motivate the daily practice that is essential for speech improvement.

Technology Description

To help address these challenges, researchers at UC Santa Cruz (UCSC) have developed a hybrid online/offline dual-processing architecture that lets a mobile device seamlessly toggle between online and offline speech recognition modes in real time based on network connectivity (a rough sketch of this routing pattern follows below). The system employs customizable ARPAbet phonetic transcriptions that map both correct pronunciations and common mispronunciations, creating "pseudo-words" that represent incorrect speech patterns so the recognizer can diagnose articulation errors rather than guess the nearest intended word (see the second sketch below).

The machine learning-based acoustic model can be trained concurrently during operation using verbal samples across multiple spoken languages, and can detect speaker age (adult vs. child), speech accents, resonance errors (hypernasality/hyponasality), and specific articulation errors characteristic of cleft palate speech. To sustain engagement, the system pairs a perpetual, cyclical narrative game structure with custom lip-sync animations that visually demonstrate correct speech placement to children, avoiding the pitfalls of both expensive storybook narratives with low replayability and arcade-style games that fail to hold interest. The system also adjusts challenge difficulty adaptively based on performance and supports third-party overrides, enabling medical professionals to manually grade speech or create custom curricula while still leveraging the motivational game context.
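
As a rough illustration of the dual-processing idea, the sketch below routes each utterance to a cloud recognizer when the network is reachable and to an on-device fallback otherwise. All names here (HybridRecognizer, has_connectivity, the two recognizer stand-ins) are hypothetical, and the socket probe is a placeholder for a platform's real connectivity APIs; this is a sketch of the pattern under those assumptions, not the patented implementation.

    # Illustrative sketch only -- hypothetical names, not the SpokeIt code.
    # Routes each utterance to the best available engine so therapy
    # continues when the network drops.
    import socket

    def has_connectivity(host="8.8.8.8", port=53, timeout=1.5) -> bool:
        # Cheap reachability probe; a real app would use the platform's
        # connectivity APIs rather than opening a raw socket.
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    class OnDeviceRecognizer:
        # Stand-in for a local acoustic model (always available).
        def evaluate(self, audio: bytes) -> dict:
            return {"source": "offline", "errors": []}

    class CloudRecognizer:
        # Stand-in for a cloud ASR endpoint (higher accuracy, needs network).
        def evaluate(self, audio: bytes) -> dict:
            return {"source": "online", "errors": []}

    class HybridRecognizer:
        def __init__(self):
            self.online = CloudRecognizer()
            self.offline = OnDeviceRecognizer()

        def evaluate(self, audio: bytes) -> dict:
            # Choose an engine per utterance, so mode switching is seamless.
            engine = self.online if has_connectivity() else self.offline
            try:
                return engine.evaluate(audio)
            except OSError:
                # Network dropped mid-request: fall back transparently.
                return self.offline.evaluate(audio)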
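
The pseudo-word mechanism can be pictured as a lexicon in which mispronunciations are first-class entries. The second sketch below is a hypothetical example: the TherapyTarget class, the error labels, and the use of the TIMIT-style extended ARPAbet symbol Q for a glottal stop are illustrative assumptions, not the actual clinical lexicon. Because a common compensatory error has its own transcription, the recognizer can name the error instead of guessing the nearest real word.

    # Hypothetical lexicon sketch: each target word maps to its correct
    # ARPAbet transcription plus "pseudo-words" encoding common errors.
    from dataclasses import dataclass, field

    @dataclass
    class TherapyTarget:
        word: str
        correct: list                       # ARPAbet phonemes, e.g. ["S", "AH", "N"]
        pseudo_words: dict = field(default_factory=dict)

    LEXICON = {
        "sun": TherapyTarget(
            word="sun",
            correct=["S", "AH", "N"],
            pseudo_words={
                # Glottal stop replacing /s/, a compensatory pattern in
                # cleft palate speech (Q = glottal stop in extended ARPAbet).
                "glottal_substitution": ["Q", "AH", "N"],
                # /s/ weakened toward /h/ by nasal air escape.
                "nasal_weakening": ["HH", "AH", "N"],
            },
        ),
    }

    def grade(word: str, recognized: list) -> str:
        # Return "correct", the name of a known error pattern, or "unrecognized".
        target = LEXICON[word]
        if recognized == target.correct:
            return "correct"
        for error_name, phones in target.pseudo_words.items():
            if recognized == phones:
                return error_name
        return "unrecognized"

    print(grade("sun", ["Q", "AH", "N"]))   # -> glottal_substitution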

Applications

  • cleft palate care
  • speech rehabilitation
  • telepractice and remote therapy

Features/Benefits

  • Represents the first mobile speech therapy application capable of critical speech diagnosis both with and without a network connection.
  • Customizable ARPAbet-based critical speech recognition enables true diagnostic capability.
  • Provides continuously improving accuracy tailored to specific user populations without requiring manual model updates.
  • Adapts to diverse accents and languages, and enables detection of hyponasality/hypernasality.
  • Allows speech-language pathologists to manually grade speech, adjust automated thresholds, and create custom therapy targets while leveraging the motivational game context (see the sketch following this list).
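
A minimal sketch of the override idea, assuming a hypothetical ClinicianOverride layer (the names and the pass_threshold default are illustrative, not the patented API): a speech-language pathologist's manual grade takes precedence over the automated score, and the automated pass threshold itself is adjustable per clinician or per child.

    # Hypothetical clinician-override layer on top of the automated grader.
    from dataclasses import dataclass, field

    @dataclass
    class ClinicianOverride:
        pass_threshold: float = 0.8         # model confidence needed to pass
        manual_grades: dict = field(default_factory=dict)

        def grade(self, utterance_id: str, model_score: float) -> bool:
            # A clinician's manual grade always wins over the model.
            if utterance_id in self.manual_grades:
                return self.manual_grades[utterance_id]
            return model_score >= self.pass_threshold

    slp = ClinicianOverride(pass_threshold=0.7)       # SLP loosens the threshold
    slp.manual_grades["attempt-42"] = True            # SLP passes a borderline attempt
    print(slp.grade("attempt-42", model_score=0.55))  # -> True (manual override)
    print(slp.grade("attempt-43", model_score=0.65))  # -> False (below threshold)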

Intellectual Property Information

Country                    Type           Number      Dated       Case
United States of America   Issued Patent  12,277,933  04/15/2025  2020-274

Inventors

  • Kurniawan, Sri

Other Information

Keywords

speech, speech therapy, speech recognition, automatic speech recognition, speech impairments, mobile serious game, serious game, speech articulation therapy, speech articulation, SpokeIt, mobile speech therapy
