| Country | Type | Number | Dated | Case |
| --- | --- | --- | --- | --- |
| United States of America | Published Application | 20250246187 | 07/31/2025 | 2024-062 |
Reduced Data Requirements: The H-UDM approach significantly lowers the need for expensive, human-labeled datasets by utilizing its hierarchical structure to infer disfluencies.
Dual-Task Efficiency: Unlike previous models that only transcribe or only detect, this framework performs both tasks simultaneously, ensuring the transcription is contextually aware of the detected errors.
Systematic AI Framework: Provides a formalized, scalable method for modeling disfluent speech where previous solutions were fragmented or ineffective.
High Reliability: Experimental results demonstrate that the hierarchical approach is more robust across different types of disfluency compared to standard linear speech models.
Seamless Integration: Designed to be compatible with existing automated speech recognition (ASR) pipelines, making it a viable "drop-in" enhancement for current software.

Digital Speech Therapy: Powering remote health platforms that provide real-time feedback and progress tracking for individuals with speech disorders such as stuttering.
Language Learning (ESL): Enhancing educational software by identifying specific fluency gaps and disfluency patterns in non-native speakers to guide personalized practice.
Inclusive Voice User Interfaces (VUI): Improving the reliability of smart assistants (like Alexa or Siri) for users with diverse speech patterns or neurological conditions.
Automated Transcription Services: Increasing the readability of raw interview transcripts or court proceedings by accurately detecting and tagging "um," "ah," and repeated phrases.
Clinical Diagnostics: Assisting medical professionals in the early detection of cognitive decline or neurological disorders through the automated analysis of speech fluency.