Researchers at the University of California, Davis have developed a computer-based method to quantify the intelligibility of speech synthesized by a brain-computer interface or other speech prosthesis.
Neurological injuries and diseases such as stroke or amyotrophic lateral sclerosis (ALS) can cause speech disorders that leave an individual unable to communicate effectively. One emerging solution for injured or speech-impaired populations is to bypass damaged parts of the nervous system with a brain-computer interface (BCI), which provides a direct communication pathway between the brain and an external device. Such a device decodes the individual's intended speech from the brain activity that would normally drive the tongue, jaw, lips, voice box, and diaphragm. Closely related speech prostheses may instead generate speech from non-invasively measured biosignals, for example the muscle activity of movements retained after laryngectomy, or for silent-speech applications. A current limitation of these technologies is the lack of established metrics for gauging how well a speech prosthesis is functioning, for example whether the synthesized speech is intelligible to an intended listener. The problem is particularly acute when there is no ground-truth target voice against which to compare the synthesized voice, only the text of the intended speech.
Researchers at UC Davis have developed a novel, deep-learning-based automated speech recognition (ASR) technique for assessing how intelligible speech is. The resulting ‘Artificial Intelligence (AI) Listener’ generates a numeric score that quantifies the quality and understandability of speech produced by a BCI, another speech-producing technology, or natural (but potentially dysarthric) speech. The synthesized speech signals of interest (e.g., originating from the brain via a BCI) are input into an ASR model for analysis, and an intelligibility metric is generated by comparing the intended (target) speech sequence with the synthesized output. The system can produce a variety of intelligibility metrics, ranging from holistic phrase-level word error rates down to phoneme error rates and lower-level acoustic feature scores.
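To illustrate the general approach, the sketch below transcribes a synthesized audio file with an off-the-shelf ASR model and compares the transcript against the intended text using a word error rate, converting it into a simple intelligibility score. The specific ASR model ("openai/whisper-base" via the Hugging Face pipeline), the file name, and the plain word-error-rate metric are illustrative assumptions for this example, not necessarily the components used in the UC Davis system.

```python
# Hedged sketch of an "AI Listener"-style intelligibility score:
# transcribe synthesized speech with an ASR model, then score the transcript
# against the intended (target) text using word error rate (WER).

from transformers import pipeline


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit (Levenshtein) distance normalized by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # dp[i][j] = edits needed to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


def intelligibility_score(audio_path: str, intended_text: str) -> float:
    """Return a 0-1 score, where 1.0 means the ASR recovered the intended text exactly."""
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")
    transcript = asr(audio_path)["text"]
    return max(0.0, 1.0 - word_error_rate(intended_text, transcript))


if __name__ == "__main__":
    # "synthesized.wav" is a placeholder for audio produced by a BCI or other speech prosthesis.
    score = intelligibility_score("synthesized.wav", "I would like a glass of water")
    print(f"Intelligibility score: {score:.2f}")
```

A finer-grained variant could apply the same edit-distance comparison at the phoneme level, or score lower-level acoustic features, to approximate the range of metrics described above.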
Country | Type | Number | Dated | Case
Patent Cooperation Treaty | Published Application | WO 2024/186818 | 09/12/2024 | 2023-555
Additional Patent Pending
speech, automated speech recognition, brain-computer interface, artificial intelligence, deep-learning, computerized speech, neurodegenerative disease, neurotechnology, speech intelligibility, neuroprosthesis