Brain2voice 2.0: High-Intelligibility Voice Synthesis Neural Decoder for Brain-Computer Interface

Tech ID: 34816 / UC Case 2026-909-0

Abstract

Researchers at the University of California, Davis have developed a brain-computer interface technology that decodes neural signals in real time and synthesizes intelligible voice output using transformer-based neural networks.

Full Description

This technology provides systems and methods that translate brain activity into synthesized speech by recording neural signals and processing them through a causal transformer encoder with multiple parallel output heads. The heads predict acoustic features, acoustic tokens, and phonemes to generate real-time voice output with high fidelity and low latency. A vocoder such as LPCNet efficiently converts the predicted acoustic representations into natural-sounding audio, enabling communication for individuals with speech loss due to neurological or other conditions. The system supports both invasive and noninvasive neural signal recording modalities, and its training methods include multiscale discriminators and residual vector quantization tokenizers. The synthesized speech is at least 8-fold more intelligible than that produced by previous approaches.
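
As an illustration of the residual vector quantization idea mentioned above, the following is a minimal sketch, not the inventors' tokenizer. It assumes two tiny hand-written codebooks (CODEBOOKS) and a two-dimensional acoustic frame, both hypothetical; a real tokenizer would learn its codebooks from audio. Each stage quantizes whatever error the previous stage left behind, so one frame becomes a short list of discrete tokens.

```python
# Toy residual vector quantization (RVQ) in pure Python.
# CODEBOOKS and the example frame are hypothetical values for illustration only.

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, entry in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(entry, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx

def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage encodes the remaining residual."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens

def rvq_decode(codebooks, tokens):
    """Reconstruct an approximation by summing the selected entries."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, tokens):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],    # coarse stage
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],  # fine stage refines the residual
]

frame = [1.2, 0.9]                      # hypothetical acoustic feature frame
tokens = rvq_encode(CODEBOOKS, frame)   # one discrete token per stage
approx = rvq_decode(CODEBOOKS, tokens)  # reconstruction from the tokens
print(tokens, approx)
```

In this sketch, adding more stages shrinks the reconstruction error at the cost of more tokens per frame.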

Applications

  • Assistive communication devices for patients with paralysis or speech impairments. 
  • Advanced prosthetics control integrating voice synthesis functionality. 
  • Neurotechnology platforms for real-time human-computer interaction through thought-driven communication. 
  • Clinical solutions restoring verbal communication for individuals affected by neurological disorders. 
  • Research tools in neuroscience and speech therapy to study and rehabilitate speech through brain signals. 
  • Integration into consumer devices enhancing accessibility for users with speech limitations.

Features/Benefits

  • Synthesizes voice output within milliseconds of acquiring neural signals. 
  • Decodes neural activity with a multimodal transformer that simultaneously predicts continuous acoustic features, discrete acoustic tokens, and phonemes. 
  • Generates high-quality speech efficiently by integrating an LPCNet vocoder to keep computational complexity low. 
  • Supports multiple neural recording modalities by working with both invasive (e.g., intracortical arrays) and noninvasive (e.g., EMG) inputs. 
  • Improves decoding accuracy and stability using robust training techniques (e.g., multiscale discriminators and connectionist temporal classification, CTC).
  • Maintains continuous, smooth speech output using sliding-window buffering for ongoing decoding. 
  • Preserves expressiveness beyond text-only BCIs by producing richer, more natural-sounding speech than brain-to-text systems. 
  • Restores spoken communication for people who lose speech due to conditions such as ALS, stroke, or brain injury. 
  • Eliminates the delay and flatness of brain-to-text systems by enabling immediate, expressive speech output. 
  • Converts complex neural signals into intelligible speech in real time, rather than requiring offline processing or simplified outputs.
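
The sliding-window buffering listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: SlidingWindowBuffer, stream_decode, and mean_decoder are hypothetical names, and mean_decoder is a placeholder standing in for the transformer decoder and vocoder. The point shown is that each output is computed only from the current and earlier frames, so decoding can proceed as frames arrive.

```python
from collections import deque

class SlidingWindowBuffer:
    """Keeps the most recent `window_size` frames (hypothetical helper)."""

    def __init__(self, window_size):
        # deque with maxlen drops the oldest frame automatically when full
        self.frames = deque(maxlen=window_size)

    def push(self, frame):
        """Add the newest frame and return the current window, oldest first."""
        self.frames.append(frame)
        return list(self.frames)

def stream_decode(neural_frames, window_size, decode_fn):
    """Emit one output per incoming frame using only past and present frames."""
    buffer = SlidingWindowBuffer(window_size)
    outputs = []
    for frame in neural_frames:
        window = buffer.push(frame)
        outputs.append(decode_fn(window))
    return outputs

def mean_decoder(window):
    # Placeholder for the real decoder: averages the frames in the window.
    return sum(window) / len(window)

outputs = stream_decode([1.0, 2.0, 3.0, 4.0, 5.0], window_size=3,
                        decode_fn=mean_decoder)
print(outputs)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

The window size here is arbitrary; in practice it would trade off available context against memory and per-frame compute.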

Patent Status

Patent Pending

Inventors

  • Brandman, David
  • Stavisky, Sergey
  • Wairagkar, Maitreyee

Other Information

Keywords

acoustic features, adaptive decoding, brain-computer interface, causal transformer, LPCNet, neural decoding, real-time speech synthesis, residual vector quantization, vocoder, voice restoration