Jean-Marc Valin is a Montreal-based Senior Staff Research Scientist at Google with a Ph.D. in Electrical Engineering and 23 years of academic and industry experience applying machine learning to speech and signal processing. He specializes in designing efficient, deployable ML algorithms and hybrid DSP/ML speech codecs that prioritize real-time communication, packet-loss robustness, and acoustic echo cancellation. A prolific open-source contributor, his work on Speex, Opus, and AV1 (including an RNN-based VAD and core RNNoise denoiser components) is embedded in billions of devices. He is known for bridging differentiable signal processing with production constraints, often applying perceptual tuning (e.g., a perceptual exponent), Viterbi-based VAD, and targeted rate-allocation fixes to improve real-world audio quality.
24 years of coding experience
16 years of employment as a software developer
PhD, Electrical Engineering at Université de Sherbrooke
Contributions: 6 reviews, 1001 commits, 41 PRs in 11 years 10 months
Contributions summary: Jean-Marc's contributions primarily focused on improving the Opus audio codec, notably rate allocation for stereo SILK in hybrid mode: he modified the encoder to allocate more bits to the SILK layer and reduced the stereo-narrowing threshold, improving audio quality. He also added a recurrent neural network (RNN) for voice activity detection (VAD) and speech/music classification, implemented with dense layers and a GRU layer, demonstrating an interest in applying machine learning to audio processing. Further improvements included fixing bandwidth detection for 24 kHz analysis and fixing CELT packet-loss concealment (PLC), among other targeted fixes.
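The dense-plus-GRU structure mentioned above can be illustrated with a minimal forward-pass sketch. This is not the actual Opus analysis network (which is implemented in C with trained weights); all layer sizes, parameter names, and weights below are hypothetical, chosen only to show how a dense input layer, a GRU recurrence, and a sigmoid output combine into a per-frame voice-activity probability.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dense(x, W, b, activation=np.tanh):
    # Fully connected layer: activation(W @ x + b)
    return activation(W @ x + b)

def gru_step(x, h, p):
    # One GRU time step: update gate z, reset gate r, candidate state
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_cand

def vad_probability(features, p):
    # features: (T, n_in) sequence of per-frame acoustic features.
    # Dense layer -> GRU over time -> dense sigmoid output per frame.
    h = np.zeros(p["Uz"].shape[0])
    probs = []
    for x in features:
        hidden = dense(x, p["W1"], p["b1"])
        h = gru_step(hidden, h, p)
        probs.append(dense(h, p["Wo"], p["bo"], sigmoid)[0])
    return np.array(probs)
```

Because the GRU carries hidden state across frames, the VAD decision can exploit temporal context rather than classifying each frame in isolation, which is what makes a recurrent layer a natural fit here.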
Recurrent neural network for audio noise reduction
Role in this project:
Back-end Developer & ML Engineer
Contributions: 1 release, 70 commits, 6 PRs in 2 months
Contributions summary: Jean-Marc made substantial contributions to the audio noise reduction project, developing core functionality of the denoiser. His work included implementing forward and inverse Fourier transforms, analysis and synthesis components, and a band-based gain adjustment mechanism. He integrated a Viterbi-based VAD to improve the training data by reducing the amount of mislabeled noise. He also worked on feature extraction and applied a perceptual exponent during training, all with the goal of improving the performance of the denoising model.
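The analysis/synthesis and band-gain pipeline described above can be sketched as follows. This is a simplified illustration, not the RNNoise C implementation: the band edges, frame length, and helper names are made up, bands use piecewise-constant rather than interpolated gains, and the perceptual-exponent comment reflects the general idea of compressing target gains before computing a training loss.

```python
import numpy as np

def band_energies(spectrum_mag, band_edges):
    # Sum of squared magnitudes in each frequency band
    return np.array([np.sum(spectrum_mag[lo:hi] ** 2)
                     for lo, hi in zip(band_edges[:-1], band_edges[1:])])

def apply_band_gains(frame, gains, band_edges):
    # Analysis: forward FFT of one time-domain frame
    spec = np.fft.rfft(frame)
    # Expand per-band gains to per-bin gains (piecewise constant here;
    # a real denoiser would interpolate smoothly between bands)
    bin_gains = np.empty(len(spec))
    for g, lo, hi in zip(gains, band_edges[:-1], band_edges[1:]):
        bin_gains[lo:hi] = g
    # Synthesis: inverse FFT back to the time domain
    return np.fft.irfft(spec * bin_gains, n=len(frame))

def ideal_gains(noisy, clean, band_edges, perceptual_exponent=0.5):
    # Oracle per-band gains from the clean/noisy energy ratio, then
    # compressed by a perceptual exponent so that training errors on
    # low-gain (mostly-noise) bands are not drowned out by high-gain bands
    en_noisy = band_energies(np.abs(np.fft.rfft(noisy)), band_edges)
    en_clean = band_energies(np.abs(np.fft.rfft(clean)), band_edges)
    g = np.sqrt(en_clean / np.maximum(en_noisy, 1e-12))
    return np.clip(g, 0.0, 1.0) ** perceptual_exponent
```

Operating on a small number of perceptually motivated bands, rather than on every FFT bin, is what keeps this style of denoiser cheap enough for real-time use: the network only has to predict a handful of gains per frame.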