David Majnemer is a Principal Engineer in San Francisco with 17 years of experience leading xPU ML compiler and runtime efforts at Google. He currently leads TPU and xPU compiler teams, driving performance for matrix multiply and convolution kernels, device runtimes, and XLA optimizations. An active open-source contributor, he has improved numerical stability and half-precision/bf16 support across flagship projects including TensorFlow, TensorFlow Probability, JAX and LLVM/NVPTX—work that reduced NaNs and improved accuracy on TPUs. His compiler back-end expertise spans NVPTX, Clang/LLVM, libc++ and LLDB, where he fixed code-generation bugs, ABI/exception handling issues and platform-specific robustness problems. Earlier roles tuning the Linux scheduler and contributing to Apple’s ZFS/CoreStorage give him a systems-level perspective that informs pragmatic performance engineering. A UIUC computer science graduate, he blends numerical rigor with production-grade engineering to ship ML compilers at scale.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Role in this project:
Back-end Developer
Contributions: 2 reviews, 279 commits, 7 comments in 6 years 1 month
Contributions summary: David contributed tests and code changes to XLA, a machine learning compiler. His work focused on adding tests for floating-point computations, such as square root and the handling of NaNs. He also changed the algebraic simplifier to optimize code, including power-of-two division and reassociation patterns.
Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project
Role in this project:
Back-end Developer
Contributions: 894 commits in 6 years 2 months
Contributions summary: David made several changes to the Clang compiler, including fixing edge cases in the handling of variable-length arrays and addressing issues with exception specifications. He implemented support for MSVC-style features such as __declspec(noalias) and improved code related to memory management. He also contributed to the correct handling of template arguments, particularly for class and member function pointers, indicating a strong understanding of compiler internals.