Eugene Zhulenev

Software Engineer at Google

San Francisco, California, United States

Summary

🤩 Rockstar
Eugene Zhulenev is a Staff Software Engineer at Google Brain in San Francisco with 13 years of experience building high-performance infrastructure for data analysis and machine learning, focused on low-level runtime performance for CPUs and GPUs. He brings deep expertise in C++ performance tuning, multi-threaded systems, ML compilers and runtimes, cloud-native distributed architectures, and functional Scala for Big Data. A core OpenXLA contributor and former Eigen maintainer, he delivered tensor contraction and kernel optimizations that produced 2–3x speedups in TensorFlow and saved millions in hardware costs. He pairs production-grade engineering with hands-on compiler work, from small-shape copy optimizations in XLA/CPU to CUDA buffer integration in IREE and persistent compilation caching in JAX, bridging research and deployable runtime efficiency.
14 years of coding experience
Stack Overflow

Stats
9,734 reputation
467k reached
182 answers
14 questions
Badges
sbt (top 5%)
scala (top 1%)
rdd (top 5%)
intellij-idea (top 5%)
apache-spark (top 1%)
serialization (top 5%)

GitHub Skills (57)

apache-spark (10)
c-language (10)
compilation (10)
python (10)
algebra (10)
operation (10)
llvm (10)
tensorrt (10)
cpp-11 (10)
machine-learning (10)
c11 (10)
python-templates (10)
mlr (10)
scala (10)
c17 (10)

Programming languages (9)

Java, C++, Shell, Starlark, LLVM, Scala, MLIR, Jupyter Notebook

GitHub contributions (5)

tensorflow/runtime

Apr 2020 - Dec 2022

A performant and modular runtime for TensorFlow
Role in this project:
Back-end Developer
Contributions: 468 commits in 2 years 8 months
Contributions summary: Eugene primarily contributed to the core runtime and compilation aspects of the TensorFlow project, focusing on JitRt (the Just-In-Time Runtime) and its associated kernel implementations. The commits show work on memory management for tensor operations and on integrating custom-call functionality for improved performance. He also implemented several new native operations and improved existing kernels that are core to the runtime.
runtime, performant, modular, tensorflow
openxla/xla

Jan 2019 - Jan 2023

A machine learning compiler for GPUs, CPUs, and ML accelerators
Role in this project:
Back-end Developer
Contributions: 462 reviews, 378 commits, 3 PRs in 4 years 1 month
Contributions summary: Eugene contributed to the XLA compiler project, focusing on the runtime and core components. His work includes enabling support for custom contraction kernels in XLA's single-threaded matrix multiplication, implementing batch normalization through the cuDNN BatchNormEx API, and fixing ODR violations in Eigen contraction kernels. He also added support for executing XLA:GPU on top of JitRt and improved the XLA runtime library.
compiler, community-driven, machine-learning, modular