Abhinav Venigalla

Member Of Technical Staff at Databricks

San Francisco, California, United States

Summary

Abhinav Venigalla is a Member of Technical Staff at Databricks with four years of industry experience building production ML systems that bridge research and engineering. Previously a research scientist at MosaicML and a machine learning researcher at Cerebras, he has contributed to prominent open-source projects like mosaicml/composer and llm-foundry, implementing optimizer improvements (DecoupledSGDW/AdamW) and integrating streaming C4 data loaders for scalable LLM training. He combines hands-on expertise in optimizer internals, dataset streaming, and LLM benchmarking with an MEng in EECS from MIT. Based in San Francisco, he focuses on making large-model training more efficient, reproducible, and production-ready.
4 years of coding experience
5 years of employment as a software developer
Bachelors + MEng, Electrical Engineering and Computer Science, Massachusetts Institute of Technology
High School, Phillips Academy
Languages: English, Telugu, Chinese

Github Skills (21)

algorithm (10), optimizations (10), pytorch (10), python (10), load-data (10), optimizers (10), machine-learning (10), ml (10), datasets (10), llm (10), data-loading (10), data-loader (10), deep-learning (10), dataprocessing (10), neural-networks (10)

Programming languages (3)

C++, Jupyter Notebook, Python

Github contributions (5)

mosaicml/composer

Feb 2022 - Jan 2023

Supercharge Your Model Training
Role in this project: ML Engineer
Contributions: 279 reviews, 49 commits, 94 PRs in 11 months
Contributions summary: Abhinav contributed to the `composer` repository by implementing and updating optimization techniques for deep learning model training. He updated optimizers such as `DecoupledSGDW` and `DecoupledAdamW`, changing how weight decay is applied, and integrated streaming datasets such as C4 to make model training more efficient. Further contributions included updating dataset functionality, fixing bugs, and general improvements to the code base.
pytorch, ml-systems, deep-learning, neural-networks, machine-learning
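The decoupled weight decay idea behind optimizers like `DecoupledSGDW` can be sketched as follows; `sgdw_step` is a hypothetical helper for illustration, not the actual Composer implementation:

```python
def sgdw_step(param, grad, lr, weight_decay):
    """One SGD step with decoupled weight decay (illustrative sketch).

    Plain L2 regularization folds weight_decay * param into the gradient,
    so the decay strength gets scaled by the learning rate along with the
    gradient. Decoupled weight decay instead shrinks the parameter in a
    separate step, keeping the decay independent of the gradient update.
    """
    param = param - lr * grad              # gradient step
    param = param - weight_decay * param   # decay applied separately, not via grad
    return param
```

The same separation applies to AdamW-style optimizers: the adaptive gradient update and the weight shrinkage are computed as two independent terms.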
mosaicml/llm-foundry

May 2023 - May 2024

LLM training code for Databricks foundation models
Role in this project: ML Engineer
Contributions: 68 reviews, 85 PRs, 189 pushes in 1 year
Contributions summary: Abhinav primarily contributed to the development of an LLM benchmark, focusing on the implementation and integration of data loading and processing pipelines within the llm-foundry project. His work modified the dataset loading process, specifically for the C4 dataset, adding features such as truncation and concatenation strategies, likely aimed at improving the efficiency and functionality of data handling for LLM training. He also upgraded the LLM benchmark to a newer version of Composer, adapting the project to updates in its framework.
deep-learning, llm, neural-networks, nlp, pytorch
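The concatenation strategy mentioned above, in which tokenized documents are joined with an EOS separator and sliced into fixed-length training sequences, can be sketched roughly like this (a simplified illustration with hypothetical names, not the llm-foundry code):

```python
def concat_and_chunk(token_streams, max_seq_len, eos_id=0):
    """Pack tokenized documents into fixed-length training chunks.

    Instead of padding or truncating each document individually, documents
    are concatenated back to back (separated by an EOS token) and the
    combined stream is cut into sequences of exactly max_seq_len tokens,
    so no compute is wasted on padding.
    """
    buffer = []
    for tokens in token_streams:
        buffer.extend(tokens)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= max_seq_len:
            yield buffer[:max_seq_len]
            buffer = buffer[max_seq_len:]
```

A truncation strategy, by contrast, would simply cut each document at `max_seq_len` and pad shorter ones; packing trades exact document alignment for higher token throughput.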