Abhinav Venigalla is a Member of Technical Staff at Databricks with four years of industry experience building production ML systems that bridge research and engineering. Previously a research scientist at MosaicML and a machine learning researcher at Cerebras, he has contributed to prominent open-source projects like mosaicml/composer and llm-foundry, implementing optimizer improvements (DecoupledSGDW/AdamW) and integrating streaming C4 data loaders for scalable LLM training. He combines hands-on expertise in optimizer internals, dataset streaming, and LLM benchmarking with an MEng in EECS from MIT. Based in San Francisco, he focuses on making large-model training more efficient, reproducible, and production-ready.
4 years of coding experience
5 years of employment as a software developer
Bachelor's + MEng, Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Contributions: 279 reviews, 49 commits, 94 PRs in 11 months
Contributions summary: Abhinav contributed to the `composer` repository by implementing and updating optimization techniques for deep learning model training. He updated optimizers such as `DecoupledSGDW` and `DecoupledAdamW`, changing how weight decay is applied during the update step, and integrated streaming datasets such as C4 to make data loading for large-scale training more efficient. Further contributions included dataset functionality updates, bug fixes, and general improvements to the codebase.
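The distinction behind the `DecoupledSGDW`/`DecoupledAdamW` work is that decoupled weight decay shrinks the weights directly rather than folding an L2 term into the gradient. A minimal sketch of a single SGDW-style step, in plain Python for illustration (the function name and signature are hypothetical, not the actual Composer API):

```python
def sgdw_step(weights, grads, lr=0.1, weight_decay=0.01):
    """Illustrative decoupled weight decay (SGDW-style) update.

    Coupled L2 regularization would add `weight_decay * w` into the
    gradient before the step; the decoupled form below applies the
    decay to the weights directly, independent of the gradient step.
    """
    return [
        w * (1.0 - lr * weight_decay)  # decay applied straight to the weight
        - lr * g                       # plain gradient-descent step
        for w, g in zip(weights, grads)
    ]
```

With adaptive optimizers like AdamW the difference matters more, since coupling the decay into the gradient would also pass it through the adaptive scaling.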
LLM training code for Databricks foundation models
Role in this project: ML Engineer
Contributions: 68 reviews, 85 PRs, 189 pushes in 1 year
Contributions summary: Abhinav primarily contributed to the development of an LLM benchmark in the llm-foundry project, focusing on its data loading and processing pipelines. His work modified the dataset loading process for the C4 dataset, adding truncation and concatenation strategies intended to improve the efficiency of data handling during LLM training. He also upgraded the LLM benchmark to a newer version of Composer, adapting the project to upstream framework updates.
deep-learning, llm, neural-networks, nlp, pytorch