Suraj Subramanian

Menlo Park, California, United States

Summary

Suraj Subramanian is a Menlo Park–based software engineer with around a decade of experience building ML systems across finance and healthcare. He blends hands-on MLOps and ML engineering—specializing in distributed PyTorch training, DDP, multi-GPU/multi-node workflows, and experiment tracking—with full‑stack contributions to open-source projects. His work spans practical RL tutorials (Mario), production-ready training tooling (snapshot/resume, SLURM, torchrun), and high-impact docs and integrations for Llama2 on Hugging Face in the popular Llama cookbook. Notably, he reorganized repo structures and quickstart notebooks to make complex model workflows easier to reproduce, reflecting a bias for developer ergonomics as well as performance. Colleagues rely on him to move models from research prototypes into reliable, scalable production pipelines.
10 years of coding experience
Languages: Tamil, Hindi, English

GitHub Skills (24)

transformers (10)
pytorch (10)
slurm (10)
python (10)
ddp (10)
llama (10)
data-parallel (10)
machine-learning (10)
multiple-gpu (10)
reinforcement-learning (10)
data-parallelism (10)
huggingface (10)
mlops (10)
multi-gpu (10)
wandb (10)

Programming languages (8)

TypeScript, Java, R, JavaScript, Go, HTML, Jupyter Notebook, Python

GitHub contributions (5)

meta-llama/llama-cookbook

Feb 2024 - Dec 2024

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family on various provider services.
Role in this project: Full-stack Developer
Contributions: 10 reviews, 34 PRs, 31 pushes in 9 months
Contributions summary: Suraj appears to have been involved in restructuring the repository's file organization and updating the main README. He added new notebooks to the quickstart guide, specifically for running Llama2 on Hugging Face transformers, and consolidated images into a top-level folder. His commits also updated notebook code for running Llama models with the Hugging Face transformers library, suggesting involvement in both front-end documentation and backend model integration.
ai, finetuning, langchain, llama, llama2
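The quickstart notebooks described above run Llama2 through the Hugging Face transformers library; one detail such notebooks handle is Llama 2's chat prompt template. A minimal sketch of that template follows (the `[INST]`/`<<SYS>>` markers are the documented Llama 2 chat format; the helper name is hypothetical and not from the cookbook):

```python
from typing import Optional


def format_llama2_chat(user_msg: str, system_msg: Optional[str] = None) -> str:
    """Wrap a user message in the Llama 2 chat template.

    Llama 2 chat checkpoints expect prompts shaped like
    [INST] <<SYS>> system <</SYS>> user [/INST]; the tokenizer
    prepends the <s> BOS token itself.
    """
    if system_msg:
        return f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"
    return f"[INST] {user_msg} [/INST]"
```

The formatted string would then be fed to generation, for example via `transformers.pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")` (model access and download are omitted here).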
pytorch/examples

Sep 2022 - Nov 2022

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
Role in this project: MLOps Engineer
Contributions: 31 reviews, 14 commits, 11 PRs in 2 months
Contributions summary: Suraj primarily contributes to setting up and configuring distributed training environments for PyTorch models, specifically using DDP (DistributedDataParallel). Their work focuses on creating scripts and configurations for multi-GPU and multi-node training using tools like `torchrun` and SLURM. They integrate features like snapshotting and resuming training, enhancing the training workflow. Furthermore, they introduce minGPT-based training, demonstrating expertise in distributed training and potentially automated model deployment.
pytorch, vision, deep-learning, reinforcement-learning, reinforcement
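The snapshot/resume feature mentioned above follows a common fault-tolerance pattern: periodically write the epoch and training state to disk, and on startup reload it if present so a preempted job (e.g. under SLURM) continues where it left off. A framework-agnostic sketch of that pattern, with plain `pickle` standing in for `torch.save`/`torch.load` and all names hypothetical:

```python
import os
import pickle

SNAPSHOT_PATH = "snapshot.pkl"  # hypothetical path; real jobs use shared storage


def save_snapshot(epoch: int, state: dict, path: str = SNAPSHOT_PATH) -> None:
    """Persist training progress so a restarted job can resume."""
    with open(path, "wb") as f:
        pickle.dump({"epoch": epoch, "state": state}, f)


def load_snapshot(path: str = SNAPSHOT_PATH):
    """Return (start_epoch, state); (0, None) if no snapshot exists."""
    if not os.path.exists(path):
        return 0, None
    with open(path, "rb") as f:
        snap = pickle.load(f)
    return snap["epoch"] + 1, snap["state"]


start_epoch, state = load_snapshot()
for epoch in range(start_epoch, 3):   # 3 epochs as a stand-in for a real run
    state = {"weights": epoch}        # placeholder for model/optimizer state
    save_snapshot(epoch, state)       # checkpoint after every epoch
```

In a real DDP setup, typically only rank 0 writes the snapshot and `torchrun` restarts every rank after a failure; the resume logic is the same.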