Vibhu Jawa

California, United States

Summary

🤩

Rockstar

Vibhu Jawa is a Senior Software Engineer (Machine Learning/Data Science) at NVIDIA in San Francisco with 10 years of experience building GPU-accelerated data and ML infrastructure for large language models, graph neural networks, and scaled data engineering. He is an active RAPIDS open-source contributor, notable for back-end and performance work on cuGraph (GPU-only sampling optimizations, fanout fixes, duplicate-edge checks, and DGL integration) and for implementing cuML’s HashingVectorizer. His work blends systems-level performance engineering, such as removing host-side bottlenecks to achieve substantial speedups, with practical ML tooling and documentation contributions across cuDF and community notebooks. Vibhu holds an MS in Computer Science from Johns Hopkins, where he researched temporal feature selection for EHRs and applied NLP to public health signals, bridging research and production. Earlier roles at Expedia, Citi, and other firms produced measurable impact in analytics, cloud automation, and product systems, including dramatic reductions in environment delivery time. He pairs rigorous academic training with pragmatic engineering to accelerate ML workflows end-to-end.

11 years of coding experience

Stackoverflow

Stats

88 reputation

10k reached

7 answers

2 questions

Github Skills (25)

graph-algorithms 10

python 10

back-end-development 10

data-science 10

dataframes 10

machine-learning 10

cudf 10

dataframe 10

rapids 10

gpu 10

performance-optimization 10

cuda 10

jupyter-notebook 10

nlp 10

cuml 10

Programming languages (8)

Dockerfile, C++, Shell, CSS, HTML, Jupyter Notebook, Python, Cuda

Github contributions (5)

rapidsai/cugraph

May 2022 - Jan 2023

cuGraph - RAPIDS Graph Analytics Library

Role in this project:

Back-end Developer & Performance Engineer

Contributions: 480 reviews, 86 commits, 67 PRs in 8 months

Contributions summary: Vibhu primarily focused on optimizing the performance of the `cugraph` library. He made significant improvements to the sampling functions in the `graph_store` module, achieving substantial speedups by removing host-side code and running the work on GPUs. He also fixed sampling behavior for the fanout parameter and optimized the `has_duplicate_edges` function. To support DGL integration, he added node and edge data along with the necessary storage implementations.

cuda, graph-analysis, analytics, graph-analytics, graphml

rapidsai-community/notebooks-contrib

Feb 2019 - Jul 2020

RAPIDS Community Notebooks

Role in this project:

Full-stack Developer

Contributions: 28 commits, 5 PRs, 9 comments in 1 year 4 months

Contributions summary: Vibhu's commits primarily update a Jupyter notebook in the 'rapidsai-community/notebooks-contrib' repository, which hosts the RAPIDS Community Notebooks. The changes modify code for linear regression and word count analysis, specifically addressing missing datasets. He also merged branches, indicating integration work, and updated an NLP notebook. The commits use cudf, nvstrings, and related libraries.

jupyter-notebook, notebooks, rapids
