Shahul Shereef - Founder at Kaggle

San Francisco, California, United States

Summary

🤩 Rockstar

🎓 Top School

Shahul Shereef is a San Francisco–based data science leader and founder with eight years of experience building end-to-end ML and NLP systems in startups and production. As co-founder of ragas (YC W24), he is driving an open-source standard for evaluating LLM applications while contributing to OpenAssistant and other community projects. A Kaggle Grandmaster ranked in the top 20 among 100,000+ users, he combines competition-grade modeling with practical MLOps work, including dataset conversions for instruction-following dialogue and audio augmentations (RandomCrop, Padding, SpliceOut) for deep learning. His background includes credit-underwriting NLP, TensorFlow model deployment on Google Cloud, and adding evaluation metrics such as BertScore and EditScore to open-source toolkits. Curious and product-minded, he publishes work and code publicly (shahules786.github.io) and focuses on tooling that makes ML systems more testable and reproducible.

8 years of coding experience

3 years of employment as a software developer

CGPA 8.01 at Govt. Model Engineering College

Stackoverflow

Stats

1 reputation

0 reached

0 answers

0 questions

Github Skills (19)

pytorch 10

sentence-transformers 10

python 10

testing 10

machine-learning 10

datasets 10

jupyter-notebook 10

nlp 10

data-augmentation 10

natural-language-processing 9

deep-learning 9

language-model 8

audio 8

machinelearning 8

language-models 8

Programming languages (10)

MDX, C++, Shell, C, JavaScript, Perl, HTML, Jupyter Notebook

Github contributions (5)

explodinggradients/ragas

May 2023 - Apr 2025

Supercharge Your LLM Application Evaluations 🚀

Role in this project:

Back-end Developer

Contributions: 255 reviews, 582 PRs, 319 pushes in 1 year 10 months

Contributions summary: Shahul implemented a BertScore metric and added SBERT score calculation and relative imports in the `belar/metrics/similarity.py` file. He also added an EditScore metric with distance and ratio measures, a BLEU score, a Textual Entailment Score, and the Q-square metric, and he fixed device checks and reformatted imports.
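
To illustrate what "distance and ratio measures" can mean for an edit-based metric, here is a minimal sketch in plain Python. It is not the ragas implementation: the function names `edit_distance` and `edit_ratio` are hypothetical, and the ratio definition (one minus the Levenshtein distance divided by the longer string's length) is an assumption chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def edit_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; this normalization is an assumed convention."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest
```

The distance counts single-character insertions, deletions, and substitutions, while the ratio rescales it so that identical strings score 1.0 regardless of length.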

evaluation, llm, llmops

asteroid-team/torch-audiomentations

Mar 2022 - May 2022

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Role in this project:

ML Engineer

Contributions: 5 reviews, 52 commits, 4 PRs in 1 month

Contributions summary: Shahul primarily contributed to implementing and testing audio data augmentation techniques using PyTorch in the `torch-audiomentations` repository. He developed a `RandomCrop` augmentation, including the initial implementation, base class initialization, type conversions, and tests. He also added a `Padding` augmentation and contributed to a `SpliceOut` augmentation. The commits focus on extending the library's audio processing capabilities, particularly for deep learning applications.
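
As a rough picture of the general ideas behind random cropping and padding of audio, the sketch below operates on a plain list of samples. It is a generic illustration and not the `torch-audiomentations` API: the helper names `random_crop` and `pad_to_length`, their parameters, and the choice to pad at the end with a constant value are all assumptions.

```python
import random
from typing import List, Optional


def random_crop(
    samples: List[float],
    crop_length: int,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a contiguous window of crop_length samples at a random offset."""
    if crop_length < 0:
        raise ValueError("crop_length must be non-negative")
    rng = rng or random.Random()
    if crop_length >= len(samples):
        return list(samples)  # nothing to crop; return a copy
    start = rng.randint(0, len(samples) - crop_length)  # inclusive bounds
    return samples[start:start + crop_length]


def pad_to_length(
    samples: List[float],
    target_length: int,
    value: float = 0.0,
) -> List[float]:
    """Append a constant value until the signal has target_length samples."""
    missing = max(0, target_length - len(samples))
    return list(samples) + [value] * missing
```

Passing a seeded `random.Random` instance makes the crop offset reproducible, which is convenient when testing augmentations.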

python, differentiable-data-augmentation, audio-data, dsp, audio-effects
