Shirshanka Das is a seasoned software leader and Co-Founder & CTO in the San Francisco Bay Area with 13 years of experience building large-scale data and infrastructure systems. He co-founded DataHub and Acryl Data and previously served as Principal Staff Software Engineer at LinkedIn, where he architected GDPR strategy and helped build core systems like Databus and Espresso. An active open-source committer on Apache Gobblin and a key contributor to DataHub, his GitHub work spans Kafka integration, schema registry support, multi-MCE ingestion, structured properties, and column-level lineage for metadata pipelines. Trained at IIT Delhi and UCLA, he blends academic rigor with pragmatic engineering and product sense. Raised in a sleepy town in Bihar, he brings a mathematician’s precision to solving practical, production-scale data problems.
Contributions: 32 releases, 2497 reviews, 940 commits in 2 years 4 months
Contributions summary: Shirshanka focused primarily on back-end development, enhancing the metadata ingestion pipeline. He added support for processing multiple MCEs (Metadata Change Events) in a single file and fixed unit tests. He also implemented support for structured properties, improved the database connection for the project's backend, and added column-level lineage.
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Role in this project:
Back-end Developer
Contributions: 22 reviews, 59 commits, 90 PRs in 6 years
Contributions summary: Shirshanka focused primarily on enhancing the Gobblin data integration framework, implementing new features and improving existing ones. His work includes adding support for Kafka writers, which involved creating and modifying Java files to integrate with Kafka schema registries, including the LiKafkaSchemaRegistry. He also refactored and improved the code around the Hadoop file system helper classes and added a simple console writer.
data, dcos, data-stream, big-data-integration, batch-data