Claire Mcginty

Senior Data Engineer

New York, New York, United States

Summary

🤩 Rockstar · 🎓 Top School
Claire Mcginty is a Senior Data Engineer in New York with 12 years of backend and data-infrastructure experience, currently developing and maintaining Spotify's open-source Scio tooling. She specializes in large-scale batch and streaming systems on Google Cloud (Dataflow, Pub/Sub), with deep contributions to Apache Beam, Parquet, and Luigi that include Pub/Sub readers, Avro/Parquet schema fixes, and Dataflow job/task integrations. At LinkedIn she built recommendation pipelines using Kafka, Samza, and Hadoop, giving her strong production experience in real-time content relevance. A UC Berkeley EECS graduate who often tackles tricky I/O and schema conversion edge cases, she blends pragmatic engineering with active open-source stewardship.
12 years of coding experience
4 years of employment as a software developer
Bachelor of Science (B.S.), Electrical Engineering and Computer Science, University of California, Berkeley

Stack Overflow Stats

46 reputation · 677 reached · 1 answer · 0 questions

Github Skills (36)

avro (10)
google-cloud-platform (10)
conversions (10)
apache2 (10)
data-conversion (10)
dataflow-programming (10)
bigquery (10)
data-pipelines (10)
dataflow (10)
python (10)
schemaorg (10)
data-engineering (10)
data-serialization (10)
luigi (10)
hadoop (10)

Programming languages (6)

Java, Scala, Go, HTML, Ruby, Python

Github contributions (5)

spotify/scio

Mar 2018 - Jan 2023

A Scala API for Apache Beam and Google Cloud Dataflow.
Role in this project: Back-end Developer & Data Engineer
Contributions: 9 releases, 533 reviews, 240 commits in 4 years 11 months
Contributions summary: Claire primarily contributed to back-end and data-engineering tasks. She modified the default cancelJob value, added examples of streaming jobs with refreshing side inputs, and integrated a new system for publishing benchmarking results to Datastore. She also improved the build process and kept dependencies up to date.
api · data · beam · batch · bigquery
apache/parquet-java

Apr 2023 - Feb 2025

Apache Parquet Java
Role in this project: Back-end Developer
Contributions: 73 reviews, 14 PRs, 89 comments in 1 year 10 months
Contributions summary: Claire primarily worked on the `parquet-avro` module, focusing on improvements and fixes to Avro integration with Parquet. Her contributions included handling Avro schema conversion, fixing projections for repeated record types, and supporting extra-metadata configuration for the Parquet writer. She also implemented support for non-grouped repeated fields in the Avro schema converter and improved logical-type conversions across Avro versions.
avro · parquet · apache · big-data · apache-parquet