Zeyao Du is an NLP Engineer based in Shanghai with eight years of experience building and deploying language models, search systems, and production NLP pipelines. He is the author of the 7k‑star GPT2-Chinese repository, where he adapted GPT-2 for Chinese by migrating code to the transformers library and improving tokenization, training, and generation workflows. Zeyao has driven LLM and NLP efforts at Shopee, 阅文集团 and Weibo, bridging research prototypes to production services at scale. With a mathematics and statistics degree from Zhejiang University, he combines rigorous quantitative thinking with a hands‑on focus on tokenizer and generation pipeline engineering.
9 years of coding experience
5 years of employment as a software developer
Nanjing Foreign Language School
Bachelor's degree, Mathematics and Statistics, Zhejiang University
Chinese version of GPT2 training code, using BERT tokenizer.
Role in this project:
ML Engineer
Contributions: 45 commits, 21 PRs, 305 pushes in 1 year 9 months
Contributions summary: Zeyao primarily contributed to the training and generation aspects of the GPT-2 Chinese model. His work involved updating existing code to use the `transformers` library, modifying training scripts, and fixing bugs in the generation process. He also changed tokenization scripts and configuration files, suggesting a hands-on role in adapting the model and its components for Chinese.
GPT2 training script for Chinese in Tensorflow 2.0
Contributions: 22 commits, 21 pushes, 1 branch in 9 months
Tags: nlp, chinese, tensorflow-2-0, training, tensorflow