GAO YUE (ANNA)

Greater Toronto Area · Canada gao12@ualberta.ca

I am a Senior Machine Learning Engineer at GPTZero, focusing on AI (LLM) detection model improvement. My expertise includes Reinforcement Learning, Machine Learning Theory, Natural Language Processing (NLP), and Large Language Models (LLMs).

I earned my thesis-based Master's degree in Computing Science from the University of Alberta, where I was fortunate to be supervised by Csaba Szepesvári in Reinforcement Learning and Bandits. It was extremely inspiring to be part of AMII and RLAI, where brilliant ideas from exceptional scholars and students abound.

I earned my B.Sc. in Mathematics from The Chinese University of Hong Kong. I was fortunate to be supervised by Anthony So in machine learning research. That experience solidified my decision to make machine learning my lifelong pursuit.

Education

University of Alberta - AMII ; RLAI Lab

Master of Science

Computing Science - Thesis Based

GPA : 3.94/4.0

Supervisor : Csaba Szepesvári

Thesis : Improving Different Aspects in RL - Accelerating Convergence Rate & Enhancing Safety and Robustness

2018 - 2021

University of California, Berkeley - Simons Institute

Visiting Postgraduate Student

Attended the Data Privacy: Foundations and Applications program under the supervision of Or Sheffet

2019

The Chinese University of Hong Kong

Bachelor of Science

Major : Mathematics (Double streams in pure math & applied math)

Minor : Computer Science

Awards and Scholarships

2014 - 2018

University of Toronto

Non-Degree Exchange Programme

September 2016 - December 2016

Publications & Patents (* Indicates First Author)

Stable CDE Autoencoders with Acuity Regularization for Offline Reinforcement Learning in Sepsis Treatment

Yue Gao*

Accepted to IJCAI2025 AI4TS

Published in Transactions on Artificial Intelligence, 2025

Paper Code

Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets.

Yue Gao*, Pablo Hernandez Leal, and Yik Chau (Kry) Lui.

Filed a patent, 2021

Accepted to ICML 2021 RL4RealLife

arXiv Blog

Leveraging Non-uniformity in First-order Non-convex Optimization.

Jincheng Mei*, Yue Gao*, Bo Dai, Csaba Szepesvári, and Dale Schuurmans.

International Conference on Machine Learning (ICML), 2021.

PMLR

Private Approximations of a Convex Hull in Low Dimensions

Yue Gao*, Or Sheffet

Information-Theoretic Cryptography (ITC), 2021.

arXiv ITC 2021 Video

Robust and Efficient RL Methods Solving Capacitated and Time-Based Vehicle Routing Problem

Yue Gao*, Katrina Hooper*, Christophe Pennetier*

Filed a patent, 2024

Experience

Senior Machine Learning Engineer

GPTZero, Toronto

Improving the AI (LLM) detection model.

Feb 2026 - Present

Senior Machine Learning Engineer

Keebo.AI

Designed and deployed a production-scale reinforcement learning optimization system for 2,000+ cloud data warehouses used by top-tier Snowflake enterprise customers, establishing state-of-the-art performance in the cloud warehouse optimization industry and improving average cost savings from 8% to 16% while maintaining strict latency SLAs.

Developed a novel reward model and policy learning strategy that significantly improved convergence stability and policy generalization in large-scale offline reinforcement learning, enabling robust deployment across real-world warehouses.

Engineered large-scale offline RL pipelines on GCP (BigQuery, GKE, Trainer Jobs, Airflow) with distributed training and automated evaluation, combining robust theory with reliable production execution.

Developed an adaptive RL aggressiveness control DAG using Airflow, dynamically adjusting model aggressiveness per warehouse based on live production metrics. Improved savings-performance tradeoff through automated, feedback-driven hyperparameter adaptation.

Designed one of the industry's first query routing models (80%+ accuracy) to power Keebo's intelligent query routing, which dynamically directs queries to optimal warehouses. Built the full research pipeline from schema-aware tokenization to UMAP-based embedding diagnostics.

Led ML team sprint planning and internal technical education, delivering presentations on RL fundamentals, MLOps best practices, and architecture deep dives to the broader engineering org.

Aug 2024 - Feb 2026

AI Research Scientist

Quincus

Spearheaded the transformation of complex supply chain route optimization challenges into robust RL/DL/ML models, leveraging CV and NLP techniques to enhance predictive accuracy and operational efficiency by over 30% on multiple metrics. Developed scalable pipelines and established industry-specific benchmarks.

Proposed and developed paper-ready innovative reinforcement learning algorithms, incorporating novel variations of Q-Learning and deep learning approaches.

Engineered and optimized robust end-to-end pipelines for multi-modal products, integrating diverse data types and systems to enhance product functionality and user engagement.

Led cross-functional teams in the strategic development and delivery of professional presentations and demos.

Dec 2021 - Aug 2024

Research Internship

BorealisAI

Proposed and validated 4 innovative algorithms for robust risk-sensitive reinforcement learning, tailored for trading markets, achieving a 40% increase in risk-adjusted returns compared to existing models.

Proposed an evaluation method for risk-sensitive trading agents using empirical game theory.

Conducted comprehensive simulations using a streamlined trading market model, demonstrating that our algorithms outperform conventional methods by 40%.

Jan 2021 - July 2021

Research Assistant

AMII, University of Alberta

I worked with Professor Or Sheffet on differential privacy. Lying at the intersection of machine learning and computational theory, differential privacy is a mathematically rigorous notion of preserving privacy in data analysis.

I was also fortunate to work with Professor Csaba Szepesvári on Reinforcement Learning, which I believe is the correct path towards AGI.

Jan 2019 - Sep 2021

Undergraduate Research Assistant

SEEM, CUHK

I was fortunate to be supervised by Professor Anthony So on Clustering Problems. Through this research, I found machine learning theory illuminating and made up my mind to take ML theory as my pursuit.

Jun 2017 - Sep 2017

Internship

Boston Consulting Group (Beijing)

Aug 2016 - Sep 2016

Skills

Core Expertise

Reinforcement Learning, Large Language Models, Machine Learning Systems, Deep Learning, Scalable AI Infrastructure, Agentic AI

Technical Skills

Programming: Python, Golang, SQL, Java, JavaScript, React
Machine Learning & Deep Learning: PyTorch, TensorFlow, Transformers, Hugging Face, JAX, Triton, NumPy, Pandas, Scikit-learn, OpenCV, LightGBM, XGBoost, Keras
Distributed Systems & MLOps: GCP (GKE, VertexAI, BigQuery, Pub/Sub), AWS (SageMaker, EC2), Airflow, Docker, Kubernetes, Terraform, Spark, Helm, CI/CD, Blue-Green Deployment
Databases & Query Engines: BigQuery, PostgreSQL, VecDB
Performance & GPU Optimization: NVIDIA CUDA, NVTX, Triton Inference Server, Distributed Training, Mixed Precision, Profiling
Large Language Models & Agents: Pre-training and Fine-tuning, RLHF/DPO, LangChain, LlamaIndex, DSPy, Ragas, RAG, Evaluation Frameworks
Reinforcement Learning: Offline RL, Contextual Bandits, Multi-Agent RL, Risk-Sensitive RL, Policy Gradient Methods, Reward Design

Languages

Mandarin: Native
English: Full Professional Proficiency
Cantonese: Entry Level

Hobbies

Music

I enjoy singing and playing piano, and I'm also a beginner on guitar and drum kit.

Travelling

I'm a big fan of world geography. I enjoy experiencing different cultures and exploring local attractions and museums.

Here's my visited map.

Aviation

I'm an absolute aviation enthusiast. I enjoy flying, meticulously logging details of each flight, watching aviation documentaries, and especially capturing aerial photography. I'm particularly interested in commercial aircraft: I enjoy learning about different models and their specifications, and being able to identify them.

Here are some of my self-made flight logs.

Reading

Reading is central to my life. I read broadly across philosophy, psychology, science, physics, sociology, and economics. Nowadays, most of my reading time goes to ML/RL research papers, which I find deeply engaging. My blog features reading summaries and reflections on academic papers.

Sky and Nature Photography

I have a huge passion for photographing all kinds of skies and natural scenery.

Here is my sky and nature album.