GAO YUE (ANNA)

Greater Toronto Area · Canada · gao12@ualberta.ca

I am a Senior Machine Learning Engineer at Keebo.AI, specializing in reinforcement learning for cloud data warehouse optimization. My expertise includes Reinforcement Learning, Machine Learning Theory, Natural Language Processing (NLP), and Large Language Models (LLMs).

Outside of my full-time job, I’m passionate about leveraging advanced AI to solve real-world challenges in healthcare, education, and the legal sector. My current focus is on AI-driven litigation applications. If you're interested in AI-driven ventures, I'd love to connect!

I earned my thesis-based Master's degree in Computing Science from the University of Alberta, where I was fortunate to be supervised by Csaba Szepesvári in Reinforcement Learning and Bandits and co-supervised by Or Sheffet in Differential Privacy. I have a strong passion for theoretical topics, and it's extremely inspiring to be part of AMII and RLAI, where brilliant ideas from exceptional scholars and students abound.

I earned my B.Sc. in Mathematics from The Chinese University of Hong Kong, an experience that was both intellectually enriching and personally transformative. Beyond acquiring a strong foundation in mathematics, I became more self-motivated, rigorous, and efficient, and I formed some of my most cherished friendships. I also had the privilege of being supervised by Professor Anthony So in machine learning research, an experience that solidified my decision to make machine learning my lifelong pursuit.

My Google Scholar Page

Education

University of Alberta - AMII/RLAI

Master of Science

Computing Science - Thesis Based

GPA : 3.94/4.0

Supervisor : Csaba Szepesvári

My Thesis Here : Improving Different Aspects in RL - Accelerating Convergence Rate & Enhancing Safety and Robustness

September 2018 - September 2021

University of California, Berkeley - Simons Institute

Visiting Student

Data Privacy: Foundations and Applications Program

2019

The Chinese University of Hong Kong

Bachelor of Science

Major : Mathematics (Enrichment Stream & Computational and Applied Mathematics Stream)

Minor : Computer Science

Awards

September 2014 - May 2018

University of Toronto

Non-Degree Exchange Programme

September 2016 - December 2016

Publications & Patents (* Indicates First Author)

Stable CDE Autoencoders with Acuity Regularization for Offline Reinforcement Learning in Sepsis Treatment

Yue Gao*

Accepted to IJCAI 2025 AI4TS

arXiv Code

Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets.

Yue Gao*, Pablo Hernandez Leal, and Yik Chau (Kry) Lui.

Filed a Patent. 2021.

Accepted to ICML 2021 RL4RealLife

arXiv Blog

Leveraging Non-uniformity in First-order Non-convex Optimization.

Jincheng Mei*, Yue Gao*, Bo Dai, Csaba Szepesvári, and Dale Schuurmans.

International Conference on Machine Learning (ICML), 2021.

PMLR

Private Approximations of a Convex Hull in Low Dimensions

Yue Gao*, Or Sheffet

Information-Theoretic Cryptography (ITC), 2021.

arXiv ITC 2021 Video

Robust and Efficient RL Methods Solving Capacitated and Time-Based Vehicle Routing Problem

Yue Gao*, Katrina Hooper*, Christophe Pennetier*

Filed a patent, 2024

Experience

Senior Machine Learning Engineer

Keebo.AI

Developed innovative reinforcement learning and bandit algorithms for cloud data warehouse optimization, significantly increasing warehouse savings.

Built a high-fidelity reward simulator to facilitate intelligent interactions between the RL agent and its environment; deployed it on Google Cloud Platform (GCP) to support robust offline RL training, improve iteration and inference speed, and ensure scalability across customer-specific scenarios.

Implemented a modern, microservice-based ML infrastructure on GCP to support the end-to-end RL model lifecycle: training, evaluation, CI/CD deployment, monitoring, and real-time inference at scale.

Led algorithm team stand-ups and retro planning, ensuring alignment and efficient execution of team goals.

Aug 2024 - Present

AI Research Scientist

Quincus

Spearheaded the transformation of complex supply chain route optimization challenges into robust RL/DL/ML models, leveraging CV and NLP techniques to enhance predictive accuracy and operational efficiency by over 30% on multiple metrics. Developed scalable pipelines and established industry-specific benchmarks.

Proposed and developed innovative, publication-ready reinforcement learning algorithms, incorporating novel variations of Q-learning and deep learning approaches.

Engineered and optimized robust end-to-end pipelines for multi-modal products, integrating diverse data types and systems to enhance product functionality and user engagement.

Led cross-functional teams in the strategic development and delivery of professional presentations and demos.

Dec 2021 - Aug 2024

Research Internship

BorealisAI

Proposed and validated 4 innovative algorithms for robust risk-sensitive reinforcement learning, tailored for trading markets, achieving a 40% increase in risk-adjusted returns compared to existing models.

Proposed an evaluation method for risk-sensitive trading agents using empirical game theory.

Conducted comprehensive simulations using a streamlined trading market model to evaluate our algorithms and demonstrate a 40% performance improvement over conventional methods.

Jan 2021 - July 2021

Research Assistant

AMII, University of Alberta

I worked with Professor Or Sheffet on differential privacy, an area I'm very enthusiastic about. Lying at the intersection of machine learning and theory, differential privacy is a mathematically rigorous notion of preserving privacy in data analysis.

I was also fortunate to work with Professor Csaba Szepesvári on reinforcement learning, which I believe has a bright future and is promising for many real-life applications.

Jan 2019 - Sep 2021

Undergraduate Research Assistant

SEEM, CUHK

I had the honor of being supervised by Professor Anthony So on Clustering Problems. Through this research, I found machine learning theory illuminating and made up my mind to take ML theory as my pursuit.

Jun 2017 - Sep 2017

Team Assistant

Boston Consulting Group (Beijing)

Aug 2016 - Sep 2016

Skills

Programming Languages & Tools

I have extensive experience developing end-to-end pipelines, and I'm proficient in the following languages and tools:

Programming Languages/ Web Frameworks: Python, SQL, Golang, React, Java, JavaScript

Machine Learning: PyTorch, TensorFlow, Transformers, NumPy, Pandas, scikit-learn, OpenCV, LightGBM, XGBoost, Keras, Hugging Face, NLTK, LSTM, LangChain, LlamaIndex, Streamlit, Taipy, DSPy, Ragas

DevOps & Cloud: GCP, AWS, Docker, Kubernetes, Helm

Research Focus in AI/ML: Reinforcement Learning; Machine Learning; Differential Privacy; Deep Learning; Bandits; NLP; LLMs

Languages

Mandarin : Native

English : Full Professional Proficiency

Cantonese : Entry Level

Hobbies

Music

I enjoy singing and playing the piano, and I'm also a beginner guitarist. I recently joined a band as keyboard player and backing vocalist.

Travelling

I'm a big fan of world geography. I enjoy experiencing different cultures and exploring local attractions, food, and especially museums.

I've been to many countries, including Australia, France, Italy, Vatican City, and the Maldives. Here's my visited map.

Aviation

I'm an absolute aviation enthusiast. I enjoy flying, collecting information about each flight, watching aviation documentaries, and especially taking aerial pictures.

Here are some of my aerial photographs.

Here are some of my self-made flightlogs.

Reading

Reading is an inseparable part of my life, and I've been particularly fascinated by philosophy and psychology books since I was a teenager. Reading academic papers in ML/RL is also a source of joy for me. My blogs include some of my reading summaries of academic papers.

Sky and Nature Photography

I'm not a professional photographer; in fact, I'm just a beginner. But I have a huge passion for photographing all kinds of skies and natural scenery.

Here is my sky and nature album.