I am a Senior Machine Learning Engineer at Keebo.AI, specializing in reinforcement learning for cloud data warehouse optimization. My expertise includes Reinforcement Learning, Machine Learning Theory, Natural Language Processing (NLP), and Large Language Models (LLMs).
Outside of my full-time job, I’m passionate about leveraging advanced AI to solve real-world challenges in healthcare, education, and the legal sector. My current focus is on AI-driven litigation applications. If you're interested in AI-driven ventures, I'd love to connect!
I earned my thesis-based Master's degree from the Department of Computing Science at the University of Alberta, where I was fortunate to be supervised by Csaba Szepesvári in Reinforcement Learning and Bandits, and co-supervised by Or Sheffet in Differential Privacy. I have a strong passion for theoretical topics, and it's extremely inspiring to be part of AMII and RLAI, where brilliant ideas from exceptional scholars and students abound.
I earned my B.Sc. in Mathematics from The Chinese University of Hong Kong, an experience that was both intellectually enriching and personally transformative. Beyond acquiring a strong foundation in mathematics, I became more self-motivated, rigorous, and efficient, and I formed some of my most cherished friendships. I also had the privilege of being supervised by Professor Anthony So in machine learning research. That experience solidified my decision to make machine learning my lifelong pursuit.
Major: Mathematics (Enrichment Stream & Computational and Applied Mathematics Stream)
Minor: Computer Science
Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets.
Yue Gao*, Pablo Hernandez Leal, and Yik Chau (Kry) Lui.
Filed a Patent. 2021.
Accepted to ICML 2021 RL4RealLife
arXiv Blog
Leveraging Non-uniformity in First-order Non-convex Optimization.
Jincheng Mei*, Yue Gao*, Bo Dai, Csaba Szepesvári, and Dale Schuurmans.
International Conference on Machine Learning (ICML), 2021.
PMLR
Developed innovative reinforcement learning and bandit algorithms for cloud data warehouse optimization, significantly increasing warehouse savings.
Designed an optimized effective reward function for the reinforcement learning (RL) algorithm, enabling faster convergence and improved optimization. Built a reward simulator to facilitate intelligent interactions between the RL model and its environment.
Engineered and optimized an ML training pipeline on Google Cloud Platform (GCP), improving model training efficiency and scalability.
Led algorithm team stand-ups and strategic planning, ensuring alignment and efficient execution of team goals.
Spearheaded the transformation of complex supply chain route optimization challenges into robust RL/DL/ML models, leveraging CV and NLP techniques to enhance predictive accuracy and operational efficiency by over 30% on multiple metrics. Developed scalable pipelines and established industry-specific benchmarks.
Proposed and developed paper-ready innovative reinforcement learning algorithms, incorporating novel variations of Q-Learning and deep learning approaches.
Engineered and optimized robust end-to-end pipelines for multi-modal products, integrating diverse data types and systems to enhance product functionality and user engagement.
Led cross-functional teams in the strategic development and delivery of professional presentations and demos.
Proposed and validated 4 innovative algorithms for robust risk-sensitive reinforcement learning, tailored for trading markets, achieving a 40% increase in risk-adjusted returns compared to existing models.
Proposed an evaluation method for risk-sensitive trading agents using empirical game theory.
Conducted comprehensive simulations using a streamlined trading market model to evaluate and demonstrate a 40% superior performance of our algorithms over conventional methods.
I worked with Professor Or Sheffet on differential privacy, and I'm very enthusiastic about this area. Lying at the intersection of machine learning and theory, differential privacy is a mathematically rigorous notion of preserving privacy in data analysis.
I'm also fortunate to work with Professor Csaba Szepesvári on Reinforcement Learning, which I believe has a bright future and is promising in many real-life applications.
I had the honor of being supervised by Professor Anthony So on Clustering Problems. Through this research, I found machine learning theory illuminating and made up my mind to take ML theory as my pursuit.
I enjoy singing and playing piano, and I'm also a guitar beginner. Recently I became a member of a band as keyboard player and sub-vocalist.
I'm a big fan of world geography. I enjoy experiencing different cultures, exploring local attractions, food, and especially, the museums.
I've been to a number of countries, including Australia, France, Italy, Vatican City, and the Maldives. Here's my visited map.
I'm an absolute aviation enthusiast. I enjoy flying, collecting information about each flight, watching aviation documentaries, and especially taking aerial pictures.
Here are some of my aerial photographs.
Here are some of my self-made flightlogs.
Reading is an inseparable part of my life, and I have been particularly fascinated by philosophy and psychology books since I was a teenager. Reading academic papers in ML/RL is also a source of joy for me. My blogs include some of my reading summaries of academic papers.
I'm not a professional photographer; in fact, I'm just a beginner. But I have a huge passion for photographing all kinds of skies and natural scenery.
Here is my sky and nature album.