Education
University of Waterloo
Experience
HeyGen
May 2025 – Aug 2025
- Optimized an AI video generation model by integrating SageAttention into the PyTorch pipeline; applied outlier smoothing and INT4 quantization to cut end-to-end latency by 10%.
- Engineered logging and profiling utilities to expose metrics on memory usage and inference bottlenecks.
- Architected a storage migration from AWS S3 to Cloudflare R2 for model checkpoints, reducing storage costs by 35%.
Huawei
Jan 2025 – Apr 2025
- Benchmarked Microsoft's BitNet against llama.cpp and llama3.c, conducting top-down microarchitecture analysis to isolate compute/memory bottlenecks.
- Designed a custom C++ data prefetcher with page boundary awareness, resulting in a 6% speedup in mobile inference benchmarks.
- Proposed a specialized kernel for 1-bit LLMs supporting fine-grained structured sparsity, reducing MatMul overhead by 20%.
Tactic Studios
May 2024 – Aug 2024
- Developed 20+ responsive, data-driven UI modules in Java for the RPG title "Killer Inn," published by Square Enix.
- Engineered "Expression Resources," a core engine tool enabling dynamic object referencing, decoupling game logic from asset data and accelerating iteration cycles.
Besty AI
Sep 2023 – Dec 2023
- Integrated GPT-4 into automated upselling workflows for rental property hosts, generating over $300 in additional weekly revenue per user post-launch.
- Built a real-time product analytics dashboard in React.js that visualizes 500+ daily interactions, with Llama 2 classifying guest intent.
- Optimized backend performance by deploying Node.js workers for SQL preprocessing, holding API response times to 100 ms or less.
Behaviour Interactive
Jan 2023 – Apr 2023
- Spearheaded gameplay development of the "Dead by Daylight" 7th Anniversary update in C++, collaborating with 20+ engineers.
- Created an object highlighting system supporting numerous shader properties and events on game objects.
- Used Unreal Engine's network replication system to keep gameplay stable under 200 ms latency and 2% packet loss.
Projects
ThermaLM
Thermal-aware LLM inference scheduler that dynamically adapts its configuration to device thermal and battery state. Reduces power draw by 10% and extends battery life by 14% on a Pixel 8 via KV-cache quantization, flash attention, and sparsity.
Sensor-Driven Violin Performance Capture
Augmented violin using a Raspberry Pi to capture high-fidelity telemetry (bow pressure, position, and pitch) with >85% detection accuracy. Curated a novel dataset of expressive performance metrics from ten violinists.
ShopBot
Multimodal AI shopping assistant supporting text, voice, and visual search. Uses LangGraph for stateful agent orchestration, CLIP + FAISS for local visual similarity, and Gemini for reasoning, TTS, and STT.
Tuesday Night Tempo
A rhythm game that uses a real drum set as the controller. Built in Unity with custom input handling to map live drum hits to in-game note lanes.