Last updated: Nov 2024
Current focus: Multimodal Machine Learning & LLM / Robotics / Applied ML / On-device ML / Computer Vision
To recruiters: I usually do not respond to cold outreach. If I have read your message but not replied, I am not interested.
To people asking for referrals: I only refer people I am familiar with.
How to connect with me: If you are looking for coaching, career advice, or a connection for future collaboration, please state your purpose clearly in a message via email or LinkedIn. Connection requests on social media without a message or referral will be ignored.
I obtained my B.Eng. in Computer Science at the University of Hong Kong and my M.S. in Computer Vision at Carnegie Mellon University. My undergraduate and graduate research focused on Artificial Intelligence, Computer Vision, Deep Learning, and Computer Graphics, with papers published at SIGGRAPH, IEEE GRSL (featured on the cover, 2018), and ACCV (Best Application Paper Award, 2018).
From early 2020 to late 2024, I was a Tech Lead and Senior Software Engineer on Waymo's Perception team, shipping multimodal spatiotemporal algorithms and machine learning models to both the autonomous vehicles and the remote cloud services. I also led a part of the Perception system that uses large vision-language models (VLMs) to improve Waymo's human-robot interaction and safety. I received 10+ spot bonuses for launching multiple impactful products and for being a strong, supportive team player.
In late 2024, I joined Apple, where I have been researching, developing, and deploying multimodal large language models (MLLMs) across Apple services and devices as part of Apple Intelligence.
Although I mostly write code for closed-source projects (400k+ lines of code at Waymo), I also enjoy publishing open-source code on GitHub, such as nanoGPT.jax and nanoDiffusion. I have reviewed papers for AI conferences including AAAI, WACV, and BMVC. I also offer free LeetCode coaching.