Cheng-Yen (Wesley) Hsieh
CMU RI student | Machine Learning Research Scientist
I am a research scientist at ByteDance studying multi-modal language agents and embodied AI. I obtained my master's degree in computer vision at Carnegie Mellon University, where I was advised by Prof. Deva Ramanan. My research experience lies in machine learning (ML) and computer vision (CV), including topics such as self-supervised learning, amodal object tracking, and vision-language models. More specifically, my research centers on developing algorithms that enhance perceptual capabilities under challenging conditions, such as occlusion, by leveraging minimal supervision and multimodal information.
Prior to my master's studies, I received my B.S. from National Taiwan University. I had the pleasure of working on self-supervised representation learning with Prof. Yu-Chiang Frank Wang and on federated learning with Prof. An-Yeu (Andy) Wu.
news
Mar 4, 2024 | Joined ByteDance as a research scientist in multi-modal language agents and embodied AI. |
---|---|
May 29, 2023 | Joined Waymo as a Machine Learning Engineer Intern. |
Aug 22, 2022 | Joined CMU RI as a master's student in computer vision. |
selected publications
- In submission, Nov 2023. Our solution for unraveling occlusion scenarios for any object: amodal tracking.
- IEEE WACV, 2023. With this pre-training algorithm, one can easily adapt and fine-tune models for a variety of applications, including multi-label classification, object detection, and instance segmentation.
- IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP), 2022. Split learning (SL) for efficient image recognition through dimension-wise compression.
- IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021. Highly efficient image recognition under the federated learning (FL) scenario.