Cheng-Yen (Wesley) Hsieh
CMU RI student | Machine Learning Research Scientist
I am a research scientist at ByteDance Research, based in San Jose.
My research covers machine learning and computer vision. My current research focuses on advancing AI for scientific discovery. Specifically, I develop large-scale multi-modal diffusion language models to tackle protein modeling. My earlier work explored foundational research areas, including self-supervised learning, amodal object tracking, federated learning, and vision-language models.
I was advised by Prof. Deva Ramanan during my Master of Science in Computer Vision at Carnegie Mellon University. I received my B.S. from National Taiwan University, where I had the pleasure of working with Prof. Yu-Chiang Frank Wang and Prof. An-Yeu (Andy) Wu.
news
May 1, 2024 | DPLM-2.1 is accepted as a Spotlight at ICML 2025! |
---|---|
Apr 16, 2024 | Launched the official page of our DPLM series. |
Mar 4, 2024 | Joined ByteDance as an AI research scientist. |
May 29, 2023 | Joined Waymo as a Machine Learning Engineer Intern. |
Aug 22, 2022 | Joined CMU RI as a master's student in computer vision. |
selected publications
- ICML, 2025 (Spotlight). Design choices are essential: our designs enable the 650M multimodal PLM to outperform 3B-scale baselines and specialized structure folding models.
- arXiv preprint, Nov 2023. Our solution to unravel occlusion scenarios for any object: amodal tracking.
- IEEE WACV, 2023. With this pre-training algorithm, one can easily adapt and fine-tune the models for a variety of applications, including multi-label classification, object detection, and instance segmentation.
- IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP), 2022. Split Learning (SL) for efficient image recognition through dimension-wise compression.
- IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021. Highly efficient image recognition under the federated learning (FL) scenario.