Cheng-Yen (Wesley) Hsieh

CMU RI student | Machine Learning Research Scientist

I am a research scientist at ByteDance Research, based in San Jose.

My research covers machine learning and computer vision. My current research focuses on advancing AI for scientific discovery. Specifically, I develop large-scale multi-modal diffusion language models to tackle protein modeling. My earlier work also explored foundational research areas, including self-supervised learning, amodal object tracking, federated learning, and vision language models.

I was advised by Prof. Deva Ramanan during my Master of Science in Computer Vision at Carnegie Mellon University. I received my B.S. from National Taiwan University, where I had the pleasure of working with Prof. Yu-Chiang Frank Wang and Prof. An-Yeu (Andy) Wu.





Github / Google Scholar / chengyenhsieh0806@gmail.com

news

May 1, 2024 DPLM-2.1 was accepted as a Spotlight at ICML 2025!
Apr 16, 2024 Launched the official page of our DPLM series.
Mar 4, 2024 Joined ByteDance as an AI research scientist.
May 29, 2023 Joined Waymo as a Machine Learning Engineer Intern.
Aug 22, 2022 Joined CMU RI as a master's student in computer vision.

selected publications

  1. Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, and Quanquan Gu
    ICML, 2025 (Spotlight)
    Design choices are essential: Our designs enable the 650M multimodal PLM to outperform 3B-scale baselines and specialized structure folding models.
  2. Cheng-Yen Hsieh, Kaihua Chen, Achal Dave, Tarasha Khurana, and Deva Ramanan
    arXiv preprint, Nov 2023
    Our solution to unravel occlusion scenarios for any object—amodal tracking.
  3. Cheng-Yen Hsieh, Chih-Jung Chang, Fu-En Yang, and Yu-Chiang Frank Wang
    IEEE WACV, 2023
    With this pre-training algorithm, one can easily adapt and fine-tune the models for a variety of applications, including multi-label classification, object detection, and instance segmentation.
  4. Cheng-Yen Hsieh, Yu-Chuan Chuang, and An-Yeu (Andy) Wu
    IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP), 2022
    Split Learning (SL) for efficient image recognition through dimension-wise compression.
  5. Cheng-Yen Hsieh, Yu-Chuan Chuang, and An-Yeu (Andy) Wu
    IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021
    Highly efficient image recognition under the federated learning (FL) scenario.