Cheng-Yen (Wesley) Hsieh

CMU RI | Machine Learning Research Scientist

I’m a Senior Research Scientist at ByteDance Seed, based in San Jose. My research spans the fields of machine learning, computer vision, and AI for scientific discovery, with a core focus on developing large-scale generative foundation models. I specialize in leveraging diffusion and language models to advance a wide range of applications, from biomolecular modeling to video generation.

At ByteDance Seed, I co-lead the development of generative biomolecular foundation models, including the DPLM and PAR series. My work covers the full model lifecycle, from large-scale pre-training to mid-training strategies that unify protein sequences with 3D structural data. I have also worked on video generation, building 3D-consistent Diffusion Transformers to improve temporal and spatial coherence in video synthesis.

My earlier research explored foundational areas of representation learning and perception. This includes developing self-supervised pyramid learning for visual analysis, advancing amodal object tracking through the TAO-Amodal benchmark, and investigating vision-language models. I also worked on distributed ML systems, having published work on federated learning and communication-efficient split learning.

I earned my Master of Science in Computer Vision (MSCV) from Carnegie Mellon University, advised by Prof. Deva Ramanan. I received my B.S. in Electrical Engineering from National Taiwan University, where I had the pleasure of working with Prof. Yu-Chiang Frank Wang and Prof. An-Yeu (Andy) Wu.

GitHub / Google Scholar / chengyenhsieh0806@gmail.com

news

May 1, 2025 DPLM-2.1 is accepted as a Spotlight at ICML 2025!
Apr 16, 2024 Launched the official page of our DPLM series.
Mar 4, 2024 Joined ByteDance Seed as an AI Research Scientist.
May 29, 2023 Joined Waymo as a Machine Learning Engineer Intern.
Aug 22, 2022 Joined CMU RI as a master's student in computer vision.

selected publications

  1. Yanru Qu*, Cheng-Yen Hsieh*†, Zaixiang Zheng, Ge Liu, and Quanquan Gu
    ByteDance Seed Tech Report, 2026
    * Equal contribution; † Project lead.
  2. Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, and Quanquan Gu
    ICML, 2025 (Spotlight, Top 2.6% of submissions)
    Design choices are essential: Our designs enable the 650M multimodal PLM to outperform 3B-scale baselines and specialized structure folding models.
Cheng-Yen Hsieh, Kaihua Chen, Achal Dave, Tarasha Khurana, and Deva Ramanan
    arXiv preprint, Nov 2023
Our solution for handling occlusion scenarios for any object through amodal tracking.
  4. Cheng-Yen Hsieh, Chih-Jung Chang, Fu-En Yang, and Yu-Chiang Frank Wang
    IEEE WACV, 2023
With this pre-training algorithm, one can easily adapt and fine-tune the models for a variety of applications, including multi-label classification, object detection, and instance segmentation.
  5. Cheng-Yen Hsieh, Yu-Chuan Chuang, and An-Yeu (Andy) Wu
    IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP), 2022
    Split Learning (SL) for efficient image recognition through dimension-wise compression.
  6. Cheng-Yen Hsieh, Yu-Chuan Chuang, and An-Yeu (Andy) Wu
    IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021
Highly efficient image recognition under the federated learning (FL) scenario.