Gang Li

I am a Ph.D. candidate at the Institute of Software, Chinese Academy of Sciences (CAS), working on self-supervised learning and diffusion models, and I expect to receive my Ph.D. degree in June 2025. Previously, I worked as an algorithm intern at Kuaishou MMU and JD Explore Academy. I am pleased to announce that I will be joining Honor’s MagicOS division as a Senior Computer Vision Algorithm Engineer on the Smart Vision Platform team, where my primary focus will be research and development in multimodal understanding and AIGC (AI-Generated Content) technologies. Our team is currently seeking interns and full-time researchers/engineers. If you are interested in opportunities in these fields, please feel free to send your resume to me at ucasligang[at]gmail[dot]com.

My current research interests center on multimodal understanding and AIGC (Artificial Intelligence Generated Content). Specific research interests include, but are not limited to:

Large multimodal models.

Controllable generative diffusion models.

Efficient training and sampling of generative diffusion models.

Before this, I focused on Masked Image Modeling (MIM) and Vision Transformers (ViTs). We also maintain a GitHub repository with a popular (300+ stars) paper list on MIM, available at awesome-MIM.

Curriculum Vitae

Time | Title | Institution | Research Direction
2019.09 - Now | Ph.D. Candidate | Institute of Software, CAS | Self-supervised Learning, Diffusion Models
2015.09 - 2019.06 | Bachelor | HeNan University | Network Engineering

Selected Papers [Full paper list at Google Scholar]

Preprints

  • FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models [pdf] [Project page]
    Feihong He, Gang Li, Mengyuan Zhang, Leilei Yan, Lingyu Si, Fanzhang Li.
    arXiv:2401.15636.

  • 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models. [pdf] [Project page]
    Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao.
    arXiv:2211.14108.

Publications

  • SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders. [pdf] [code]
    Gang Li, Heliang Zheng, Daqing Liu, Chaoyue Wang, Bing Su, Changwen Zheng.
    Neural Information Processing Systems (NeurIPS), 2022. CCF-A.

  • DreamBooth++: Boosting Subject-Driven Generation via Region-Level References Packing. [pdf]
    Zhongyi Fan, Zixin Yin, Gang Li, Yibing Zhan, Heliang Zheng.
    Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM), 2024. CCF-A.

  • SimViT: Exploring a Simple Vision Transformer with Sliding Windows. [pdf] [code]
    Gang Li, Di Xu, Xing Cheng, Lingyu Si, Changwen Zheng.
    IEEE International Conference on Multimedia and Expo (ICME), 2022. CCF-B Oral.

  • Transductive distribution calibration for few-shot learning [pdf]
    Gang Li, Changwen Zheng, Bing Su.
    Neurocomputing, 2022. CCF-C, SCI.

  • CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models. [pdf] [Project page] [Code]
    Feihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li.
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024. CCF-B.

Honors and Awards

  • First-class academic scholarship of Chinese Academy of Sciences, 2024.
  • Merit student of University of Chinese Academy of Sciences (UCAS), 2023, 2024.
  • Outstanding member of the Communist Youth League of the University of Chinese Academy of Sciences (UCAS), 2020.
  • National Encouragement Scholarship, 2018.
  • National Scholarship for undergraduate students, 2017.