Gang Li (李港)

I am a Ph.D. candidate at the Institute of Software, Chinese Academy of Sciences (CAS), working on self-supervised learning and diffusion models, and I expect to receive my Ph.D. in June 2025. Previously, I worked as an algorithm intern at Kuaishou MMU and JD Explore Academy. I am pleased to announce that I will be joining Honor’s MagicOS division as a Senior Computer Vision Algorithm Engineer on the Smart Vision Platform team, where my primary focus will be research and development in multimodal understanding and AIGC (AI-Generated Content) technologies. Our team is currently seeking interns and full-time researchers and engineers. If you are interested in opportunities in these fields, please send your resume to ucasligang[at]gmail[dot]com.

My current research interests center on multimodal understanding and AIGC. Specific interests include, but are not limited to:

Large multi-modal models.

Controllable generative diffusion models.

Efficient training and sampling of generative diffusion models.

Before this, I focused on Masked Image Modeling (MIM) and Vision Transformers (ViTs). We also maintain a popular GitHub repository (300+ stars) with a curated paper list on MIM. The link is available at awesome-MIM.

Curriculum Vitae

Time	Title	Institution	Research Direction
2019.09 - Present	Ph.D. Candidate	Institute of Software, CAS	Self-supervised Learning, Diffusion Models
2015.09 - 2019.06	Bachelor	Henan University	Network Engineering

Selected Papers [Full paper list at Google Scholar]

Preprint

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models [pdf] [Project page]
Feihong He, Gang Li, Mengyuan Zhang, Leilei Yan, Lingyu Si, Fanzhang Li.
arXiv:2401.15636.
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models [pdf] [Project page]
Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao.
arXiv:2211.14108.

Publication

SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders [pdf] [code]
Gang Li, Heliang Zheng, Daqing Liu, Chaoyue Wang, Bing Su, Changwen Zheng.
Neural Information Processing Systems (NeurIPS), 2022. CCF-A.
DreamBooth++: Boosting Subject-Driven Generation via Region-Level References Packing [pdf]
Zhongyi Fan, Zixin Yin, Gang Li, Yibing Zhan, Heliang Zheng.
Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM), 2024. CCF-A.
SimViT: Exploring a Simple Vision Transformer with Sliding Windows [pdf] [code]
Gang Li, Di Xu, Xing Cheng, Lingyu Si, Changwen Zheng.
IEEE International Conference on Multimedia and Expo (ICME), 2022. CCF-B, Oral.
Transductive Distribution Calibration for Few-Shot Learning [pdf]
Gang Li, Changwen Zheng, Bing Su.
Neurocomputing, 2022. CCF-C, SCI.
CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models [pdf] [Project page] [Code]
Feihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024. CCF-B.

Honors and Awards

First-class academic scholarship of Chinese Academy of Sciences, 2024.
Merit student of University of Chinese Academy of Sciences (UCAS), 2023, 2024.
Outstanding member of the Communist Youth League of the University of Chinese Academy of Sciences (UCAS), 2020.
National Encouragement Scholarship, 2018.
National Scholarship for undergraduate students, 2017.
