Gang Li
I am a Ph.D. candidate at the Institute of Software, Chinese Academy of Sciences (CAS), working on self-supervised learning and diffusion models, and I expect to receive my Ph.D. degree in June 2025. Previously, I worked as an algorithm intern at Kuaishou MMU and JD Explore Academy. I am pleased to announce that I will be joining Honor’s MagicOS division as a Senior Computer Vision Algorithm Engineer on the Smart Vision Platform team, where my primary focus will be research and development in multimodal understanding and AIGC (AI-Generated Content) technologies. Our team is currently seeking interns and full-time researchers/engineers. If you are interested in opportunities in these fields, please feel free to send your resume to me at ucasligang[at]gmail[dot]com.
My current research interests center on multimodal understanding and AIGC (Artificial Intelligence Generated Content). Specific research interests include, but are not limited to:
- Large multimodal models.
- Controllable generative diffusion models.
- Efficient training and sampling of generative diffusion models.
Before this, I focused on Masked Image Modeling (MIM) and Vision Transformers (ViTs). We also maintain a popular paper list on MIM on GitHub (300+ stars), available at awesome-MIM.
Curriculum Vitae
Time | Title | Institution | Research Direction |
---|---|---|---|
2019.09 - Now | Ph.D. Candidate | Institute of Software, CAS | Self-supervised Learning, Diffusion Models |
2015.09 - 2019.06 | Bachelor | Henan University | Network Engineering |
Selected Papers [Full paper list at Google Scholar]
Preprint
FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models [pdf] [Project page]
Feihong He, Gang Li, Mengyuan Zhang, Leilei Yan, Lingyu Si, Fanzhang Li.
arXiv:2401.15636.
3DDesigner: Towards photorealistic 3D object generation and editing with text-guided diffusion models. [pdf] [Project page]
Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng, Dacheng Tao.
arXiv:2211.14108.
Publication
SemMAE: Semantic-guided masking for learning masked autoencoders. [pdf] [code]
Gang Li, Heliang Zheng, Daqing Liu, Chaoyue Wang, Bing Su, Changwen Zheng.
Neural Information Processing Systems (NeurIPS), 2022. CCF-A.
DreamBooth++: Boosting Subject-Driven Generation via Region-Level References Packing. [pdf]
Zhongyi Fan, Zixin Yin, Gang Li, Yibing Zhan, Heliang Zheng.
Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM), 2024. CCF-A.
SimViT: Exploring a simple vision transformer with sliding windows [pdf] [code]
Gang Li, Di Xu, Xing Cheng, Lingyu Si, Changwen Zheng.
IEEE International Conference on Multimedia and Expo (ICME), 2022. CCF-B, Oral.
Transductive distribution calibration for few-shot learning [pdf]
Gang Li, Changwen Zheng, Bing Su.
Neurocomputing, 2022. CCF-C, SCI.
CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models. [pdf] [Project page] [Code]
Feihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024. CCF-B.
Honors and Awards
- First-class academic scholarship of Chinese Academy of Sciences, 2024.
- Merit student of University of Chinese Academy of Sciences (UCAS), 2023, 2024.
- Outstanding member of the Communist Youth League of the University of Chinese Academy of Sciences (UCAS), 2020.
- National Encouragement Scholarship, 2018.
- National Scholarship for undergraduate students, 2017.