I am an Associate Professor of Artificial Intelligence at Fudan University and a member of the Fudan Vision and Learning (FVL) Lab and Fudan Institute of Trustworthy Embodied AI (TEAI). I also serve as an Honorary Fellow at the University of Melbourne, Australia. My primary research area is Trustworthy AI, with a focus on developing secure, robust, explainable, privacy-preserving, and fair learning algorithms and models for broad AI applications. Beyond research, I am deeply passionate about using AI to enhance our understanding of both the mind and the universe.
I received my Ph.D. from the University of Melbourne, where I also spent two wonderful years as a postdoctoral research fellow. Before joining Fudan, I worked as a lecturer at Deakin University for 1.5 years. I hold a bachelor's degree from Jilin University and a master's degree from Tsinghua University.
"Everything should be as simple as possible, but not simpler."
Email / Google Scholar / GitHub
Introducing OpenTAI: Advancing Trustworthy AI Through Open Collaboration
Over the past two years, I’ve dedicated significant effort to building OpenTAI, an open platform designed to accelerate collaborative research in Trustworthy AI (TAI). Today, I’m thrilled to officially launch this initiative.
Our Mission
OpenTAI aims to develop large-scale, practical, and open-source benchmarks, tools, and datasets, bridging the gap between TAI research and real-world applications. We’ve seeded the platform with foundational projects from our own work, but this is just the beginning.
Call for Collaboration
OpenTAI is community-driven. We invite researchers and practitioners to:
- Submit high-impact projects for curation (free and open to all)
- Collaborate on expanding our resource library
What’s Next?
Stay tuned for a pipeline of cutting-edge benchmarks and tools in the coming year, all designed to make AI more transparent, robust, and accountable.
Join us in shaping the future of Trustworthy AI!
Books
- Endogenous Safety in Artificial Intelligence (《人工智能内生安全》)
- Artificial Intelligence: Data and Model Safety (《人工智能:数据与模型安全》)
We are actively seeking motivated Master's and Ph.D. students, postdoctoral researchers, and interns to join our team in the areas of Trustworthy AI, Multimodal/Vision-Language Models (MLLMs/VLMs), Generative AI, Reinforcement Learning, and Embodied AI. If you're interested, feel free to drop me an email.
News
- [06/2025] Three papers have been accepted to ICCV 2025.
- [05/2025] Our BackdoorLLM Benchmark was awarded first prize in the SafeBench Competition, organized by the Center for AI Safety. Congratulations to all the authors on this outstanding achievement!
- [05/2025] Our work on super transferable attacks X-Transfer Attacks has been accepted to ICML 2025.
- [03/2025] The preprint of our long survey paper Safety at Scale: A Comprehensive Survey of Large Model Safety is available on arXiv. Many thanks to all collaborators!
- [02/2025] Our works on Million-scale Adversarial Robustness Evaluation, Test-time Adversarial Prompt Tuning, and AnyAttack have been accepted to CVPR 2025.
- [01/2025] Our works on RL-based jailbreak defense for VLMs and backdoor sample detection in CLIP have been accepted to ICLR 2025.
- [12/2024] I will serve as an Area Chair for ICML 2025.
- [12/2024] Our works on targeted transferable adversarial attack, defense against model extraction attacks, and RL-based LLM auditing have been accepted to AAAI 2025.
- [09/2024] I will serve as an Area Chair for ICLR 2025.
- [09/2024] One paper on unlearnable examples for segmentation models has been accepted to NeurIPS 2024.
- [07/2024] Our works on model lock, detecting query-based adversarial attacks, and multimodal jailbreak attacks on VLMs have been accepted to MM 2024.
- [07/2024] Our work on adversarial prompt tuning has been accepted to ECCV 2024.
- [04/2024] Our work on intrinsic motivation for RL has been accepted to IJCAI 2024.
- [03/2024] Our work on adversarial policy learning in RL has been accepted to DSN 2024.
- [03/2024] Our work on safety alignment of LLMs has been accepted to NAACL 2024.
- [03/2024] Our work on machine unlearning has been accepted to TDSC.
- [01/2024] Our work on self-supervised learning has been accepted to ICLR 2024.
Research Interests
- Trustworthy AI
- Adversarial/jailbreak attacks and defenses
- Backdoor attacks and defenses
- Reinforcement learning, safety alignment
- Data privacy, data/model extraction
- Memorization, data attribution, unlearning
- Multimodal and Generative AI
- Multimodal learning, vision-language models
- Diffusion models, Text2Image generation, Text2Video generation
- World models, embodied AI
Professional Activities
- Program Committee Member
- ICLR (2019-2025), ICML (2019-2025), NeurIPS (2019-2024), CVPR (2020-2025), ICCV (2021-2023), ECCV (2020), AAAI (2020-2022), IJCAI (2020-2021), KDD (2019,2021), ICDM (2021), SDM (2021), AICAI (2021)
- Journal Reviewer
- Nature Communications, Pattern Recognition, TPAMI, TIP, IJCV, JAIR, TNNLS, TKDE, TIFS, TOMM, KAIS