I am a final-year Ph.D. student at the University of Technology Sydney, advised by Prof. Xiaojun Chang. I also work closely with Heng Wang, Linjie Yang, and Xiaojie Jin on various video-language projects at ByteDance.
Before moving to UTS, I spent two wonderful years at Monash University. Prior to my candidature, I was a visiting student at MMLab, SIAT, Chinese Academy of Sciences, where I was fortunate to work with Prof. Yu Qiao and Prof. Yali Wang.
I received my Master's degree from the University of Chinese Academy of Sciences (UCAS) and my Bachelor's degree from Nankai University (NKU) with graduate honours.
Recent Activities
- 🌟🌟 I am currently on the job market for research positions. 🌟🌟
- CVPR 2024 PMV-400: Portrait-mode videos rock social media! We have developed the first video dataset dedicated to research on this emerging video format.
- Shot2Story: We have released this new video description dataset. With the assistance of an LLM, our method achieves SOTA performance on zero-shot MSRVTT-QA.
- ICCV 2023: One paper on language-referring video object segmentation is accepted.
- NeurIPS 2023: One paper on efficient video segmentation is accepted.
Research Interests
My research interests lie in computer vision and machine learning. Currently, I focus on large vision-language models and their applications in robotics. I have worked on video-language downstream tasks related to object and event prediction in videos, such as Referring-VOS and video grounding. Previously, I worked on individual and group activity recognition, and on video object detection with full and limited supervision. For my Master's thesis, I worked on moving object detection and tracking.
Publications and preprints
Mingfei Han, Linjie Yang, Xiaojun Chang and Heng Wang
We present Shot2Story20K, a new multi-shot video understanding benchmark with detailed shot-level captions and comprehensive video summaries. 2023
project page / paper / demo / code / data / video / bibtex
Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang and Heng Wang
We have developed the first dataset dedicated to portrait-mode videos, focusing on research into this emerging video format. CVPR 2024
project page / paper / data / bibtex
Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang and Bohan Zhuang
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
code (to be released) / pdf / bibtex
Mingfei Han, Yali Wang, Zhihui Li, Lina Yao, Xiaojun Chang and Yu Qiao
International Conference on Computer Vision (ICCV), 2023
project page / pdf / poster / bibtex
Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang and Yu Qiao
Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
project page / arXiv / slides / poster / presentation / bibtex
Mingfei Han, Yali Wang, Mingjie Li, Xiaojun Chang, Yi Yang and Yu Qiao
IEEE Transactions on Image Processing (TIP), 2021
Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao
European Conference on Computer Vision (ECCV), 2020
ECVA / code / bibtex
Shiyu Xuan, Shengyang Li, Mingfei Han, Xue Wan, Gui-Song Xia
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2019
IEEE / code / bibtex
Xiaojun Chang, Wenhe Liu, Po-Yao Huang, Changlin Li, Fengda Zhu, Mingfei Han, et al.
First Prize in the TRECVID Activities in Extended Video (ActEV) challenge, 2019
NIST / bibtex
Talks
- China Society of Image and Graphics, Guangdong Branch (in Chinese), "CSIG-Guangdong CVPR Paper Sharing: Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition", May 2022
- Jishi Live (in Chinese), with my friend Xiangtao Kong, who works on low-level vision and super-resolution, "CAS-SIAT CVPR Paper Sharing: Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition", with recording available, April 2022
- ML and VL Seminar at Monash University, "Mining Inter-Video Proposal Relations for Video Object Detection", November 2020