Kevin Qinghong Lin

Ph.D. Student

Show Lab
National University of Singapore

Email: kevin.qh.lin [at] gmail.com


Photo taken on Rottnest Island.

Biography

Hi there! I am a second-year Ph.D. student in Show Lab @ NUS, working with Prof. Mike Shou.

My research interests lie in Vision and Language, especially Video Understanding and Large Language Models.

News

Publications

VisorGPT: Learning Visual Prior via Generative Pre-Training
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin QH. Lin, Yefeng Zheng, Linlin Shen, Mike Z. Shou.

Neural Information Processing Systems (NeurIPS), 2023.
[project] [paper] [code]

UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin QH. Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex JP. Wang, Rui Yan, Mike Z. Shou.

International Conference on Computer Vision (ICCV), 2023.
[demo] [paper] [code]
The first video temporal grounding pretraining model, unifying diverse temporal annotations to power moment retrieval, highlight detection and video summarization.

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick, Yale Song, Sayan Nag, Kevin QH. Lin, Hardik Shah, Mike Z. Shou, Rama Chellappa, Pengchuan Zhang.

International Conference on Computer Vision (ICCV), 2023.
[project] [paper] [code]
The second generation of egocentric video-language pre-training.

Too Large; Data Reduction for Vision-Language Pre-Training
Alex JP. Wang, Kevin QH. Lin, David JH. Zhang, Stan WX. Lei, Mike Z. Shou.

International Conference on Computer Vision (ICCV), 2023.
[paper] [code]

Affordance Grounding from Demonstration Video to Target Image
Joya Chen, Difei Gao, Kevin QH. Lin, Mike Z. Shou.

Computer Vision and Pattern Recognition (CVPR), 2023.
[paper] [code]

All in One: Exploring Unified Video-Language Pre-training
Alex JP. Wang, Yixiao Ge, Rui Yan, Yuying Ge, Kevin QH. Lin, Satoshi Tsutsui, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Z. Shou.

Computer Vision and Pattern Recognition (CVPR), 2023.
[paper] [code]

Egocentric Video-Language Pretraining
Kevin QH. Lin, Alex JP. Wang, M. Soldan, M. Wray, R. Yan, Eric ZC. Xu, D. Gao, R. Tu, W. Zhao, W. Kong, C. Cai, H. Wang, D. Damen, B. Ghanem, W. Liu, Mike Z. Shou.

Neural Information Processing Systems (NeurIPS), 2022. Spotlight (1.7%)
[project] [paper] [code] [poster]
The first egocentric vision-language pretrained model. Double champion in the Ego4D 2022 and Epic-Kitchens CVPR 2022 challenges. [News]

Projects

VLog: Video as a Long Document

[demo] [code] [twitter]
Given a long video, we turn it into a doc containing visual + audio info. By sending this doc to ChatGPT, we can chat over the video!

Honors

Service

Acknowledgment

I have been fortunate to work with these wonderful people who generously provided me with mentorship.

@ Tencent

Dr. Wei Liu

@ University of Bristol

Prof. Dima Damen
Prof. Michael Wray


@ KAUST

Prof. Bernard Ghanem
Mattia Soldan

@ Meta AI

Dr. Pengchuan Zhang
Dr. Xide Xia



© Kevin