Updates
Aug'25: Broadening Participation Award to attend ICCV 2025.
Jun'25: DisenQ (First-author paper) accepted at ICCV 2025 as a Highlight. ⭐️🔥
Mar'25: Depth and Height Perception Benchmark (First-author paper) accepted at the CVPR MMFM Workshop 2025.
Feb'25: HierarQ (First-author paper) accepted at CVPR'25. 🔥
Apr'24: Diversity, Equity and Inclusion Award to attend CVPR 2024.
Mar'24: Distribution Shift Benchmark accepted at the CVPR MMFM Workshop 2024.
Mar'24: Compositional Reasoning Benchmark accepted at the CVPR MMFM Workshop 2024.
Feb'24: Activity-Biometrics (First-author paper) accepted at CVPR'24. 🔥
Publications
Below is a selected list of my work (in reverse chronological order); representative papers are highlighted.
Streaming Long-form Video Understanding With On-time Answering
Shehreen Azad, Vibhav Vineet, Yogesh Singh Rawat
Ongoing
Developing a novel memory-augmented framework to equip Multimodal Large Language Models (MLLMs) with real-time, continuous understanding of long-form streaming video.
DisenQ: Disentangling Q-Former for Activity-Biometrics
Shehreen Azad, Yogesh Singh Rawat
International Conference on Computer Vision (ICCV), 2025 Highlight
Patent pending
Project Page / Paper
A novel disentanglement-based Multimodal Large Language Model (MLLM) architecture for robust activity-aware person identification.
GeoMeter: Understanding Depth and Height Perception in Large Visual Language Models
Shehreen Azad, Yash Jain, Rishit Garg, Vibhav Vineet, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference Workshops (CVPR Workshops), 2025
3rd Workshop on What is Next in Multimodal Foundation Models
Project Page / Paper
The first diagnostic benchmark for specifically evaluating the depth and height perception capabilities of Multimodal Large Language Models (MLLMs).
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference (CVPR), 2025
Project Page / Paper
A novel task-aware, hierarchical framework that equips Multimodal Large Language Models (MLLMs) for efficient understanding of arbitrarily long videos.
Robustness Analysis on Foundation Segmentation Models
Madeline Chantry Schiappa, Shehreen Azad, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Vibhav Vineet, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference Workshops (CVPR Workshops), 2024
2nd Workshop on What is Next in Multimodal Foundation Models
Paper / Data
Investigated the robustness of Multimodal Foundation Models (MFMs) under distribution shifts induced by perturbations and corruptions spanning 17 categories and 5 severity levels.
Probing Conceptual Understanding of Large Visual Language Models
Madeline Chantry Schiappa, Raiyaan Abdullah, Shehreen Azad, Jared Claypoole, Michael Cogswell, Ajay Divakaran, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference Workshops (CVPR Workshops), 2024
2nd Workshop on What is Next in Multimodal Foundation Models
Paper / Data
Investigated and improved the relational, compositional, and contextual understanding of Multimodal Large Language Models (MLLMs) through three novel benchmarks.
Activity-Biometrics: Person Identification from Daily Activities
Shehreen Azad, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference (CVPR), 2024
Paper
Proposed the novel task of activity-biometrics, which extends traditional gait-based person reID to activity-aware person reID, and developed a disentanglement-based framework for robust activity-aware identification.
Honors & Awards
ICCV Travel Grant 2025
Three-time recipient of the UCF CS Ranking Incentive Award
Two-time recipient of the UCF Presentation Fellowship Award
2nd place, IARPA BRIAR: Biometric Recognition and Identification at Altitude and Range
CVPR Travel Grant 2024
UCF ORCGS Doctoral Fellowship, 2023-2024
Academic Service
Reviewer, CVPR 2024, 2025, 2026
Reviewer, ICCV 2025
Reviewer, NeurIPS 2024, 2025
Reviewer, BMVC 2025
Reviewer, CVPR MMFM Workshop 2024, 2025