publications

Please visit Google Scholar for the complete list of publications.

2025

ICLR

Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval

Mohammad Omama, Po-han Li, and Sandeep P Chinchali

arXiv preprint arXiv:2410.07022, 2025

2024

ECCV

Towards Neuro-Symbolic Video Understanding

Minkyu Choi, Harsh Goel, Mohammad Omama, and 3 more authors

In Computer Vision – ECCV 2024, 2024

Abs

The unprecedented surge in video data production in recent years necessitates efficient tools to extract meaningful frames from videos for downstream tasks. Long-term temporal reasoning is a key desideratum for frame retrieval systems. While state-of-the-art foundation models, like VideoLLaMA and ViCLIP, are proficient in short-term semantic understanding, they surprisingly fail at long-term reasoning across frames. A key reason for this failure is that they intertwine per-frame perception and temporal reasoning into a single deep network. Hence, decoupling but co-designing the semantic understanding and temporal reasoning is essential for efficient scene identification. We propose a system that leverages vision-language models for semantic understanding of individual frames and effectively reasons about the long-term evolution of events using state machines and temporal logic (TL) formulae that inherently capture memory. Our TL-based reasoning improves the F1 score of complex event identification by 9–15%, compared to benchmarks that use GPT-4 for reasoning, on state-of-the-art self-driving datasets such as Waymo and NuScenes. The source code is available at https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temporal-Logic.

2023

RSS

Conceptfusion: Open-set multimodal 3d mapping

Krishna Murthy Jatavallabhula, Alihusein Kuwajerwala, Qiao Gu, and 8 more authors

arXiv preprint arXiv:2302.07241, 2023

2022

IROS

Drift Reduced Navigation with Deep Explainable Features

Mohd Omama, Sripada V. S. Sundar, Sandeep Chinchali, and 2 more authors

In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022

DOI
ACC

Ladfn: Learning actions for drift-free navigation in highly dynamic scenes

Mohd Omama, Sundar Sripada VS, Sandeep Chinchali, and 1 more author

In 2022 American Control Conference (ACC), 2022