UFM: A Simple Path towards Unified Dense Correspondence with Flow
Published as an arXiv preprint, 2025
Recommended citation: Zhang, Yuchen, Nikhil Keetha, Chenwei Lyu, Bhuvan Jhamb, Yutian Chen, Yuheng Qiu, Jay Karhade et al. "UFM: A Simple Path towards Unified Dense Correspondence with Flow." arXiv preprint arXiv:2506.09278 (2025). https://uniflowmatch.github.io/
Author: Zhang, Yuchen, Nikhil Keetha, Chenwei Lyu, Bhuvan Jhamb, Yutian Chen, Yuheng Qiu, Jay Karhade, Shreyas Jha, Yaoyu Hu, Deva Ramanan, Sebastian Scherer & Wenshan Wang (2025)
UFM employs a simple end-to-end transformer architecture. It first encodes both images with DINOv2, then processes the concatenated features with self-attention layers. The model regresses the (u, v) flow field and a covisibility prediction through DPT heads. We trained the model on a combination of 12 optical flow and wide-baseline matching datasets, observing mutual improvement on both tasks.
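The dataflow described above (encode both images, fuse tokens with self-attention, decode dense flow and covisibility maps) can be sketched as follows. This is a minimal toy illustration with random projections standing in for the real DINOv2 encoder and DPT heads; all function names, dimensions, and the single-layer attention are assumptions for clarity, not the actual UFM implementation.

```python
import numpy as np

def encode(image, dim=16):
    """Stand-in for the DINOv2 encoder: image (H, W, 3) -> per-pixel tokens (H*W, dim).
    A fixed random projection replaces the pretrained ViT (assumption)."""
    H, W, _ = image.shape
    proj = np.random.default_rng(0).standard_normal((3, dim))
    return image.reshape(H * W, 3) @ proj

def self_attention(x):
    """One softmax self-attention layer over the concatenated tokens of both images."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

def dpt_head(tokens, out_channels, H, W):
    """Stand-in for a DPT regression head: tokens -> dense (H, W, out_channels) map."""
    proj = np.random.default_rng(1).standard_normal((tokens.shape[1], out_channels))
    return (tokens @ proj).reshape(H, W, out_channels)

def ufm_forward(img_a, img_b):
    H, W, _ = img_a.shape
    # Encode both images, then fuse their tokens jointly with self-attention.
    tokens = np.concatenate([encode(img_a), encode(img_b)], axis=0)
    fused = self_attention(tokens)
    # Decode dense predictions over the first image's tokens.
    tokens_a = fused[: H * W]
    flow = dpt_head(tokens_a, 2, H, W)    # (u, v) flow field
    covis = dpt_head(tokens_a, 1, H, W)   # covisibility logits
    return flow, covis

flow, covis = ufm_forward(np.zeros((8, 8, 3)), np.zeros((8, 8, 3)))
```

The key structural point the sketch shows is that attention runs over the tokens of both images jointly, so correspondence cues flow between the two views before the dense heads decode them.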
Please refer to the website for further information: https://uniflowmatch.github.io/.