

arXiv:2406.13002 [cs.CV]

Recurrence over Video Frames (RoVF) for the Re-identification of Meerkats

Mitchell Rogers, Kobe Knowles, Gaël Gendron, Shahrokh Heidari, David Arturo Soriano Valdez, Mihailo Azhar, Padriac O'Leary, Simon Eyre, Michael Witbrock, Patrice Delmas

Published 2024-06-18, Version 1

Deep learning approaches for animal re-identification have had a major impact on conservation, significantly reducing the time required for many downstream tasks, such as well-being monitoring. We propose a method called Recurrence over Video Frames (RoVF), which uses a recurrent head based on the Perceiver architecture to iteratively construct an embedding from a video clip. Because individual IDs are unavailable, RoVF is trained with a triplet loss based on the co-occurrence of individuals in the video frames. We tested this method and various models based on the DINOv2 transformer architecture on a meerkat dataset collected at Wellington Zoo. Our method achieves a top-1 re-identification accuracy of $49\%$, which is higher than that of the best DINOv2 model ($42\%$). We found that the model can match observations of individuals where humans cannot, and that RoVF outperforms the comparison models with minimal fine-tuning. In future work, we plan to improve these models by using pretext tasks, apply them to animal behaviour classification, and perform a hyperparameter search to optimise the models further.

Comments: Presented as a poster at the CV4Animals Workshop, CVPR 2024
Categories: cs.CV
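
The abstract mentions two quantities that a small example may make more concrete: the triplet loss used for training and the top-1 re-identification accuracy used for evaluation. The sketch below is illustrative only and is not the authors' implementation. It assumes Euclidean distance, a margin of 1.0, and nearest-neighbour matching against a labelled gallery, none of which are stated in the abstract. It also assumes one reading of the co-occurrence idea: two animals visible in the same frames must be different individuals, so a clip of a co-occurring animal can serve as a negative. The function names and the toy 2-D embeddings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an assumption made for this sketch.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def top1_accuracy(queries, gallery):
    """Fraction of queries whose nearest gallery embedding has the same label.

    Both arguments are lists of (label, embedding) pairs.
    """
    correct = 0
    for q_label, q_emb in queries:
        nearest_label, _ = min(gallery, key=lambda item: euclidean(q_emb, item[1]))
        correct += nearest_label == q_label
    return correct / len(queries)


# Toy 2-D embeddings, made up for illustration (not data from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # assumed: another clip of the same tracked animal
negative = [3.0, 4.0]  # assumed: a clip of an animal co-occurring in the same frames

# The negative is already far away, so the loss is zero.
loss_easy = triplet_loss(anchor, positive, negative)
# Swapping the roles gives a violated triplet and a positive loss.
loss_hard = triplet_loss(anchor, negative, positive)

gallery = [("A", [0.0, 0.0]), ("B", [5.0, 5.0])]
queries = [("A", [0.5, 0.0]), ("B", [4.0, 5.0]), ("B", [1.0, 1.0])]
acc = top1_accuracy(queries, gallery)  # the third query is matched to "A" incorrectly

print(loss_easy, loss_hard, acc)
```

In this toy setup, two of the three queries are matched correctly. The reported $49\%$ and $42\%$ figures are accuracies of this general kind, although the exact evaluation protocol is not described in the abstract.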