neis-lab / mmcows
MmCows: A Multimodal Dataset for Dairy Cattle Monitoring
☆36 · Updated 3 months ago
Alternatives and similar repositories for mmcows
Users who are interested in mmcows are comparing it to the repositories listed below.
- The suite of modeling video with Mamba ☆278 · Updated last year
- Official repo of the paper "Object-aware Gaze Target Detection" (ICCV 2023) ☆42 · Updated 9 months ago
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition" ☆73 · Updated 7 months ago
- Sharingan: A Transformer Architecture for Multi-Person Gaze Following ☆21 · Updated 10 months ago
- [NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition" ☆52 · Updated last year
- [WACV'25 Oral] Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer ☆47 · Updated 7 months ago
- [CVPR 2025] "Towards Universal Soccer Video Understanding". ☆184 · Updated 2 weeks ago
- [CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloadin… ☆228 · Updated 11 months ago
- [CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking ☆680 · Updated 11 months ago
- Code for "LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model", CVPR 2024 Highlight ☆54 · Updated last year
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆161 · Updated this week
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆291 · Updated last year
- OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch. ☆287 · Updated 4 months ago
- [ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting ☆115 · Updated last year
- [CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation" ☆324 · Updated this week
- [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want ☆844 · Updated 2 months ago
- ☆27 · Updated last year
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr… ☆140 · Updated last year
- [CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection ☆106 · Updated last year
- Official code of "ViTGaze: Gaze Following with Interaction Features in Vision Transformers" ☆58 · Updated 6 months ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆100 · Updated last year
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts ☆331 · Updated last year
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models ☆336 · Updated last year
- [CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments". ☆287 · Updated last year
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models ☆69 · Updated last year
- 🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024) ☆88 · Updated last year
- [ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding ☆993 · Updated last year
- [CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga ☆121 · Updated 3 weeks ago
- [CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv… ☆134 · Updated 2 years ago
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆91 · Updated 6 months ago