AhmedZgaren / Save
☆31 Updated 3 months ago
Alternatives and similar repositories for Save
Users who are interested in Save are comparing it to the repositories listed below
- [T-PAMI 2025] EMOv2: Pushing 5M Vision Model Frontier ☆53 Updated last year
- ☆69 Updated last year
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models ☆37 Updated 8 months ago
- ☆26 Updated last year
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling ☆30 Updated last year
- PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNe… ☆41 Updated last year
- ☆44 Updated 2 months ago
- Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models. ☆134 Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆68 Updated 2 months ago
- [ICCV2025] Referring any person or objects given a natural language description. Code base for RexSeek and HumanRef Benchmark ☆177 Updated 2 months ago
- [NeurIPS 2023] HASSOD: Hierarchical Adaptive Self-Supervised Object Detection ☆58 Updated last year
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆68 Updated last year
- [CVPR 2025 Highlight] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation" ☆356 Updated 3 months ago
- RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight) ☆364 Updated last year
- ☆19 Updated 7 months ago
- VLM driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 V… ☆125 Updated 6 months ago
- Includes the VideoCount dataset and CountVid code for the paper Open-World Object Counting in Videos. ☆83 Updated 3 weeks ago
- Official implementation of "Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs". ☆96 Updated last month
- 3D Traffic Light & Sign Dataset ☆21 Updated 9 months ago
- [arXiv 2025] SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning ☆50 Updated 2 weeks ago
- XmodelLM ☆38 Updated last year
- CVPR 2025 Workshop on CVEU. ☆42 Updated 6 months ago
- The Missing Point in Vision Transformers for Universal Image Segmentation ☆56 Updated last month
- ☆192 Updated 7 months ago
- TensorFlow implementation of a comprehensive comparison of various SSL (Semi-Supervised Learning) approaches in image segmentation, featu… ☆19 Updated last year
- 🚀🚀🚀Official code for the paper "YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection."🔥🔥🔥 ☆42 Updated this week
- [ICCV 2025] OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning ☆412 Updated last month
- EdgeSAM model for use with Autodistill. ☆29 Updated last year
- AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024) ☆34 Updated last year
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆95 Updated 2 weeks ago