roboflow / rf-detr
RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO & designed for fine-tuning.
☆960 Updated this week
Alternatives and similar repositories for rf-detr:
Users who are interested in rf-detr are comparing it to the libraries listed below:
- YOLOE: Real-Time Seeing Anything ☆870 Updated this week
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,521 Updated this week
- ☆1,360 Updated last week
- YOLOv12: Attention-Centric Real-Time Object Detectors ☆1,294 Updated last week
- Create your custom OpenCV algorithms using a user-friendly node editor interface, inspired by Blender and Unreal Engine blueprints! Quic… ☆354 Updated last week
- Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders (CVPR 2025) ☆533 Updated this week
- [CVPR 2025] Learning Flow Fields in Attention for Controllable Person Image Generation ☆1,421 Updated last month
- Thera: Aliasing-Free Arbitrary-Scale Super-Resolution with Neural Heat Fields ☆586 Updated last week
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆1,329 Updated this week
- Transform PDFs into AI podcasts for engaging on-the-go audio content. ☆599 Updated this week
- Official Repo for "TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding" ☆1,131 Updated this week
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight] ☆1,754 Updated last week
- SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer ☆3,796 Updated this week
- Fast Real-time Object Detection with High-Res Output https://x.com/_akhaliq/status/1840213012818329826 https://x.com/githubprojects/statu… ☆541 Updated last week
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,481 Updated last week
- An MIT-licensed implementation of YOLOv9, YOLOv7, YOLO-RD ☆1,127 Updated 2 weeks ago
- Portable KMS (knowledge management system) designed to integrate seamlessly with any Retrieval-Augmented Generation (RAG) system ☆1,137 Updated this week
- Make any LLM think like OpenAI o1 and DeepSeek R1 ☆479 Updated last month
- Real Time Speech Transcription with FastRTC ⚡️and Local Whisper 🤗 ☆538 Updated last week
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,178 Updated last week
- This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models. ☆1,050 Updated 2 months ago
- [CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence ☆566 Updated 2 weeks ago
- ☆2,437 Updated last month
- This repository contains the code for a virtual try-on application built using Flask, Twilio's WhatsApp API, and Gradio's virtual try-on … ☆338 Updated 5 months ago
- Code of LHM: Large Animatable Human Reconstruction Model for Single Image to 3D in Seconds ☆862 Updated this week
- This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo] ☆708 Updated 9 months ago
- Arbitrary-steps Image Super-resolution via Diffusion Inversion (CVPR 2025) ☆1,001 Updated last week
- Official Implementation of "KBLaM: Knowledge Base augmented Language Model" ☆487 Updated 3 weeks ago
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆1,057 Updated this week
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆979 Updated 2 months ago