OpenVFM / V-SWIFT
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
☆28Updated 2 months ago
Alternatives and similar repositories for V-SWIFT:
Users who are interested in V-SWIFT are comparing it to the libraries listed below.
- Video Benchmark Suite: Rapid Evaluation of Video Foundation Models☆15Updated 3 months ago
- Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion☆306Updated last month
- A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space☆82Updated 3 months ago
- YOLO-UniOW: Efficient Universal Open-World Object Detection☆114Updated 3 months ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed☆88Updated 5 months ago
- Large-scale image dataset visualization tool.☆119Updated last year
- [CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection☆167Updated 3 weeks ago
- Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding☆182Updated 2 months ago
- This project aims to explore the deployment of Swin-Transformer based on TensorRT, including the test results of FP16 and INT8.☆165Updated 2 years ago
- Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆125Updated 4 months ago
- The official repo for ECCV'22 paper: Pose for Everything: Towards Category-Agnostic Pose Estimation☆208Updated 10 months ago
- Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface"☆168Updated 2 weeks ago
- A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.☆180Updated last month
- An Open Implementation of Motion Brush like Gen-2☆15Updated last year
- (CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…☆147Updated last week
- Fine-tuning Grounding DINO☆97Updated 3 months ago
- Try to export SAM2 to ONNX.☆47Updated 6 months ago
- Deploy GroundingDINO open-world object detection with onnxruntime, including both C++ and Python versions of the program☆55Updated last year
- Accelerate Segment Anything Model inference using TensorRT 8.6.1.6☆89Updated last year
- Official repo of Griffon series including v1(ECCV 2024), v2, and G☆189Updated 3 weeks ago
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model☆95Updated 9 months ago
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization☆559Updated 10 months ago
- [NeurIPS2023] DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models☆318Updated last year
- An Improved One millisecond Mobile Backbone☆145Updated 2 years ago