anyantudre/Florence-2-Vision-Language-Model

Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

☆203

Alternatives and similar repositories for Florence-2-Vision-Language-Model

Users that are interested in Florence-2-Vision-Language-Model are comparing it to the libraries listed below.

CharlesCNorton / yoflo-gui
Real-time object detection using Florence-2 with a user-friendly GUI.
☆31 · Aug 7, 2025 · Updated 11 months ago
retkowsky / florence-2
Florence-2
☆72 · Feb 13, 2025 · Updated last year
andimarafioti / florence2-finetuning
Quick exploration into fine tuning florence 2
☆340 · Sep 19, 2024 · Updated last year
SuXinqi / DAAD
Official code repository of "DAAD: Dynamic Analysis and Adaptive Discriminator for Fake News Detection"
☆22 · Aug 22, 2024 · Updated last year
HeimingX / TAG
Official code for Attention-driven GUI Grounding, AAAI2025
☆16 · Dec 17, 2024 · Updated last year
mbzuai-oryx / TrackingMeetsLMM
☆10 · Apr 7, 2025 · Updated last year
iiakshat / football_yolo
Football Match Analysis Using YOLO (You Only Look Once).
☆28 · Nov 19, 2024 · Updated last year
artemisp / LAVIS-XInstructBLIP
LAVIS - A One-stop Library for Language-Vision Intelligence
☆48 · Aug 5, 2024 · Updated last year
arturxe2 / AdaSpot
PyTorch Implementation of "AdaSpot: Spend Resolution Where It Matters for Precise Event Spotting"
☆27 · Mar 3, 2026 · Updated 4 months ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 19, 2026 · Updated last week
pickxiguapi / Embodied-FSD
Official code for "From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation" (ICLR2026)
☆37 · Mar 1, 2026 · Updated 4 months ago
Wenhao-Sun77 / Just-in-Time
Official implementation of the paper "Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers" (CVPR 2026)
☆30 · Jun 30, 2026 · Updated 3 weeks ago
IDEA-Research / Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
☆3,660 · Nov 11, 2025 · Updated 8 months ago
xmed-lab / NuInstruct
☆72 · Aug 12, 2024 · Updated last year
autodistill / autodistill-florence-2
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
☆68 · Aug 15, 2024 · Updated last year
mbzuai-oryx / DriveLMM-o1
Reasoning DriveLMM
☆15 · Mar 15, 2025 · Updated last year
MrSecant / OmniVTA
OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation
☆58 · Mar 25, 2026 · Updated 4 months ago
IDEA-Research / ChatRex
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
☆216 · Oct 15, 2025 · Updated 9 months ago
Han1018 / ZonUI-3B
[WACV 2026] ZonUI-3B: A lightweight, resolution-aware GUI grounding model trained with only 24K samples on a single RTX 4090.
☆26 · Jan 2, 2026 · Updated 6 months ago
HorizonRobotics / nuplan-devkit
☆13 · Feb 5, 2025 · Updated last year
mac999 / scan_to_bim_pipeline
Scan to BIM pipeline including DTM, segmentation using deep learning
☆51 · Jul 12, 2026 · Updated 2 weeks ago
sjc042 / gta-link
☆90 · Dec 12, 2025 · Updated 7 months ago
IDEA-Research / GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
☆10,454 · Aug 12, 2024 · Updated last year
iSEE-Laboratory / LLMDet
(CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of La…
☆607 · Feb 4, 2026 · Updated 5 months ago
tavisshore / VICI
[ACMMM UAVM 2025] 🌍🚗 VICI: VLM-Instructed Cross-view Image-localisation 📡🗺️
☆17 · Feb 4, 2026 · Updated 5 months ago
tu-darmstadt-ros-pkg / sdf_contact_estimation
Accurate pose prediction based on Signed Distance Fields for mobile ground robots in rough terrain.
☆15 · Jun 6, 2026 · Updated last month
lichy2004 / GazeVLA
☆39 · Apr 27, 2026 · Updated 3 months ago
penghao-wu / GUI_Reflection
☆34 · Sep 19, 2025 · Updated 10 months ago
yeyimilk / LLMGeo
LLMGeo: Benchmarking Large Language Models on Image Geolocation In-the-wild
☆16 · Oct 31, 2024 · Updated last year
jahongir7174 / YOLOv8-dfl
YOLOv8 implementation with DFL using PyTorch
☆16 · Dec 31, 2024 · Updated last year
facebookresearch / dinov3
Reference PyTorch implementation and models for DINOv3
☆11,033 · Jul 15, 2026 · Updated 2 weeks ago
neil-ab / clip-zs-prompting
Using CLIP for zero-shot learning and image classification with text & visual prompting.
☆15 · Dec 13, 2022 · Updated 3 years ago
JIA-Lab-research / Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆636 · Jan 17, 2026 · Updated 6 months ago
liuxuannan / MMFakeBench
[ICLR 2025] MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
☆55 · Mar 25, 2025 · Updated last year
AsishMandoi / VRP-explorations
Exploring various quantum annealing-based approaches to solve the vehicle routing problem as part of the QOSF Quantum Computing Mentorshi…
☆14 · Aug 9, 2024 · Updated last year
metriccoders / ai-studio
AI Studio by Metric Coders: A No-Code Software to train, download and deploy Large Language Models.
☆12 · Jul 5, 2024 · Updated 2 years ago
sdbds / florence2-ft-advanced
Fine-tune your Florence-2 model easily
☆21 · Jul 27, 2024 · Updated 2 years ago
facebookresearch / sam3
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading t…
☆11,109 · Updated this week
QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,682 · Jan 30, 2026 · Updated 5 months ago