roboflow / maestro
streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
☆2,568 Updated last week
Alternatives and similar repositories for maestro
Users who are interested in maestro are comparing it to the libraries listed below.
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ☆1,470 Updated this week
- The easiest way to deploy agents, models, RAG, pipelines and more. No MLOps. No YAML. ☆3,176 Updated this week
- Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials] ☆618 Updated last year
- Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API ☆1,680 Updated 4 months ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,124 Updated last month
- Everything about the SmolLM2 and SmolVLM family of models ☆2,460 Updated 2 months ago
- This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models. ☆1,079 Updated 4 months ago
- 4M: Massively Multimodal Masked Modeling ☆1,727 Updated this week
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆3,102 Updated this week
- Turn any computer or edge device into a command center for your computer vision projects. ☆1,702 Updated this week
- ☆1,785 Updated last week
- Implementing the 4 agentic patterns from scratch ☆1,345 Updated 2 months ago
- Knowledge Agents and Management in the Cloud ☆3,995 Updated this week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,680 Updated this week
- YOLOE: Real-Time Seeing Anything ☆1,325 Updated last month
- RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO & designed for fine-tuning. ☆2,196 Updated this week
- Images to inference with no labeling (use foundation models to train supervised models). ☆2,283 Updated 3 weeks ago
- This repository is a curated collection of the most exciting and influential CVPR 2024 papers. [Paper + Code + Demo] ☆722 Updated this week
- Fast State-of-the-Art Static Embeddings ☆1,706 Updated this week
- A unified library for object tracking featuring clean room re-implementations of leading multi-object tracking algorithms ☆1,743 Updated this week
- ☆2,952 Updated 8 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,295 Updated this week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆789 Updated 4 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. ☆1,901 Updated last week
- This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects. ☆1,294 Updated last month
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆461 Updated 4 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,483 Updated 2 weeks ago
- AdalFlow: The library to build & auto-optimize LLM applications. ☆3,169 Updated 2 months ago
- Document to Markdown OCR library with Llama 3.2 vision ☆2,329 Updated 4 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,053 Updated last week