SkalskiP / YOLO-World

Real-Time Open-Vocabulary Object Detection

☆13

Alternatives and similar repositories for YOLO-World

Users who are interested in YOLO-World are comparing it to the repositories listed below.

UnisonAIInc / UnisonAI
The UnisonAI Multi-Agent Framework (A2A) provides a flexible and extensible environment for creating and coordinating multiple autonomous…
☆17 Updated last week
SkalskiP / segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…
☆13 Updated 11 months ago
The-Swarm-Corporation / agentverse
Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more!
☆12 Updated this week
roboflow / cog-vlm-client
Simple CogVLM client script
☆14 Updated last year
roboflow / roboflow-collect
Passively collect images for computer vision datasets on the edge.
☆34 Updated last year
enricd / st_llms_arena
Streamlit app presented at the Streamlit LLMs Hackathon September 23
☆16 Updated last year
camenduru / bria-rmbg-jupyter
☆16 Updated last year
roboflow / clip_video_app
Flask-based web application designed to compare text and image embeddings using the CLIP model.
☆22 Updated last year
camenduru / MoE-LLaVA-jupyter
☆16 Updated last year
camenduru / MiniGPT-v2-colab
☆29 Updated last year
autodistill / autodistill-florence-2
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
☆64 Updated 11 months ago
OvidijusParsiunas / web-llm
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.
☆14 Updated last year
roboflow / inference-client
☆14 Updated last year
camenduru / Multi-LoRA-Composition-jupyter
☆13 Updated last year
The-Swarm-Corporation / AgentParse
AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…
☆13 Updated 2 weeks ago
kyegomez / Qwen-VL
My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…
☆12 Updated last year
TheOneTrueNiz / Gemini-in-NotebookLM
Use this code to access a pipeline to Gemini from inside NotebookLM
☆29 Updated last year
camenduru / YoloWorld-EfficientSAM-jupyter
☆46 Updated last year
mithunparab / text2segment_video
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2) This project provides a video processing tool that utilizes…
☆10 Updated 4 months ago
camenduru / playground-colab
☆17 Updated last year
camenduru / dreamtalk
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
☆15 Updated last year
nengelmann / MeetingCam
Run your AI and CV algorithms in meetings such as Zoom, Meets or Teams! 🚀
☆14 Updated last year
PirateforFreedom / luann
Luann (fka TypeAgent) allows you to create many LLM-based agents (various types of agents, scaled up)
☆21 Updated 2 months ago
camenduru / echomimic-jupyter
☆14 Updated 7 months ago
enso-labs / llm-server
🤖 Open-source LLM server (OpenAI, Ollama, Groq, Anthropic) with support for HTTP, Streaming, Agents, RAG (Deprecated check out Orchestra…
☆32 Updated last month
PromptEngineer48 / Function-Calling-Ollama-Llama-3.1
☆14 Updated 9 months ago
oconnoob / realtime-stt-livekit-assemblyai
Add real-time Speech-to-Text to your LiveKit application with AssemblyAI
☆15 Updated last month
isLinXu / vision-process-webui
💡💡💡 Awesome computer vision app in Gradio
☆53 Updated last year
varunsaagar / crawlwithagents
The Web Metadata Extraction Toolkit is designed to streamline the process of extracting, cleaning, and analyzing metadata from websites. …
☆17 Updated last year
autodistill / autodistill-grounded-edgesam
EdgeSAM model for use with Autodistill.
☆27 Updated last year