landing-ai / vision-agent
Vision agent
★1,293 · Updated this week
Related projects
Alternatives and complementary repositories for vision-agent
- MLE-Agent: Your intelligent companion for seamless AI engineering and research. Integrate with arxiv and paper with code to provide… ★1,088 · Updated this week
- Deploy your agentic workflows to production ★1,822 · Updated this week
- Automated Design of Agentic Systems ★1,020 · Updated this week
- Agent S: an open agentic framework that uses computers like a human ★571 · Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ★3,220 · Updated 3 months ago
- Desktop app for prototyping and debugging LangGraph applications locally. ★1,885 · Updated this week
- [NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge a… ★1,352 · Updated 3 months ago
- Implementing the 4 agentic patterns from scratch ★728 · Updated 2 weeks ago
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ★809 · Updated 4 months ago
- GraphRAG using Local LLMs - Features robust API and multiple apps for Indexing/Prompt Tuning/Query/Chat/Visualizing/Etc. This is meant to… ★1,698 · Updated this week
- An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ★5,118 · Updated this week
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ★1,519 · Updated last month
- The easiest way to use Agentic RAG in any enterprise ★3,834 · Updated last week
- Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation ★917 · Updated last week
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ★587 · Updated 3 weeks ago
- Low code tool to rapidly build and coordinate multi-agent teams ★821 · Updated last month
- One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure ★1,846 · Updated last week
- Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ★3,000 · Updated last month
- GPT4V-level open-source multi-modal model based on Llama3-8B ★2,105 · Updated 2 months ago
- Did you say you like data? ★1,038 · Updated 4 months ago
- A language model programming library. ★5,226 · Updated this week
- A lightweight framework for building LLM-based agents ★1,849 · Updated this week
- The code used to train and run inference with the ColPali architecture. ★1,054 · Updated this week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ★2,823 · Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★2,538 · Updated last month
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ★462 · Updated this week
- Together Mixture-Of-Agents (MoA) - 65.1% on AlpacaEval with OSS models ★2,594 · Updated 3 weeks ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ★865 · Updated 2 months ago
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ★3,864 · Updated last month
- Parse files for optimal RAG ★3,045 · Updated this week