landing-ai / vision-agent
Vision agent
☆5,111 Updated 2 months ago
Alternatives and similar repositories for vision-agent
Users who are interested in vision-agent are comparing it to the libraries listed below.
- The python library for real-time communication ☆4,392 Updated last month
- Fully local web research and report writing assistant ☆8,328 Updated 3 months ago
- Task-Aware Agent-driven Prompt Optimization Framework ☆3,674 Updated last month
- ☆9,482 Updated 2 months ago
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra… ☆2,760 Updated last week
- Knowledge Agents and Management in the Cloud ☆4,198 Updated last week
- Flexible and powerful framework for managing multiple AI agents and handling complex conversations ☆7,054 Updated 3 weeks ago
- streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL ☆2,640 Updated last week
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆7,439 Updated last week
- KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning a… ☆8,171 Updated last month
- Toolkit for linearizing PDFs for LLM datasets/training ☆15,909 Updated last week
- RAG that intelligently adapts to your use case, data, and queries ☆3,583 Updated 2 weeks ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆8,004 Updated 9 months ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆5,933 Updated this week
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python. ☆7,127 Updated 4 months ago
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ☆1,844 Updated last month
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding ☆2,260 Updated 5 months ago
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including C… ☆5,062 Updated 2 weeks ago
- PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation ☆2,312 Updated 2 months ago
- Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement… ☆7,817 Updated this week
- Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget) ☆4,979 Updated last month
- A visual playground for agentic workflows: Iterate over your agents 10x faster ☆5,588 Updated 3 months ago
- AG2 (formerly AutoGen): The Open-Source AgentOS. Join us at: https://discord.gg/pAbnFJrkgZ ☆3,799 Updated this week
- Composable building blocks to build Llama Apps ☆8,156 Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,284 Updated 2 months ago
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,221 Updated 8 months ago
- A powerful framework for building realtime voice AI agents 🤖🎙️📹 ☆8,218 Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆12,323 Updated last month
- [NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge a… ☆2,938 Updated 2 months ago
- Python library for Agentic Document Extraction from LandingAI ☆2,146 Updated 3 weeks ago