microsoft / OmniParser
A simple screen parsing tool towards pure vision based GUI agent
☆24,308 Updated 4 months ago
Alternatives and similar repositories for OmniParser
Users that are interested in OmniParser are comparing it to the libraries listed below
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,833 Updated this week
- The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra ☆25,104 Updated 3 weeks ago
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆77,442 Updated this week
- Kortix – build, manage and train AI Agents. ☆19,282 Updated this week
- Driving all platforms UI automation with vision-based model ☆11,532 Updated this week
- Pioneering Automated GUI Interaction with Native Agents ☆9,134 Updated last week
- No fortress, purely open ground. OpenManus is Coming. ☆54,001 Updated last month
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆19,004 Updated this week
- 🪄 Create rich visualizations with AI ☆14,789 Updated last week
- Model Context Protocol Servers ☆77,880 Updated last week
- Turn websites into clean data pipelines & structured APIs in minutes! ☆14,182 Updated this week
- 🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org ☆15,868 Updated this week
- 🖥️ Run AI Agent in your browser. ☆15,562 Updated 5 months ago
- Like Manus, Computer Use Agent (CUA) and Omniparser, we are computer-using agents. AI-driven local automation assistant that uses natural l… ☆3,824 Updated 8 months ago
- Agent S: an open agentic framework that uses computers like a human ☆9,671 Updated 2 weeks ago
- A collection of MCP servers. ☆79,856 Updated last week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud. ☆26,440 Updated 3 weeks ago
- Playwright MCP server ☆26,514 Updated this week
- MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone ☆22,701 Updated 4 months ago
- Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI O… ☆12,104 Updated 2 months ago
- The official Python SDK for Model Context Protocol servers and clients ☆21,488 Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆13,162 Updated this week
- A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models. ☆5,370 Updated 3 months ago
- Run frontier AI locally. ☆40,998 Updated this week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆16,679 Updated this week
- OCR & Document Extraction using vision models ☆12,041 Updated 8 months ago
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,861 Updated 4 months ago
- The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more. ☆54,105 Updated this week
- 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming ☆63,818 Updated 2 weeks ago
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. ☆53,776 Updated this week