microsoft / OmniParser
A simple screen parsing tool towards pure vision based GUI agent
☆24,041 Updated 3 months ago
Alternatives and similar repositories for OmniParser
Users that are interested in OmniParser are comparing it to the libraries listed below
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆73,975 Updated this week
- The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra ☆19,918 Updated last week
- ☆8,565 Updated last month
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆18,474 Updated 3 weeks ago
- 🖥️ Run AI Agent in your browser. ☆15,336 Updated 3 months ago
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆57,438 Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training ☆16,203 Updated last week
- No fortress, purely open ground. OpenManus is Coming. ☆51,309 Updated last month
- Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your p… ☆56,239 Updated this week
- MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone ☆22,426 Updated 2 months ago
- Automate browser based workflows with AI ☆19,839 Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,725 Updated 2 months ago
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆12,694 Updated 2 months ago
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆69,981 Updated this week
- Driving all platforms UI automation with vision-based model ☆10,900 Updated this week
- OCR & Document Extraction using vision models ☆11,992 Updated 7 months ago
- A lightweight, powerful framework for multi-agent workflows ☆17,812 Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆32,750 Updated this week
- Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work t… ☆41,609 Updated this week
- AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording ☆16,170 Updated last week
- Like Manus, Computer Use Agent(CUA) and Omniparser, we are computer-using agents. AI-driven local automation assistant that uses natural l… ☆3,781 Updated 7 months ago
- Build Real-Time Knowledge Graphs for AI Agents ☆21,187 Updated this week
- The AI Browser Automation Framework ☆19,544 Updated this week
- Agent S: an open agentic framework that uses computers like a human ☆8,806 Updated this week
- Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM! ☆8,789 Updated last week
- A programming framework for agentic AI ☆52,550 Updated 2 months ago
- A research prototype of a human-centered web agent ☆9,112 Updated last week
- The unified stack for multi-agent systems. ☆36,070 Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud. ☆25,791 Updated 2 months ago
- Get started with building Fullstack Agents using Gemini 2.5 and LangGraph ☆17,537 Updated 3 weeks ago