TencentQQGYLab / AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
☆5,807 Updated last month
Alternatives and similar repositories for AppAgent
Users who are interested in AppAgent are comparing it to the repositories listed below:
- Mobile-Agent: The Powerful Mobile Device Operation Assistant Family ☆4,190 Updated last month
- An Autonomous LLM Agent for Complex Task Solving ☆8,330 Updated 9 months ago
- The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling st… ☆2,090 Updated 6 months ago
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild ☆4,286 Updated 5 months ago
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT) ☆6,351 Updated 4 months ago
- The Desktop AgentOS. ☆7,222 Updated this week
- An LLM-based Web Navigating Agent (KDD'24) ☆858 Updated 7 months ago
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆8,648 Updated this week
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆19,414 Updated 2 months ago
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,348 Updated 2 months ago
- An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own se… ☆3,034 Updated last year
- MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo. ☆7,348 Updated 6 months ago
- Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing> ☆4,642 Updated 2 months ago
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆5,878 Updated 9 months ago
- This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project. ☆11,961 Updated last month
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ☆8,723 Updated last week
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning. ☆5,027 Updated 5 months ago
- A state-of-the-art-level open visual language model | multimodal pretrained model ☆6,531 Updated 11 months ago
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI… ☆8,457 Updated this week
- Python scraper based on AI ☆19,570 Updated this week
- An open-sourced end-to-end VLM-based GUI Agent ☆939 Updated last month
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o ☆8,087 Updated 2 weeks ago
- Multi agent system for AI-driven software development. Combine LLM with DevOps tools to convert natural language requirements into workin… ☆5,901 Updated 9 months ago
- Crawl a site to generate knowledge files to create your own custom GPT from a URL ☆21,473 Updated 3 months ago
- 🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications, which primarily provides … ☆4,534 Updated 8 months ago
- LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations. ☆21,399 Updated this week
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,034 Updated 5 months ago
- AIOS: AI Agent Operating System ☆4,128 Updated last week
- ModelScope-Agent: An agent framework connecting models in ModelScope with the world ☆3,121 Updated this week
- A generalized information-seeking agent system with Large Language Models (LLMs). ☆1,160 Updated 10 months ago