Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding
☆53 · Dec 12, 2024 · Updated last year
Alternatives and similar repositories for MultiUI
Users that are interested in MultiUI are comparing it to the libraries listed below
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ☆298 · Jul 18, 2025 · Updated 7 months ago
- A huge dataset for Document Visual Question Answering ☆20 · Jul 29, 2024 · Updated last year
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆64 · Oct 19, 2024 · Updated last year
- ☆20 · Apr 24, 2024 · Updated last year
- Seamless Voice Interactions with LLMs ☆12 · Oct 28, 2023 · Updated 2 years ago
- Detecting Drift in a Diabetes Dataset using Taipy ☆12 · May 19, 2025 · Updated 9 months ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning ☆23 · Sep 17, 2024 · Updated last year
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆395 · May 20, 2024 · Updated last year
- Interface for GenAI-Arena [NeurIPS24] ☆17 · Feb 27, 2024 · Updated 2 years ago
- 8+ agents work together to build a game in pygame ☆15 · Jul 27, 2024 · Updated last year
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆435 · Apr 20, 2025 · Updated 10 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆70 · Dec 9, 2024 · Updated last year
- ChatGPT-like interface for working with AI Agents ☆20 · Sep 18, 2024 · Updated last year
- We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks. ☆16 · Aug 30, 2024 · Updated last year
- Taipy Demo of a Realtime Dashboard of Air Pollution around a Factory ☆17 · May 20, 2025 · Updated 9 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 · Dec 29, 2024 · Updated last year
- (ICLR 2025) The Official Code Repository for GUI-World. ☆68 · Dec 18, 2024 · Updated last year
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆44 · Dec 19, 2024 · Updated last year
- Modern vibecoding Three.js Starter Kit with Cloudflare Deployment ☆51 · Apr 18, 2025 · Updated 10 months ago
- R1-like Computer-use Agent ☆89 · Mar 21, 2025 · Updated 11 months ago
- Simple LaMa Inpainting: An easy-to-use implementation of the LaMa (Large Mask) inpainting model. Remove unwanted objects or fill in missi… ☆23 · Nov 5, 2024 · Updated last year
- A multi-page application to visualize and predict Covid numbers ☆22 · May 19, 2025 · Updated 9 months ago
- ☆18 · Oct 19, 2024 · Updated last year
- LocalPlexity is a lite version of Perplexity aimed at 100% privacy and openness. Everything is done locally, in your browser, from search… ☆21 · Aug 12, 2024 · Updated last year
- ☆30 · Jul 3, 2025 · Updated 7 months ago
- A benchmark that focuses on the sampling dilemma in long-video tasks. Through well-designed tasks, it evaluates the sampling efficiency o… ☆26 · Aug 7, 2025 · Updated 6 months ago
- Animefy: ComfyUI workflow designed to convert images or videos into an anime-like style automatically. ☆22 · Jul 2, 2024 · Updated last year
- Functional Benchmarks and the Reasoning Gap ☆89 · Oct 1, 2024 · Updated last year
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP ☆23 · Dec 28, 2023 · Updated 2 years ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆21 · Apr 2, 2024 · Updated last year
- Official implementation of Zero-Hero paper ☆30 · Feb 13, 2025 · Updated last year
- ☆30 · Jul 5, 2023 · Updated 2 years ago
- Streamlit Web UI for AGiXT ☆28 · Jan 8, 2026 · Updated last month
- ☆26 · Aug 29, 2024 · Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆101 · Oct 19, 2023 · Updated 2 years ago
- Web Interface for Vision Language Models Including InternVLM2 ☆25 · Jul 29, 2024 · Updated last year
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning". ☆165 · Dec 27, 2023 · Updated 2 years ago
- [NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆385 · Feb 11, 2026 · Updated 2 weeks ago
- Official PyTorch implementation of the paper "Equivariant Image Modeling" (https://arxiv.org/abs/2503.18948) ☆35 · Aug 1, 2025 · Updated 6 months ago