Ryaang / Web-page-Screenshot-Segmentation
Automatically split long webpage screenshots into chunks for input into models with shorter contexts.
☆18 Updated 5 months ago
Alternatives and similar repositories for Web-page-Screenshot-Segmentation:
Users who are interested in Web-page-Screenshot-Segmentation are comparing it to the libraries listed below.
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen… ☆269 Updated 8 months ago
- Self-hosted version of Microsoft's OmniParser Image-to-text model ☆64 Updated 4 months ago
- This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João… ☆64 Updated 6 months ago
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆136 Updated 3 weeks ago
- My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models" ☆98 Updated last year
- Easy to deploy. A cloud service providing a Python code interpreter sandbox for Code-Interpreter. ☆51 Updated last year
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning ☆221 Updated 3 months ago
- A library integrating embedding and reranker models from OpenAI, SentenceTransformers etc for semantic search in vector database. ☆39 Updated 3 weeks ago
- GUI Grounding for Professional High-Resolution Computer Use ☆184 Updated 2 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks ☆79 Updated last year
- Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems" ☆50 Updated 2 months ago
- Unofficial PyTorch implementation of the Dom-LM paper. ☆33 Updated 2 years ago
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆35 Updated last year
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038) ☆178 Updated 3 weeks ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ☆230 Updated 7 months ago
- Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation". ☆230 Updated 10 months ago
- ☆27 Updated last year
- ☆22 Updated 2 months ago
- Build event-driven workflows with python async functions ☆34 Updated 7 months ago
- We release a general framework for prompting LLMs to manipulate software in a closed-loop manner. ☆127 Updated 8 months ago
- Enjoy easier conversations with LLM ☆33 Updated last month
- open-o1: Using GPT-4o with CoT to Create o1-like Reasoning Chains ☆115 Updated 3 months ago
- Official code for the publication "Large Language Models as Zero-shot Dialogue State Tracker through Function Calling" https//arxiv.org/a… ☆60 Updated 8 months ago
- ☆51 Updated 9 months ago
- An open platform for enhancing the capability of LLMs in workflow orchestration. ☆133 Updated last month
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables". ☆127 Updated 11 months ago
- Code and data for the paper "DBCᴏᴘɪʟᴏᴛ: Natural Language Querying over Massive Database via Schema Routing" (EDBT 2025) ☆95 Updated last month
- iauto is a low-code engine for building and deploying AI agents ☆86 Updated 5 months ago
- SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation ☆19 Updated 2 weeks ago
- Mark web pages for use with vision-language models ☆35 Updated 3 weeks ago