Ryaang / Web-page-Screenshot-Segmentation

Automatically split long webpage screenshots into chunks for input into models with shorter contexts.

☆18

Alternatives and similar repositories for Web-page-Screenshot-Segmentation:

Users who are interested in Web-page-Screenshot-Segmentation are comparing it to the libraries listed below.

XinyuanWangCS / PromptAgent
This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen…
☆269 Updated 8 months ago
addy999 / omniparser-api
Self-hosted version of Microsoft's OmniParser Image-to-text model
☆64 Updated 4 months ago
joaodsmarques / LumberChunker
This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João…
☆64 Updated 6 months ago
Reason-Wang / ToolGen
[ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"
☆136 Updated 3 weeks ago
kyegomez / Algorithm-Of-Thoughts
My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models"
☆98 Updated last year
zhangzhejian / codeinterpreter-codebox
Easy to deploy. A cloud service providing a Python code interpreter sandbox for Code-Interpreter.
☆51 Updated last year
zjunlp / AutoAct
[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
☆221 Updated 3 months ago
milvus-io / milvus-model
A library integrating embedding and reranker models from OpenAI, SentenceTransformers, etc. for semantic search in vector databases.
☆39 Updated 3 weeks ago
likaixin2000 / ScreenSpot-Pro-GUI-Grounding
GUI Grounding for Professional High-Resolution Computer Use
☆184 Updated 2 months ago
SivilTaram / code-html-to-markdown
A lightweight script for converting HTML pages to Markdown format with support for code blocks
☆79 Updated last year
sony / talkhier
Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems"
☆50 Updated 2 months ago
ilyalasy / DOM-LM
Unofficial PyTorch implementation of the DOM-LM paper.
☆33 Updated 2 years ago
deepsearch-ai / deepsearch
A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images
☆35 Updated last year
diagram-of-thought / diagram-of-thought
Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)
☆178 Updated 3 weeks ago
TIGER-AI-Lab / LongRAG
Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".
☆230 Updated 7 months ago
CraftJarvis / RAT
Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation".
☆230 Updated 10 months ago
1rgs / jsonformer-claude
☆27 Updated last year
lightblue-tech / lb-reranker
☆22 Updated 2 months ago
memodb-io / drive-flow
Build event-driven workflows with Python async functions
☆34 Updated 7 months ago
BraveGroup / SheetCopilot
We release a general framework for prompting LLMs to manipulate software in a closed-loop manner.
☆127 Updated 8 months ago
modelscope / PromptScope
Enjoy easier conversations with LLM
☆33 Updated last month
shibing624 / open-o1
open-o1: Using GPT-4o with CoT to Create o1-like Reasoning Chains
☆115 Updated 3 months ago
facebookresearch / FnCTOD
Official code for the publication "Large Language Models as Zero-shot Dialogue State Tracker through Function Calling" https//arxiv.org/a…
☆60 Updated 8 months ago
read-agent / read-agent.github.io
☆51 Updated 9 months ago
OpenBMB / WorkflowLLM
An open platform for enhancing the capability of LLMs in workflow orchestration.
☆133 Updated last month
OSU-NLP-Group / TableLlama
[NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables".
☆127 Updated 11 months ago
tshu-w / DBCopilot
Code and data for the paper "DBCᴏᴘɪʟᴏᴛ: Natural Language Querying over Massive Database via Schema Routing" (EDBT 2025)
☆95 Updated last month
shellc / iauto
iauto is a low-code engine for building and deploying AI agents
☆86 Updated 5 months ago
RUCKBReasoning / SpreadsheetBench
SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation
☆19 Updated 2 weeks ago
reidbarber / webmarker
Mark web pages for use with vision-language models
☆35 Updated 3 weeks ago