EZ-hwh / AutoScraper

Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP'24]

☆474

Alternatives and similar repositories for AutoScraper

Users who are interested in AutoScraper are comparing it to the libraries listed below.

catid / self-discover
Implementation of Google's SELF-DISCOVER
☆298 Updated 11 months ago
SciPhi-AI / agent-search
AgentSearch is a framework for powering search agents and enabling customizable local search.
☆497 Updated last year
simbianai / taskgen
Task-based Agentic Framework using StrictJSON as the core
☆455 Updated 3 weeks ago
superagent-ai / super-rag
Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API.
☆381 Updated last year
dyabel / AnyTool
☆306 Updated last year
agent-husky / Husky-v1
Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and …
☆345 Updated last year
PragmaticMachineLearning / docai
Structured information extraction from documents
☆317 Updated 10 months ago
nexusflowai / NexusRaven
NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRav…
☆316 Updated last year
misbahsy / RAGTune
Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon)
☆264 Updated last year
cohere-ai / cohere-terrarium
A simple Python sandbox for helpful LLM data agents
☆277 Updated last year
suzgunmirac / meta-prompting
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
☆397 Updated last year
allenai / lumos
Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"
☆467 Updated last year
vaughanlove / PromptBreeder
Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language.
☆132 Updated last year
KarelDO / xmc.dspy
In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
☆434 Updated last year
plageon / HtmlRAG
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems (WWW 2025)
☆433 Updated last month
tjmlabs / AgentRun
The easiest and fastest way to run AI-generated Python code safely
☆330 Updated 8 months ago
lamini-ai / Lamini-Memory-Tuning
Banishing LLM Hallucinations Requires Rethinking Generalization
☆276 Updated last year
OSU-NLP-Group / SeeAct
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…
☆768 Updated 6 months ago
McGill-NLP / weblinx
WebLINX is a benchmark for building web navigation agents with conversational capabilities
☆156 Updated 5 months ago
CYQIQ / MultiCoT
Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph
☆147 Updated last year
kerekovskik / autologic
autologic is a Python package that implements the SELF-DISCOVER framework proposed in the paper SELF-DISCOVER: Large Language Models Self…
☆60 Updated last year
RCGAI / SimplyRetrieve
Lightweight chat AI platform featuring custom knowledge, open-source LLMs, prompt-engineering, retrieval analysis. Highly customizable. F…
☆212 Updated last year
harishsg993010 / LLM-Research-Scripts
☆434 Updated 10 months ago
migtissera / Sensei
Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI
☆222 Updated last year
ganarajpr / awesome-dspy
An Awesome list of curated DSPy resources.
☆390 Updated 5 months ago
khive-ai / lionagi
AGI SDK
☆356 Updated this week
aurelio-labs / semantic-chunkers
☆231 Updated last month
spcl / MRAG
Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"
☆222 Updated last month
langchain-ai / langchain-benchmarks
🦜💯 Flex those feathers!
☆253 Updated 9 months ago
deep-diver / llamaduo
[ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
☆313 Updated 3 weeks ago