cxcscmu / Craw4LLM
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
☆632 Updated 4 months ago
Alternatives and similar repositories for Craw4LLM
Users who are interested in Craw4LLM are comparing it to the repositories listed below.
- [ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths ☆492 Updated last month
- Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization" ☆280 Updated 5 months ago
- A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more ☆296 Updated last month
- Mentis: A powerful multi-agent orchestration framework built on LangGraph. ☆257 Updated last month
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking ☆454 Updated 2 months ago
- OpenAI DeepResearch alternative, An AI-driven research system that performs comprehensive, iterative research on any topic using multiple… ☆617 Updated last month
- AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/B… ☆554 Updated last week
- ☆258 Updated 10 months ago
- Secretary is an AI-powered tool that analyzes social media content from specified accounts and delivers results via WeChat. It supports c… ☆330 Updated last month
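- MultiAgentPPT is an intelligent presentation generation system that integrates the A2A (Agent2Agent) + MCP (Model Context Protocol) + ADK (Agent Development Kit) architecture, supporting, via multi-agent collaboration and streaming concurrency mechanisms… ☆771 Updated this week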
- Lemon AI is the first Full-stack, Open-source, Agentic AI framework, offering a fully local alternative to platforms like Manus & Genspar… ☆515 Updated this week
- ☆226 Updated last month
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs ☆446 Updated 5 months ago
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what… ☆314 Updated 5 months ago
- The Level-Navi Agent, a framework that requires no training and utilizes large language models for deep query understanding and precise s… ☆79 Updated 6 months ago
- ☆496 Updated 4 months ago
- ☆477 Updated 4 months ago
- ☆582 Updated 8 months ago
- ☆239 Updated 2 months ago
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆511 Updated last month
- A General-Purpose AI Agent ✨ ☆379 Updated 2 weeks ago
- AI ContentCraft is an all-in-one content creation suite that helps creators generate stories, podcast scripts, and multimedia content usi… ☆364 Updated last week
- recursive rag with r1 reasoning ☆325 Updated last month
- python package to parse pdfs with different parsers ☆195 Updated 7 months ago
- A MCP (Model Context Protocol) server for PowerPoint manipulation using python-pptx. This server provides tools for creating, editing, an… ☆573 Updated 3 weeks ago
- 🍎APPL: A Prompt Programming Language. Seamlessly integrate LLMs with programs. ☆251 Updated 4 months ago
- ☆1,034 Updated this week
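- This is a server based on the Model Context Protocol (MCP) that provides preset prompt templates according to the user's task requirements, helping Cline/Cursor/Windsurf... perform various tasks more efficiently. The server returns the preset prompts as tools so that, in Cursor and… ☆569 Updated last month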
- MemoryOS is designed to provide a memory operating system for personalized AI agents. ☆423 Updated this week
- Query and Summarize your chat messages. ☆1,000 Updated 7 months ago