cxcscmu / Craw4LLM
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
☆638 Updated 6 months ago
Alternatives and similar repositories for Craw4LLM
Users who are interested in Craw4LLM are comparing it to the repositories listed below.
- [ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths ☆497 Updated 3 months ago
- Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization" ☆283 Updated last month
- [EMNLP 2025] OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking ☆456 Updated 3 weeks ago
- Mentis: A powerful multi-agent orchestration framework built on LangGraph. ☆281 Updated 3 months ago
- A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more ☆315 Updated last week
- OpenAI DeepResearch alternative: an AI-driven research system that performs comprehensive, iterative research on any topic using multiple… ☆629 Updated 3 months ago
- ☆273 Updated last year
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what… ☆320 Updated 7 months ago
- ☆337 Updated 2 weeks ago
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs ☆472 Updated 7 months ago
- ☆237 Updated 3 months ago
- Secretary is an AI-powered tool that analyzes social media content from specified accounts and delivers results via WeChat. It supports c… ☆340 Updated last month
- ☆597 Updated 10 months ago
- recursive rag with r1 reasoning ☆329 Updated 3 months ago
- ☆261 Updated 4 months ago
- ReMe: Memory Management Framework for Agents - Remember Me, Refine Me. ☆529 Updated this week
- A General-Purpose AI Agent ✨ ☆393 Updated last month
- [EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents ☆534 Updated 3 months ago
- The Level-Navi Agent, a framework that requires no training and utilizes large language models for deep query understanding and precise s… ☆81 Updated 8 months ago
- AI-Powered Video Retrieval & Clipping Tool ☆334 Updated 3 weeks ago
- AI ContentCraft is an all-in-one content creation suite that helps creators generate stories, podcast scripts, and multimedia content usi… ☆372 Updated 2 months ago
- AI video editing ☆218 Updated last month
- python package to parse pdfs with different parsers ☆200 Updated last week
- Fogsight is an AI agent and animation engine powered by Large Language Models. ☆915 Updated 2 weeks ago
- A complete 7-layer intelligent memory system for AI Agents with multi-modal memory fusion; also supports context_engineering ☆122 Updated 2 months ago
- ☆491 Updated 6 months ago
- Scout is an experimental Agent implementation designed on top of the Roo Code VS Code extension. It focuses on precise web information gathering, research, and interaction by simulating human behavior, aiming to turn Roo Code into a powerful web research assistant. ☆114 Updated 5 months ago
- 🧠 The most comprehensive collection of excellent Qwen prompts in the world. Feel free to contribute you… ☆264 Updated 3 weeks ago
- Semantic Search on Wikipedia with Upstash Vector ☆472 Updated 5 months ago
- Lemon AI is the first Full-stack, Open-source, Agentic AI framework, offering a fully local alternative to platforms like Manus & Genspar… ☆643 Updated last month