cxcscmu / Craw4LLM
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
☆619Updated 2 months ago
Alternatives and similar repositories for Craw4LLM:
Users who are interested in Craw4LLM are comparing it to the repositories listed below.
- Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization"☆270Updated 3 months ago
- Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths☆473Updated last month
- 🌐 WebWalker: Benchmarking LLMs in Web Traversal☆390Updated last week
- An agentic company research tool powered by LangGraph and Tavily that conducts deep diligence on companies using a multi-agent framework.…☆546Updated this week
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking☆448Updated 2 weeks ago
- OpenAI DeepResearch alternative: an AI-driven research system that performs comprehensive, iterative research on any topic using multiple…☆600Updated last week
- ☆240Updated 8 months ago
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆468Updated last month
- ☆449Updated last month
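- Secretary is an automated social media analysis tool for following and analyzing content on social media platforms, using large language models to analyze that content intelligently. It automatically fetches the latest posts from specified accounts, analyzes them according to configured analysis prompts, and pushes the results to specified users through a WeCom (Enterprise WeChat) bot. By flexibly configuring the analysis prompts, it can target…☆302Updated 2 weeks ago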
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what…☆311Updated 3 months ago
- Unsloth Fine-tuning Notebooks for Google Colab, Kaggle, Hugging Face and more.☆299Updated this week
- ☆585Updated last month
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs☆431Updated 3 months ago
- Mentis: A powerful multi-agent orchestration framework built on LangGraph.☆230Updated 2 weeks ago
- Query and Summarize your chat messages.☆945Updated 5 months ago
- FlexRAG: A RAG Framework for Information Retrieval and Generation.☆165Updated last week
- A General-Purpose AI Agent ✨☆336Updated 2 weeks ago
- ☆222Updated 5 months ago
- AI ContentCraft is an all-in-one content creation suite that helps creators generate stories, podcast scripts, and multimedia content usi…☆334Updated 3 months ago
- ☆575Updated 6 months ago
- Train a Language Model with GRPO to create a schedule from a list of events and priorities☆145Updated last week
- Your first AI prompt engineer☆376Updated 6 months ago
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel…☆117Updated 2 months ago
- A Model Context Protocol server for searching and analyzing arXiv papers☆1,064Updated 2 weeks ago
- Build & Optimize your RAG.☆643Updated 2 weeks ago
- Full Stack application for retrieving Stock Data and News using LLM, LangChain and LangGraph☆604Updated 5 months ago
- recursive rag with r1 reasoning☆290Updated 2 months ago
- 🧠 The most comprehensive collection of excellent Qwen prompts in the world. Feel free to contribute you…☆214Updated 5 months ago
- ☆462Updated 2 months ago