Alab-NII / chain-of-thought
Research papers about Chain of Thought (CoT)
☆57 Updated 2 years ago
Alternatives and similar repositories for chain-of-thought
Users who are interested in chain-of-thought are comparing it to the libraries listed below
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆82 Updated last year
- DocBench: A Benchmark for Evaluating LLM-based Document Reading Systems ☆53 Updated last year
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆136 Updated last year
- List of papers on Self-Correction of LLMs. ☆80 Updated 10 months ago
- Code implementation of synthetic continued pretraining ☆137 Updated 10 months ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning ☆85 Updated last year
- This is the code repo for our paper "Enhancing Knowledge Integration and Utilization of Large Language Models via Constructivist Cognitio… ☆109 Updated last month
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token ☆157 Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆122 Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 Updated last year
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆51 Updated 5 months ago
- ☆74 Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆128 Updated 9 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆33 Updated last year
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆190 Updated last year
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆76 Updated 10 months ago
- ☆68 Updated 2 years ago
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering" ☆74 Updated 6 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking ☆98 Updated last month
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar… ☆51 Updated last year
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: … ☆88 Updated 2 months ago
- Contrastive Chain-of-Thought Prompting ☆68 Updated last year
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆263 Updated 4 months ago
- ☆151 Updated 3 weeks ago
- a curated list of the role of small models in the LLM era ☆107 Updated last year
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆155 Updated 2 years ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆52 Updated 11 months ago
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆51 Updated 3 months ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆63 Updated 2 weeks ago
- MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation ☆28 Updated last year