GAIR-NLP / MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
☆409Updated this week
Alternatives and similar repositories for MathPile:
Users who are interested in MathPile are comparing it to the repositories listed below
- ☆147Updated 10 months ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)☆364Updated 7 months ago
- SOTA Math Opensource LLM☆331Updated last year
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context☆454Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆177Updated last year
- ☆312Updated 6 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆229Updated last month
- [ACL 2024] Progressive LLaMA with Block Expansion.☆499Updated 10 months ago
- ☆504Updated 4 months ago
- Family of LLMs for mathematical reasoning.☆256Updated 3 months ago
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]☆542Updated 3 months ago
- A large-scale, fine-grained, diverse preference dataset (and models).☆335Updated last year
- An opensource ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆300Updated last year
- ☆264Updated 8 months ago
- FireAct: Toward Language Agent Fine-tuning☆274Updated last year
- GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well a…☆349Updated 11 months ago
- Evaluation suite for LLMs☆339Updated 3 months ago
- ☆325Updated last month
- PyTorch building blocks for the OLMo ecosystem☆177Updated this week
- [ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark☆374Updated 8 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆251Updated 6 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation☆218Updated last year
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718☆313Updated 6 months ago
- A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks☆260Updated 8 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.☆706Updated 6 months ago
- Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"☆226Updated last year
- The official evaluation suite and dynamic data release for MixEval.☆233Updated 4 months ago
- Code for Quiet-STaR☆721Updated 7 months ago
- ☆120Updated 9 months ago
- Unofficial implementation of AlpaGasus☆90Updated last year