mlfoundations / dclm
DataComp for Language Models
☆1,157 · Updated this week
Related projects
Alternatives and complementary repositories for dclm
- [NeurIPS'24 Spotlight] Speeds up long-context LLM inference with approximate, dynamic sparse attention calculation, which reduces in… ☆791 · Updated this week
- Doing simple retrieval from LLMs at various context lengths to measure accuracy ☆1,565 · Updated 3 months ago
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,008 · Updated 10 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks. ☆2,045 · Updated this week
- Scalable data pre-processing and curation toolkit for LLMs ☆615 · Updated this week
- Minimalistic large language model 3D-parallelism training ☆1,260 · Updated this week
- A family of open-source Mixture-of-Experts (MoE) Large Language Models ☆1,391 · Updated 8 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable, and scalable pipelines based on verifi… ☆1,634 · Updated this week
- Large Reasoning Models ☆580 · Updated this week
- nanoGPT-style version of Llama 3.1 ☆1,246 · Updated 3 months ago
- High-quality datasets, tools, and concepts for LLM fine-tuning. ☆2,010 · Updated 3 weeks ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆811 · Updated this week
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters ☆1,755 · Updated 10 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,205 · Updated this week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆803 · Updated 3 months ago
- DeepSeek LLM: Let there be answers ☆1,451 · Updated 9 months ago
- Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas" ☆883 · Updated last month
- Code for Quiet-STaR ☆651 · Updated 3 months ago
- Evaluate your LLM's responses with Prometheus and GPT-4 💯 ☆797 · Updated 2 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from Meta AI ☆1,336 · Updated 7 months ago
- Data and tools for generating and inspecting OLMo pre-training data. ☆993 · Updated this week
- Scalable toolkit for efficient model alignment ☆620 · Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆960 · Updated 3 months ago
- OLMoE: Open Mixture-of-Experts Language Models ☆460 · Updated this week
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. ☆1,765 · Updated this week
- From-scratch implementation of a sparse mixture-of-experts language model inspired by Andrej Karpathy's makemore :) ☆594 · Updated 3 weeks ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆744 · Updated this week
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆647 · Updated last month