mlfoundations / dclm
DataComp for Language Models
☆1,206 · Updated last month
Alternatives and similar repositories for dclm:
Users interested in dclm are also comparing it to the libraries listed below.
- Recipes to scale inference-time compute of open models ☆932 · Updated this week
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,083 · Updated last year
- ☆2,289 · Updated this week
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) ☆1,217 · Updated last month
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,425 · Updated 10 months ago
- [NeurIPS'24 Spotlight] Speeds up long-context LLM inference with approximate, dynamic sparse attention computation, which reduces in… ☆874 · Updated 2 weeks ago
- Large Reasoning Models ☆787 · Updated last month
- Minimalistic large language model 3D-parallelism training ☆1,386 · Updated this week
- Performs simple retrieval from LLMs at various context lengths to measure accuracy ☆1,654 · Updated 5 months ago
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,481 · Updated 2 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,378 · Updated last month
- A nanoGPT-style version of Llama 3.1 ☆1,290 · Updated 5 months ago
- Frees data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks ☆2,147 · Updated last week
- Scalable toolkit for efficient model alignment ☆674 · Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆970 · Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆965 · Updated 5 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR ☆1,908 · Updated 5 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆831 · Updated last month
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation: ☆1,885 · Updated 2 weeks ago
- ☆996 · Updated last month
- Data and tools for generating and inspecting OLMo pre-training data ☆1,060 · Updated this week
- Scalable RL solution for advanced reasoning of language models ☆873 · Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆2,669 · Updated last week
- ReFT: Representation Finetuning for Language Models ☆1,373 · Updated 2 weeks ago
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆969 · Updated 2 months ago
- Calculates tokens/s and GPU memory requirements for any LLM; supports llama.cpp/ggml/bnb/QLoRA quantization ☆1,199 · Updated last month
- ☆4,050 · Updated 7 months ago
- Tools for merging pretrained large language models ☆5,113 · Updated last week
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware ☆687 · Updated 3 months ago