mlfoundations / dclm
DataComp for Language Models
Related projects:
- A family of open-source Mixture-of-Experts (MoE) large language models
- Frees data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks.
- A native-PyTorch library for LLM fine-tuning
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
- Simple retrieval from LLMs at various context lengths to measure accuracy
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
- Minimalistic large language model 3D-parallelism training
- Reaching LLaMA2 Performance with 0.1M Dollars
- AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
- DeepSeek LLM: Let there be answers
- nanoGPT-style version of Llama 3.1
- Speeds up long-context LLM inference with approximate, dynamic sparse attention computation, reducing inference latency by up t…
- Tools for merging pretrained large language models.
- SGLang is a fast serving framework for large language models and vision language models.
- Multi-LoRA inference server that scales to thousands of fine-tuned LLMs
- High-quality datasets, tools, and concepts for LLM fine-tuning.
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
- Implementation of the training framework proposed in Self-Rewarding Language Models, from Meta AI
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable, and scalable pipelines based on verifi…
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
- TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024).
- ReFT: Representation Finetuning for Language Models
- Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
- Training LLMs with QLoRA + FSDP