mistralai / megablocks-public
☆860 · Updated 11 months ago
Related projects
Alternatives and complementary repositories for megablocks-public
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI (☆1,333, updated 6 months ago)
- Arena-Hard-Auto: An automatic LLM benchmark. (☆637, updated this week)
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining (☆667, updated 6 months ago)
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models (☆1,385, updated 8 months ago)
- YaRN: Efficient Context Window Extension of Large Language Models (☆1,339, updated 6 months ago)
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding (☆1,143, updated 3 weeks ago)
- Inference code for Mistral and Mixtral hacked up into the original Llama implementation (☆373, updated 11 months ago)
- Minimalistic large language model 3D-parallelism training (☆1,227, updated this week)
- Inference code for Persimmon-8B (☆417, updated last year)
- A bagel, with everything. (☆312, updated 6 months ago)
- Code for Quiet-STaR (☆639, updated 2 months ago)
- Fine-tune mistral-7B on 3090s, a100s, h100s (☆702, updated last year)
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" (☆801, updated 2 months ago)
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. (☆642, updated last month)
- Scalable toolkit for efficient model alignment (☆611, updated this week)
- The official implementation of Self-Play Fine-Tuning (SPIN) (☆1,034, updated 6 months ago)
- Serving multiple LoRA-finetuned LLMs as one (☆979, updated 6 months ago)
- Batched LoRAs (☆336, updated last year)
- This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models? (☆694, updated 2 weeks ago)
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… (☆1,612, updated this week)
- Code for fine-tuning Platypus fam LLMs using LoRA (☆623, updated 9 months ago)
- Doing simple retrieval from LLMs at various context lengths to measure accuracy (☆1,551, updated 2 months ago)
- This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and bench… (☆582, updated 11 months ago)