Ongoing research training transformer models at scale
☆395 Aug 20, 2024 Updated last year
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- Home of StarCoder: fine-tuning & inference! ☆7,530 Feb 27, 2024 Updated 2 years ago
- OctoPack: Instruction Tuning Code Large Language Models ☆478 Feb 5, 2025 Updated last year
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. ☆5,169 Oct 27, 2025 Updated 4 months ago
- ☆491 Aug 15, 2024 Updated last year
- A framework for the evaluation of autoregressive code generation language models. ☆1,020 Jul 22, 2025 Updated 7 months ago
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ☆9,477 Jun 7, 2025 Updated 8 months ago
- CodeGen2 models for program synthesis ☆271 Jun 12, 2023 Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,436 Mar 20, 2024 Updated last year
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ☆1,008 Jul 29, 2024 Updated last year
- ☆15 Oct 24, 2023 Updated 2 years ago
- Fine-tune SantaCoder for Code/Text Generation. ☆196 Apr 11, 2023 Updated 2 years ago
- distributed trainer for LLMs ☆589 May 20, 2024 Updated last year
- Repository for analysis and experiments in the BigCode project. ☆128 Mar 20, 2024 Updated last year
- This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur… ☆558 Jan 21, 2025 Updated last year
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,481 May 1, 2025 Updated 10 months ago
- Simple Implementation of a Transformer in the new framework MLX by Apple ☆19 Nov 18, 2024 Updated last year
- Large Language Model Text Generation Inference ☆10,788 Jan 8, 2026 Updated last month
- Ongoing research training transformer models at scale ☆15,461 Updated this week
- Dromedary: towards helpful, ethical and reliable LLMs. ☆1,144 Sep 18, 2025 Updated 5 months ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆6,081 Jul 1, 2025 Updated 8 months ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,414 Jun 2, 2025 Updated 9 months ago
- APPS: Automated Programming Progress Standard (NeurIPS 2021) ☆510 Jun 19, 2024 Updated last year
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ☆7,395 Feb 3, 2026 Updated last month
- ☆1,505 May 12, 2023 Updated 2 years ago
- Home of CodeT5: Open Code LLMs for Code Understanding and Generation ☆3,096 Jan 20, 2024 Updated 2 years ago
- ☆282 Apr 25, 2023 Updated 2 years ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,741 Jan 8, 2024 Updated 2 years ago
- Scaling Data-Constrained Language Models ☆342 Jun 28, 2025 Updated 8 months ago
- Accessible large language models via k-bit quantization for PyTorch. ☆7,997 Updated this week
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆63 Apr 10, 2024 Updated last year
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset ☆7,533 Jul 16, 2023 Updated 2 years ago
- The RedPajama-Data repository contains code for preparing large datasets for training large language models. ☆4,923 Dec 7, 2024 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,843 Jun 10, 2024 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆2,516 Aug 13, 2024 Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,740 Nov 15, 2025 Updated 3 months ago
- LLM powered development for VSCode ☆1,318 Jul 17, 2024 Updated last year
- C++ implementation for 💫StarCoder ☆459 Sep 9, 2023 Updated 2 years ago
- Open Multilingual Chatbot for Everyone ☆1,277 Jun 8, 2025 Updated 8 months ago
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆209 Jan 13, 2024 Updated 2 years ago