Ongoing research training transformer models at scale
★396 · Aug 20, 2024 · Updated last year
Alternatives and similar repositories for Megatron-LM
Users that are interested in Megatron-LM are comparing it to the libraries listed below.
- Home of StarCoder: fine-tuning & inference! ★7,522 · Feb 27, 2024 · Updated 2 years ago
- 🐙 OctoPack: Instruction Tuning Code Large Language Models ★479 · Feb 5, 2025 · Updated last year
- ★492 · Aug 15, 2024 · Updated last year
- A framework for the evaluation of autoregressive code generation language models. ★1,032 · Jul 22, 2025 · Updated 8 months ago
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. ★5,175 · Oct 27, 2025 · Updated 5 months ago
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ★9,471 · Jun 7, 2025 · Updated 10 months ago
- CodeGen2 models for program synthesis ★271 · Jun 12, 2023 · Updated 2 years ago
- Fine-tune SantaCoder for Code/Text Generation. ★197 · Apr 11, 2023 · Updated 3 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ★1,437 · Mar 20, 2024 · Updated 2 years ago
- ★15 · Oct 24, 2023 · Updated 2 years ago
- Repository for analysis and experiments in the BigCode project. ★128 · Mar 20, 2024 · Updated 2 years ago
- This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur… ★565 · Jan 21, 2025 · Updated last year
- distributed trainer for LLMs ★589 · May 20, 2024 · Updated last year
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ★1,010 · Jul 29, 2024 · Updated last year
- ★26 · Mar 6, 2024 · Updated 2 years ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ★1,480 · May 1, 2025 · Updated 11 months ago
- APPS: Automated Programming Progress Standard (NeurIPS 2021) ★523 · Jun 19, 2024 · Updated last year
- Dromedary: towards helpful, ethical and reliable LLMs. ★1,142 · Sep 18, 2025 · Updated 6 months ago
- C++ implementation for 💫StarCoder ★458 · Sep 9, 2023 · Updated 2 years ago
- ★1,505 · May 12, 2023 · Updated 2 years ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,447 · Jun 2, 2025 · Updated 10 months ago
- ★39 · Oct 3, 2022 · Updated 3 years ago
- Large Language Model Text Generation Inference ★10,830 · Mar 21, 2026 · Updated 3 weeks ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ★6,080 · Jul 1, 2025 · Updated 9 months ago
- Ongoing research training transformer models at scale ★15,985 · Updated this week
- A multi-programming language benchmark for LLMs ★301 · Jan 28, 2026 · Updated 2 months ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,743 · Jan 8, 2024 · Updated 2 years ago
- ★282 · Apr 25, 2023 · Updated 2 years ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ★63 · Apr 10, 2024 · Updated 2 years ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ★317 · Mar 20, 2023 · Updated 3 years ago
- ★12 · Oct 7, 2023 · Updated 2 years ago
- ★19 · Mar 1, 2024 · Updated 2 years ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries ★7,411 · Feb 3, 2026 · Updated 2 months ago
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning" ★117 · Jan 9, 2024 · Updated 2 years ago
- Scaling Data-Constrained Language Models ★343 · Jun 28, 2025 · Updated 9 months ago
- LLM powered development for VSCode ★1,315 · Apr 2, 2026 · Updated last week
- Home of CodeT5: Open Code LLMs for Code Understanding and Generation ★3,100 · Jan 20, 2024 · Updated 2 years ago
- Accessible large language models via k-bit quantization for PyTorch. ★8,107 · Updated this week
- Code for the paper "Evaluating Large Language Models Trained on Code" ★3,188 · Jan 17, 2025 · Updated last year