bigcode-project / Megatron-LM
Ongoing research training transformer models at scale
☆380 Updated 5 months ago
Alternatives and similar repositories for Megatron-LM:
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- Dromedary: towards helpful, ethical and reliable LLMs. ☆1,133 Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA ☆626 Updated 11 months ago
- ☆266 Updated last year
- Official repository for LongChat and LongEval ☆519 Updated 8 months ago
- ☆350 Updated last year
- [ICLR 2024] Lemur: Open Foundation Models for Language Agents ☆540 Updated last year
- Fine-tune SantaCoder for Code/Text Generation. ☆188 Updated last year
- This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and bench… ☆583 Updated last year
- C++ implementation for 💫StarCoder ☆450 Updated last year
- CodeGen2 models for program synthesis ☆274 Updated last year
- Salesforce open-source LLMs with 8k sequence length. ☆715 Updated this week
- Fast Inference Solutions for BLOOM ☆563 Updated 3 months ago
- Tune any FALCON in 4-bit ☆466 Updated last year
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ☆815 Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆524 Updated 4 months ago
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆903 Updated 3 months ago
- OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA ☆300 Updated last year
- ☆456 Updated last year
- ☆393 Updated 5 months ago
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ☆351 Updated last year
- Run evaluation on LLMs using human-eval benchmark ☆393 Updated last year
- Inference code for Mistral and Mixtral hacked up into original Llama implementation ☆371 Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆420 Updated last year
- ☆124 Updated last year
- 🐙 OctoPack: Instruction Tuning Code Large Language Models ☆449 Updated 4 months ago
- ☆724 Updated 7 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆687 Updated 9 months ago
- Minimal library to train LLMs on TPU in JAX with pjit(). ☆280 Updated last year
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆792 Updated 6 months ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,405 Updated 9 months ago