EleutherAI / gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
☆6,937 · Updated this week
Related projects
Alternatives and complementary repositories for gpt-neox
- Model parallel transformers in JAX and Haiku ☆6,291 · Updated last year
- An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. ☆8,233 · Updated 2 years ago
- Running large language models on a single GPU for throughput-oriented scenarios. ☆9,186 · Updated last week
- Repo for external large-scale work ☆6,513 · Updated 6 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆12,630 · Updated last week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,036 · Updated 4 months ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,500 · Updated 10 months ago
- Accessible large language models via k-bit quantization for PyTorch. ☆6,260 · Updated this week
- The RedPajama-Data repository contains code for preparing large datasets for training large language models. ☆4,568 · Updated 3 weeks ago
- Instruct-tune LLaMA on consumer hardware ☆18,634 · Updated 3 months ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆5,988 · Updated 2 months ago
- Ongoing research training transformer models at scale ☆10,497 · Updated this week
- StableLM: Stability AI Language Models ☆15,831 · Updated 7 months ago
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ☆9,258 · Updated 3 months ago
- LLaMA: Open and Efficient Foundation Language Models ☆2,807 · Updated last year
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆8,612 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ☆7,919 · Updated this week
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset ☆7,384 · Updated last year
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. ☆4,931 · Updated 7 months ago
- 4-bit quantization of LLaMA using GPTQ ☆2,993 · Updated 3 months ago
- GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023) ☆7,658 · Updated last year
- Training and serving large-scale neural networks with auto parallelization. ☆3,073 · Updated 11 months ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,224 · Updated 2 months ago
- An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm. ☆4,458 · Updated last month
- Train transformer language models with reinforcement learning. ☆9,990 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆36,902 · Updated this week
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,266 · Updated last week