labmlai / neox
Simple Annotated implementation of GPT-NeoX in PyTorch
☆110 · Updated 2 years ago
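For orientation, GPT-NeoX replaces learned absolute position embeddings with rotary position embeddings (RoPE) applied to the attention queries and keys. The snippet below is a minimal, self-contained PyTorch sketch of the interleaved-pair RoPE formulation; it is illustrative only, not code from this repository, and the function name, shapes, and default base are assumptions.

```python
# Illustrative sketch of rotary position embeddings (RoPE), the position
# encoding GPT-NeoX applies to query/key vectors. NOT code from labmlai/neox;
# names, shapes, and the base value are assumptions for this example.
import torch

def rope(x: torch.Tensor, base: float = 10_000.0) -> torch.Tensor:
    """Apply rotary position embeddings.

    x: tensor of shape (seq_len, n_heads, d_head); d_head must be even.
    """
    seq_len, _, d = x.shape
    # Per-pair rotation frequencies: theta_i = base^(-2i / d)
    theta = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
    # Angle for each (position, frequency) pair: shape (seq_len, d/2)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * theta[None, :]
    cos = torch.cos(angles)[:, None, :]  # (seq_len, 1, d/2), broadcast over heads
    sin = torch.sin(angles)[:, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]  # split features into even/odd pairs
    # Rotate each 2-D pair by its position-dependent angle
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Example: rotate the queries of one attention layer
q = torch.randn(128, 8, 64)   # (seq_len, n_heads, d_head)
q_rot = rope(q)
```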
Related projects:
- Used for adaptive human-in-the-loop evaluation of language and embedding models. ☆300 · Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆163 · Updated 4 months ago
- Experiments with generating open-source language model assistants ☆97 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated last year
- One-stop shop for all things CARP ☆58 · Updated 2 years ago
- 🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0. ☆55 · Updated 2 years ago
- A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs ☆112 · Updated last year
- Smol but mighty language model ☆62 · Updated last year
- Reimplementation of the task generation part from the Alpaca paper ☆118 · Updated last year
- [WIP] A 🔥 interface for running code in the cloud ☆86 · Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER)… ☆122 · Updated last year
- Babysit your preemptible TPUs ☆84 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆172 · Updated last year
- Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression ☆65 · Updated last year
- RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details. ☆65 · Updated 2 years ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆143 · Updated this week
- Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpe… ☆428 · Updated last year
- Tune MPTs ☆84 · Updated last year
- Fine-tuning GPT-J-6B on colab or equivalent PC GPU with your custom datasets: 8-bit weights with low-rank adaptors (LoRA) ☆74 · Updated 2 years ago
- Swarm training framework using Haiku + JAX + Ray for layer-parallel transformer language models on unreliable, heterogeneous nodes ☆236 · Updated last year
- Datasets collection and preprocessing framework for NLP extreme multitask learning ☆143 · Updated 2 months ago
- Instruct-tuning LLaMA on consumer hardware ☆66 · Updated last year
- Train very large language models in JAX. ☆191 · Updated 10 months ago
- Simple Python client for the Hugging Face Inference API ☆69 · Updated 4 years ago