zphang / minimal-gpt-neox-20b
☆127 · Updated 2 years ago
Related projects:
- Experiments with generating open-source language model assistants ☆97 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 · Updated last year
- Used for adaptive human-in-the-loop evaluation of language and embedding models. ☆300 · Updated last year
- Simple annotated implementation of GPT-NeoX in PyTorch ☆110 · Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆163 · Updated 4 months ago
- ☆64 · Updated 2 years ago
- One-stop shop for all things CARP ☆58 · Updated 2 years ago
- Techniques for running BLOOM inference in parallel ☆37 · Updated last year
- A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs ☆112 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆172 · Updated last year
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆143 · Updated this week
- See the issue board for the current status of active and prospective projects! ☆65 · Updated 2 years ago
- Inference code for LLaMA models in JAX ☆108 · Updated 3 months ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆143 · Updated 2 months ago
- Code repository for the c-BTM paper ☆105 · Updated 11 months ago
- Multi-Domain Expert Learning ☆67 · Updated 7 months ago
- Swarm training framework using Haiku + JAX + Ray for layer-parallel transformer language models on unreliable, heterogeneous nodes ☆236 · Updated last year
- ☆75 · Updated 9 months ago
- Patch for MPT-7B which allows using and training a LoRA ☆58 · Updated last year
- ☆67 · Updated 2 years ago
- Babysit your preemptible TPUs ☆84 · Updated last year
- RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details. ☆65 · Updated 2 years ago
- Guide: Finetune GPT2-XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Huggingface Transformers using DeepSpe… ☆428 · Updated last year
- Framework-agnostic Python runtime for RWKV models ☆144 · Updated last year
- A search engine for ParlAI's BlenderBot project (and probably other ones as well) ☆132 · Updated 2 years ago
- Reimplementation of the task generation part from the Alpaca paper ☆118 · Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆122 · Updated last year
- ☆92 · Updated last year
- Train very large language models in JAX. ☆191 · Updated 10 months ago
- Adversarial Training and SFT for Bot Safety Models ☆38 · Updated last year