paulcjh / gpt-j-6b
☆50 Updated last year
Related projects:
- ☆31 Updated last year
- Experiments with generating opensource language model assistants ☆97 Updated last year
- ☆127 Updated 2 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆152 Updated 8 months ago
- 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. ☆55 Updated 2 years ago
- ☆32 Updated last year
- Simple Annotated implementation of GPT-NeoX in PyTorch ☆110 Updated 2 years ago
- A diff tool for language models ☆42 Updated 8 months ago
- One stop shop for all things carp ☆58 Updated 2 years ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆60 Updated last year
- ☆91 Updated 5 months ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- ☆28 Updated this week
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆59 Updated last year
- The elegant integration of huggingface/nlp and fastai2 and handy transforms using pure huggingface/nlp ☆19 Updated 3 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆163 Updated 4 months ago
- Hidden Engrams: Long Term Memory for Transformer Model Inference ☆34 Updated 3 years ago
- A dataset of alignment research and code to reproduce it ☆68 Updated last year
- ☆22 Updated 3 years ago
- ☆34 Updated last year
- Used for adaptive human in the loop evaluation of language and embedding models. ☆300 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆45 Updated last year
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆56 Updated 2 years ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆46 Updated 2 years ago
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated last year
- ☆22 Updated last year
- ☆9 Updated 3 years ago
- Adversarial Training and SFT for Bot Safety Models ☆38 Updated last year
- A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+ ☆37 Updated 3 years ago