zphang / minimal-gpt-neox-20b

☆130

Alternatives and similar repositories for minimal-gpt-neox-20b

Users who are interested in minimal-gpt-neox-20b are comparing it to the libraries listed below.

Rallio67 / language-model-agents
Experiments with generating opensource language model assistants
☆97 Updated 2 years ago
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filter 8K sequences on Pile
☆116 Updated 2 years ago
labmlai / neox
Simple Annotated implementation of GPT-NeoX in PyTorch
☆110 Updated 2 years ago
EleutherAI / magiCARP
One stop shop for all things carp
☆59 Updated 2 years ago
EleutherAI / DeeperSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
☆168 Updated 2 weeks ago
CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆311 Updated 2 years ago
huggingface / bloom-jax-inference
☆67 Updated 3 years ago
google-research-datasets / presto
A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs
☆115 Updated 2 years ago
leogao2 / lm_dataformat
☆79 Updated last year
BlinkDL / RWKV-v2-RNN-Pile
RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.
☆67 Updated 2 years ago
lucidrains / PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)
☆187 Updated 3 years ago
huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
☆177 Updated 2 years ago
Xirider / finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpe…
☆437 Updated 2 years ago
JulesGM / ParlAI_SearchEngine
A search engine for ParlAI's BlenderBot project (and probably other ones as well)
☆130 Updated 3 years ago
zphang / minimal-opt
☆67 Updated 2 years ago
shawwn / tpunicorn
Babysit your preemptible TPUs
☆86 Updated 2 years ago
huu4ontocord / MDEL
Multi-Domain Expert Learning
☆67 Updated last year
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆146 Updated last year
ArEnSc / Production-RWKV
This project aims to make RWKV Accessible to everyone using a Hugging Face like interface, while keeping it close to the R and D RWKV bra…
☆65 Updated 2 years ago
microsoft / xtreme-distil-transformers
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale
☆155 Updated last year
AI21Labs / lm-evaluation
Evaluation suite for large-scale language models.
☆127 Updated 3 years ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆107 Updated last year
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training!
☆112 Updated 2 years ago
rom1504 / cc2dataset
Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ...
☆317 Updated last year
NolanoOrg / smol-gpt
Smol but mighty language model
☆62 Updated 2 years ago
EleutherAI / openwebtext2
☆90 Updated 3 years ago
sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆291 Updated last year
AeroScripts / HiddenEngrams
Hidden Engrams: Long Term Memory for Transformer Model Inference
☆35 Updated 4 years ago
gustavecortal / gpt-j-fine-tuning-example
Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression
☆66 Updated 2 years ago
patil-suraj / vqgan-jax
JAX implementation of VQGAN
☆92 Updated 3 years ago