EleutherAI / the-pile

☆1,614

Alternatives and similar repositories for the-pile

Users who are interested in the-pile are comparing it to the libraries listed below.

bigscience-workshop / bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
☆1,007 Updated last year
google / BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
☆3,147 Updated last year
google-research / FLAN
☆1,552 Updated 3 weeks ago
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆876 Updated 2 years ago
allenai / natural-instructions
Expanding natural instructions
☆1,025 Updated last year
EleutherAI / pythia
The hub for EleutherAI's work on interpretability and learning dynamics
☆2,676 Updated last week
openai / summarize-from-feedback
Code for "Learning to summarize from human feedback"
☆1,053 Updated 2 years ago
google-research / deduplicate-text-datasets
☆1,251 Updated last year
lucidrains / PaLM-pytorch
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways
☆826 Updated 3 years ago
bigscience-workshop / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆1,427 Updated last year
bigscience-workshop / promptsource
Toolkit for creating, sharing and using natural language prompts.
☆2,972 Updated 2 years ago
JonasGeiping / cramming
Cramming the training of a (BERT-type) language model into limited compute.
☆1,351 Updated last year
Xirider / finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpe…
☆437 Updated 2 years ago
google-research / t5x
☆2,907 Updated last week
jcpeterson / openwebtext
Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.
☆744 Updated 2 years ago
facebookresearch / cc_net
Tools to download and cleanup Common Crawl data
☆1,030 Updated 2 years ago
google-research / prompt-tuning
Original Implementation of Prompt Tuning from Lester, et al, 2021
☆696 Updated 8 months ago
bigscience-workshop / xmtf
Crosslingual Generalization through Multitask Finetuning
☆537 Updated last year
anthropics / hh-rlhf
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
☆1,799 Updated 5 months ago
huggingface / transformers-bloom-inference
Fast Inference Solutions for BLOOM
☆564 Updated last year
openai / following-instructions-human-feedback
☆1,247 Updated 2 years ago
stanford-crfm / mistral
Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging F…
☆575 Updated 2 years ago
bigscience-workshop / t-zero
Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
☆462 Updated 3 years ago
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,728 Updated last year
allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,367 Updated last year
microsoft / DeBERTa
The implementation of DeBERTa
☆2,165 Updated 2 years ago
hendrycks / test
Measuring Massive Multitask Language Understanding | ICLR 2021
☆1,518 Updated 2 years ago
microsoft / GODEL
Large-scale pretrained models for goal-directed dialog
☆885 Updated last year
timoschick / pet
This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"
☆1,629 Updated 2 years ago
young-geng / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆2,501 Updated last year