bigscience-workshop / bigscienceLinks

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

☆1,006

Alternatives and similar repositories for bigscience

Users that are interested in bigscience are comparing it to the libraries listed below

Sorting:

bigscience-workshop / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆1,406Updated last year
huggingface / transformers-bloom-inference
Fast Inference Solutions for BLOOM
☆563Updated 9 months ago
epfLLM / Megatron-LLM
distributed trainer for LLMs
☆578Updated last year
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆870Updated last year
google-research / FLAN
☆1,532Updated 3 weeks ago
EleutherAI / the-pile
☆1,589Updated 2 years ago
tunib-ai / parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
☆790Updated 2 years ago
facebookresearch / cc_net
Tools to download and cleanup Common Crawl data
☆1,021Updated 2 years ago
lucidrains / PaLM-pytorch
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways
☆823Updated 2 years ago
google-research / deduplicate-text-datasets
☆1,232Updated last year
bigscience-workshop / xmtf
Crosslingual Generalization through Multitask Finetuning
☆537Updated 10 months ago
ELS-RD / kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab…
☆1,577Updated last year
BlackSamorez / tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
☆657Updated last year
allenai / natural-instructions
Expanding natural instructions
☆1,011Updated last year
google / seqio
Task-based datasets, preprocessing, and evaluation for sequence models.
☆583Updated last week
stanford-crfm / mistral
Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging F…
☆575Updated last year
JonasGeiping / cramming
Cramming the training of a (BERT-type) language model into limited compute.
☆1,341Updated last year
bigscience-workshop / data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
☆313Updated 2 years ago
openai / summarize-from-feedback
Code for "Learning to summarize from human feedback"
☆1,037Updated last year
microsoft / mup
maximal update parametrization (µP)
☆1,576Updated last year
bigscience-workshop / t-zero
Reproduce results and replicate training fo T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
☆462Updated 2 years ago
ofirpress / attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
☆538Updated last year
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,352Updated 3 months ago
deepspeedai / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆2,126Updated 3 weeks ago
huggingface / llm_training_handbook
An open collection of methodologies to help with successful training of large language models.
☆507Updated last year
NVIDIA / NeMo-Aligner
Scalable toolkit for efficient model alignment
☆834Updated last week
ChenghaoMou / text-dedup
All-in-one text de-duplication
☆706Updated last week
huggingface / large_language_model_training_playbook
An open collection of implementation tips, tricks and resources for training large language models
☆478Updated 2 years ago
allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,333Updated last year
ELS-RD / transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
☆1,687Updated 9 months ago