EleutherAI / DeeperSpeedLinks

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

☆169

Alternatives and similar repositories for DeeperSpeed

Users that are interested in DeeperSpeed are comparing it to the libraries listed below

Sorting:

huggingface / bloom-jax-inference
☆66Updated 3 years ago
zphang / minimal-gpt-neox-20b
☆131Updated 3 years ago
CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆307Updated 2 years ago
lucidrains / PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)
☆188Updated 3 years ago
Rallio67 / language-model-agents
Experiments with generating opensource language model assistants
☆97Updated 2 years ago
Sea-Snell / JAX_llama
Inference code for LLaMA models in JAX
☆119Updated last year
huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
☆85Updated last year
SeanNaren / min-LLM
Minimal code to train a Large Language Model (LLM).
☆172Updated 3 years ago
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filter 8K sequences on Pile
☆115Updated 2 years ago
ayaka14732 / llama-2-jax
JAX implementation of the Llama 2 model
☆216Updated last year
google-research-datasets / presto
A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs
☆115Updated 2 years ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆107Updated 2 years ago
EleutherAI / openwebtext2
☆91Updated 3 years ago
LAION-AI / blade2blade
Adversarial Training and SFT for Bot Safety Models
☆40Updated 2 years ago
kingoflolz / swarm-jax
Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
☆242Updated 2 years ago
labmlai / neox
Simple Annotated implementation of GPT-NeoX in PyTorch
☆110Updated 3 years ago
huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
☆177Updated 2 years ago
google-research / jestimator
Amos optimizer with JEstimator lib.
☆82Updated last year
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆87Updated last year
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆209Updated 2 years ago
geov-ai / geov
The GeoV model is a large langauge model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…
☆121Updated 2 years ago
CarperAI / InstructGPT
For experiments involving instruct gpt. Currently used for documenting open research questions.
☆70Updated 2 years ago
yandex-research / DeDLOC
Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)
☆117Updated 3 years ago
salesforce / jaxformer
Minimal library to train LLMs on TPU in JAX with pjit().
☆298Updated last year
leogao2 / lm_dataformat
☆79Updated last year
huggingface / fuego
[WIP] A 🔥 interface for running code in the cloud
☆85Updated 2 years ago
huggingface / transformers_bloom_parallel
Techniques used to run BLOOM at inference in parallel
☆37Updated 3 years ago
AI21Labs / lm-evaluation
Evaluation suite for large-scale language models.
☆128Updated 4 years ago
google-research / cascades
Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference…
☆215Updated 4 months ago
r-three / git-theta
git extension for {collaborative, communal, continual} model development
☆215Updated 11 months ago