lvwerra / rl-implementations

This repo contains a set of notebooks to reproduce reinforcement learning algorithms.

☆15

Alternatives and similar repositories for rl-implementations

Users who are interested in rl-implementations are comparing it to the repositories listed below.

Sea-Snell / CALM-Dialogue
Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems"
☆34 Updated 2 years ago
google-research / jestimator
Amos optimizer with JEstimator lib.
☆82 Updated last year
lucidrains / PaLM-jax
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework)
☆189 Updated 3 years ago
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆210 Updated 2 years ago
google-research / cascades
Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference…
☆215 Updated 5 months ago
huggingface / simulate
🎢 Creating and sharing simulation environments for embodied and synthetic data research
☆191 Updated 2 years ago
huggingface / ml-agents
Unity Machine Learning Agents Toolkit
☆48 Updated 2 years ago
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆87 Updated last year
augustwester / transformer-xl
A lightweight PyTorch implementation of the Transformer-XL architecture proposed by Dai et al. (2019)
☆37 Updated 2 years ago
aks2203 / easy-to-hard
Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks"
☆59 Updated 3 years ago
HomebrewML / HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
☆68 Updated 3 years ago
AI21Labs / lm-evaluation
Evaluation suite for large-scale language models.
☆128 Updated 4 years ago
hundredblocks / large-model-parallelism
Functional local implementations of main model parallelism approaches
☆96 Updated 2 years ago
EleutherAI / DeeperSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
☆171 Updated 2 months ago
Sea-Snell / JAX_llama
Inference code for LLaMA models in JAX
☆120 Updated last year
Sea-Snell / Implicit-Language-Q-Learning
Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"
☆210 Updated 2 years ago
princeton-nlp / TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
☆162 Updated last year
google-deepmind / enn_acme
☆31 Updated 3 years ago
tomekkorbak / pretraining-with-human-feedback
Code accompanying the paper Pretraining Language Models with Human Preferences
☆180 Updated last year
google-deepmind / transformer_grammars
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022)
☆133 Updated 5 months ago
volotat / ARC-Game
The Abstraction and Reasoning Corpus made into a web game
☆90 Updated last year
CarperAI / Algorithm-Distillation-RLHF
☆35 Updated 2 years ago
lucidrains / ponder-transformer
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆81 Updated 4 years ago
EleutherAI / lm_perplexity
☆160 Updated 4 years ago
ekinakyurek / google-research
Google Research
☆46 Updated 3 years ago
huggingface / huggingface_sb3
Additional code for Stable-baselines3 to load and upload models from the Hub.
☆88 Updated last year
zphang / minimal-opt
☆67 Updated 3 years ago
rovle / gpt3-in-context-fitting
Experiments on GPT-3's ability to fit numerical models in-context.
☆14 Updated 3 years ago
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆87 Updated 3 years ago
teddykoker / tinyloader
☆68 Updated 8 months ago