facebookresearch / diplomacy_cicero
Code for Cicero, an AI agent that plays the game of Diplomacy with open-domain natural language negotiation.
☆1,279 Updated last year
Related projects:
- Monte Carlo tree search in JAX ☆2,312 Updated last month
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways ☆819 Updated last year
- ☆943 Updated 6 months ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆4,442 Updated 8 months ago
- Model API for GALACTICA ☆2,675 Updated last year
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,561 Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,210 Updated 3 weeks ago
- Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%. ☆788 Updated 3 weeks ago
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,049 Updated 6 months ago
- A suite of test scenarios for multi-agent reinforcement learning. ☆583 Updated this week
- Compositional Differentiable Programming Library ☆957 Updated this week
- Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos ☆1,275 Updated 3 months ago
- Building Open-Ended Embodied Agents with Internet-Scale Knowledge ☆1,756 Updated 6 months ago
- Code for Parsel 🐍 - generate complex programs with language models ☆410 Updated last year
- [NeurIPS 22] [AAAI 24] Recurrent Transformer-based long-context architecture. ☆749 Updated last month
- Cramming the training of a (BERT-type) language model into limited compute. ☆1,284 Updated 3 months ago
- ☆495 Updated 7 months ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆2,368 Updated last month
- Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. ☆860 Updated 9 months ago
- Train to 94% on CIFAR-10 in <6.3 seconds on a single A100. Or ~95.79% in ~110 seconds (or less!) ☆1,214 Updated 10 months ago
- A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta. ☆2,581 Updated this week
- 800,000 step-level correctness labels on LLM solutions to MATH problems ☆1,441 Updated last year
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆957 Updated 3 weeks ago
- Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools. ☆573 Updated this week
- Code for the paper Fine-Tuning Language Models from Human Preferences ☆1,204 Updated last year
- Code for "Learning to summarize from human feedback" ☆975 Updated last year
- A curated list of reinforcement learning with human feedback resources (continually updated) ☆3,248 Updated 2 weeks ago
- ChatArena (or Chat Arena) is a Multi-Agent Language Game Environments for LLMs. The goal is to develop communication and collaboration ca… ☆1,328 Updated 3 months ago
- High throughput synchronous and asynchronous reinforcement learning ☆795 Updated 3 weeks ago
- ☆1,168 Updated last year