ymetz / rlhfblender

RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback

☆12

Alternatives and similar repositories for rlhfblender

Users who are interested in rlhfblender are comparing it to the libraries listed below.

Qualcomm-AI-research / codeit
☆26 · Updated last year
princeton-nlp / lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effe…
☆23 · Updated last year
btnorman / First-Explore
Repo to reproduce the First-Explore paper results
☆37 · Updated 5 months ago
itl-ed / llm-dp
LLM Dynamic Planner - Combining LLM with PDDL Planners to solve an embodied task
☆44 · Updated 5 months ago
Holmeswww / SPRING
☆14 · Updated last year
kiddyboots216 / lottery-ticket-adaptation
Lottery Ticket Adaptation
☆39 · Updated 6 months ago
locross93 / Hypothetical-Minds
Hypothetical Minds is an autonomous LLM-based agent for diverse multi-agent settings, integrating a Theory of Mind module Theory of Mind …
☆30 · Updated 10 months ago
you68681 / GPAR
☆23 · Updated last year
SamsungSAILMontreal / nino
Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025]
☆19 · Updated 2 weeks ago
ThomasRochefortB / torch-gato
PyTorch implementation of the Gato paper from DeepMind
☆12 · Updated 2 years ago
ctlllll / reward_collapse
☆27 · Updated 2 years ago
multimodal-interpretability / FIND
Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents
☆49 · Updated 8 months ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆19 · Updated 4 months ago
jonasrothfuss / DeepEpisodicMemory
Deep neural network architecture for representing robot experiences in an episodic-like memory which facilitates encoding, recalling, and…
☆16 · Updated 6 years ago
Fchaubard / gradient_agreement_filtering
This is the official repo for Gradient Agreement Filtering (GAF).
☆24 · Updated 4 months ago
apple / ml-entity-deduction-arena
☆32 · Updated last year
abaheti95 / LoL-RL
Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients
☆26 · Updated 8 months ago
benediktstroebl / agent-evals
☆19 · Updated last week
astanic / crafter-ood
☆20 · Updated 2 years ago
FLAIROx / cultural-accumulation
☆13 · Updated 10 months ago
SalesforceAIResearch / text2data
☆21 · Updated 3 months ago
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 · Updated last year
CEC-Agent / CEC
Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents"
☆31 · Updated last year
Shalev-Lifshitz / MultiAgentVerification
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
☆18 · Updated 3 months ago
IBM / abductive-rule-learner-with-context-awareness
ARLC, a probabilistic abductive reasoner for solving Raven's progressive matrices.
☆18 · Updated last month
RewardReports / reward-reports
Documentation for dynamic machine learning systems.
☆29 · Updated 8 months ago
etimush / ARC_NCA
Repo for solving ARC problems with a Neural Cellular Automata
☆15 · Updated 2 weeks ago
WorldEditors / EvolvingPlasticANN
Code for Evolving Plastic ANNs
☆13 · Updated 2 years ago
keraJLi / synthetic-gymnax
Drop-in environment replacements that make your RL algorithm train faster.
☆20 · Updated 11 months ago
Farama-Foundation / CrowdPlay
A web-based platform for collecting human actions in reinforcement learning environments
☆30 · Updated last year