agi-templar / Stable-AlignmentLinks

Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".

☆351

Alternatives and similar repositories for Stable-Alignment

Users that are interested in Stable-Alignment are comparing it to the libraries listed below

Sorting:

InteractiveNLP-Team / awesome-InteractiveNLP-papers
Paper List for a new paradigm of NLP: Interactive NLP (https://arxiv.org/abs/2305.13246)
☆213Updated 2 years ago
tatsu-lab / alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
☆825Updated last year
allenai / FineGrainedRLHF
☆280Updated 9 months ago
OpenBMB / UltraFeedback
A large-scale, fine-grained, diverse preference dataset (and models).
☆353Updated last year
Ber666 / ToolkenGPT
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral)
☆262Updated last year
anthropics / ConstitutionalHarmlessnessPaper
☆241Updated 2 years ago
GanjinZero / RRHF
[NIPS2023] RRHF & Wombat
☆811Updated 2 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆226Updated 2 years ago
chuanyang-Zheng / Progressive-Hint
This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models"
☆209Updated 2 years ago
GAIR-NLP / auto-j
Generative Judge for Evaluating Alignment
☆246Updated last year
anchen1011 / FireAct
FireAct: Toward Language Agent Fine-tuning
☆281Updated last year
OFA-Sys / gsm8k-ScRel
Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
☆267Updated last year
suzgunmirac / BIG-Bench-Hard
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
☆514Updated last year
xlang-ai / xlang-paper-reading
Paper collection on building and evaluating language model agents via executable language grounding
☆362Updated last year
FranxYao / GPT-Bargaining
Code for Arxiv 2023: Improving Language Model Negociation with Self-Play and In-Context Learning from AI Feedback
☆206Updated 2 years ago
TIGER-AI-Lab / MAmmoTH
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024]
☆376Updated last year
jayelm / gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆295Updated 7 months ago
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆217Updated 2 years ago
night-chen / ToolQA
ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels …
☆278Updated 2 years ago
raunak-agarwal / instruction-datasets
Datasets for Instruction Tuning of Large Language Models
☆257Updated last year
SwiftSage / SwiftSage
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
☆315Updated 11 months ago
kojima-takeshi188 / zero_shot_cot
Prod Env
☆430Updated 2 years ago
ruixiangcui / AGIEval
☆763Updated last year
XueyangFeng / LLM-Agent-Paper-Digest
papers related to LLM-agent that published on top conferences
☆319Updated 5 months ago
Spico197 / Humpback
🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.
☆139Updated 5 months ago
xingyaoww / mint-bench
Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha…
☆130Updated last year
iiis-ai / cumulative-reasoning
[TMLR] Cumulative Reasoning With Large Language Models (https://arxiv.org/abs/2308.04371)
☆302Updated 2 months ago
mingkaid / rl-prompt
Accompanying repo for the RLPrompt paper
☆355Updated last year
TIGER-AI-Lab / Program-of-Thoughts
Data and Code for Program of Thoughts [TMLR 2023]
☆287Updated last year
Ber666 / RAP
Reasoning with Language Model is Planning with World Model
☆175Updated 2 years ago