thomasgauthier/LLM-self-play

Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)

☆29

Alternatives and similar repositories for LLM-self-play

Users that are interested in LLM-self-play are comparing it to the libraries listed below.

uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,248 · May 8, 2024 · Updated 2 years ago
AxelSorensenDev / Eevee
An Easy Annotation Tool for Natural Language Processing
☆12 · May 17, 2024 · Updated 2 years ago
openfeedback / superhf
Open-source Human Feedback Library
☆11 · Oct 25, 2023 · Updated 2 years ago
Inspirateur / Fast-BM25
a fast implementation of BM25
☆10 · Sep 15, 2022 · Updated 3 years ago
THUDM / Self-Contrast
Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
☆20 · Apr 2, 2024 · Updated 2 years ago
agwaBom / towards_moe
Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022
☆10 · Jan 6, 2023 · Updated 3 years ago
kobayashikanna01 / Chain-of-Discussion
☆11 · May 28, 2024 · Updated 2 years ago
sofiaherrero / lime-ner
lime-ner: extending LIME for Named Entity Recognition
☆10 · Aug 15, 2018 · Updated 7 years ago
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,411 · Apr 11, 2024 · Updated 2 years ago
erosenfeld / disagree_discrep
Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.
☆10 · Feb 27, 2024 · Updated 2 years ago
shenao-zhang / reward-augmented-preference
The official implementation of Preference Data Reward-Augmentation.
☆18 · May 1, 2025 · Updated last year
YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12 · Jan 26, 2025 · Updated last year
kumar-shridhar / Screws
SCREWS: A Modular Framework for Reasoning with Revisions
☆27 · Sep 26, 2023 · Updated 2 years ago
AndersonChoi / forza-telemetry-kafka-producer
forza-telemetry-kafka-producer
☆10 · May 2, 2022 · Updated 4 years ago
brunocampos01 / federated-learning-for-text-generation
Machine learning project using federated learning for text generation
☆11 · May 5, 2024 · Updated 2 years ago
epfml / REQ
☆19 · Jun 10, 2024 · Updated 2 years ago
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆181 · May 2, 2024 · Updated 2 years ago
davidkim205 / translation
☆13 · Apr 17, 2024 · Updated 2 years ago
kai3n / sentiment-analysis-imdb
This is a classifier focused on sentiment analysis of movie reviews
☆13 · Jun 3, 2017 · Updated 9 years ago
robjsliwa / llama-agent
Fun project to run your own LLM chat bot using llama.cpp
☆11 · Jun 9, 2023 · Updated 3 years ago
guobbin / PFL-MoE
Federated Learning - PyTorch
☆15 · Jun 27, 2021 · Updated 5 years ago
susumuota / nano-askllm
Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.
☆12 · Jun 19, 2024 · Updated 2 years ago
osome-iu / AI_fact_checking
We conduct a preregistered experiment to investigate whether fact checks provided by a large language model can serve as an effective mis…
☆13 · Dec 14, 2024 · Updated last year
sebastianbergmann / docker-amigaos-cross-toolchain
Dockerfile for AmigaOS Cross-Compiler Toolchain
☆11 · Mar 12, 2018 · Updated 8 years ago
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
kai3n / Orroid
It checks how secure the program you made is and shows how vulnerable your program is.
☆20 · Apr 20, 2017 · Updated 9 years ago
KyujinHan / Korean_selenium_DeepL
Code for automating Korean translation with DeepL
☆12 · Jul 27, 2023 · Updated 2 years ago
menglinjian / Deep-FTRL-ORW
Code for the paper "Deep FTRL-ORW: An Efficient Deep Reinforcement Learning Algorithm for Solving Imperfect Information Extensive-Form Ga…
☆11 · Dec 1, 2022 · Updated 3 years ago
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆64 · Mar 26, 2024 · Updated 2 years ago
DmitrySoshnikov / at-regexp-machine
Automata Theory. Building a RegExp machine
☆12 · May 10, 2019 · Updated 7 years ago
jxbz / entropix
📰 Computing the information content of trained neural networks
☆24 · Oct 8, 2021 · Updated 4 years ago
brianmwangy / Beginner-Guide-to-Automated-Feature-Engineering-With-Deep-Feature-Synthesis.
This is a comprehensive guide on how you can automate your feature engineering process.
☆11 · Jun 25, 2018 · Updated 8 years ago
RajKKapadia / YouTube-Document-GPT-WhatsApp
☆12 · Sep 1, 2023 · Updated 2 years ago
SalesforceAIResearch / indict_code_gen
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
☆15 · Jun 2, 2026 · Updated last month
HanNight / soft_self_consistency
Code for ACL 2024 paper "Soft Self-Consistency Improves Language Model Agents"
☆25 · Sep 11, 2024 · Updated last year
sanyalsunny111 / LLM-Inheritune
[TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
☆126 · Mar 6, 2026 · Updated 4 months ago
gabrielcassimiro17 / async-langchain
Demonstration of how to run multiple chains in LangChain asynchronously
☆12 · Jul 6, 2023 · Updated 3 years ago
HarryMayne / qwen_3_chat_templates
Alternative chat templates for Qwen 3 8B. Useful for multi-turn RL
☆15 · Sep 4, 2025 · Updated 10 months ago
RononDex / Astrobot
An advanced discord bot running on .net core, that can help with anything that is astronomy related (platesolving, image analysis, astron…
☆14 · Jul 13, 2026 · Updated last week