Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)
☆29 · Mar 1, 2024 · Updated 2 years ago
Alternatives and similar repositories for LLM-self-play
Users that are interested in LLM-self-play are comparing it to the libraries listed below.
- Extended Wikilinks dataset description ☆15 · Apr 1, 2018 · Updated 8 years ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,242 · May 8, 2024 · Updated 2 years ago
- ☆17 · Oct 19, 2021 · Updated 4 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆41 · Sep 24, 2024 · Updated last year
- Open-source Human Feedback Library ☆11 · Oct 25, 2023 · Updated 2 years ago
- a fast implementation of BM25 ☆10 · Sep 15, 2022 · Updated 3 years ago
- Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022 ☆10 · Jan 6, 2023 · Updated 3 years ago
- Reimplementation of https://github.com/montemac/algebraic_value_editing in pure PyTorch for efficiency on large models ☆11 · Jun 28, 2023 · Updated 2 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data. ☆10 · Feb 27, 2024 · Updated 2 years ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 · Apr 2, 2024 · Updated 2 years ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,409 · Apr 11, 2024 · Updated 2 years ago
- ☆11 · May 28, 2024 · Updated 2 years ago
- This repository contains code for the paper Direct Preference Optimization with an Offset (ODPO). ☆20 · Feb 17, 2025 · Updated last year
- lime-ner: extending LIME for Named Entity Recognition ☆10 · Aug 15, 2018 · Updated 7 years ago
- This repo is reproduction resources for linear alignment paper, still working ☆17 · May 19, 2024 · Updated 2 years ago
- Lazy one's Flask application ☆11 · Aug 13, 2016 · Updated 9 years ago
- ☆14 · May 25, 2023 · Updated 3 years ago
- Machine learning project using federated learning for text generation ☆11 · May 5, 2024 · Updated 2 years ago
- A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game. ☆17 · Oct 15, 2024 · Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 · Sep 26, 2023 · Updated 2 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 · Jan 26, 2025 · Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆181 · May 2, 2024 · Updated 2 years ago
- Fun project to run your own LLM chat bot using llama.cpp ☆11 · Jun 9, 2023 · Updated 3 years ago
- Minsk in VB ☆11 · May 10, 2022 · Updated 4 years ago
- Code and data for paper named: Large language models for automatic equation discovery of nonlinear dynamics ☆13 · Mar 6, 2025 · Updated last year
- ☆13 · Apr 17, 2024 · Updated 2 years ago
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm… ☆17 · Dec 15, 2025 · Updated 6 months ago
- It checks how secure the program you made is and shows how vulnerable your program is. ☆20 · Apr 20, 2017 · Updated 9 years ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668. ☆12 · Jun 19, 2024 · Updated last year
- Adapts Google's official ROUGE implementation for use with Korean ☆17 · Jan 3, 2024 · Updated 2 years ago
- ☆23 · Oct 30, 2023 · Updated 2 years ago
- Code for 'Prototypical Representation Learning for Relation Extraction'. ☆32 · May 10, 2021 · Updated 5 years ago
- Rust FTL + WebRTC live streaming software. ☆13 · Mar 12, 2022 · Updated 4 years ago
- Automata Theory. Building a RegExp machine ☆12 · May 10, 2019 · Updated 7 years ago
- Official code for "From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation" (ICLR2026) ☆36 · Mar 1, 2026 · Updated 3 months ago
- Open-source project for converting the Bible into JSON for native languages. A collaborative platform for digitizing sacred texts, and ma… ☆10 · May 14, 2024 · Updated 2 years ago
- ☆15 · Feb 17, 2025 · Updated last year
- This is a comprehensive guide on how you can automate your feature engineering process. ☆11 · Jun 25, 2018 · Updated 7 years ago
- ☆12 · Sep 1, 2023 · Updated 2 years ago