Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (ArXiv 20232401.01335)
☆29Mar 1, 2024Updated 2 years ago
Alternatives and similar repositories for LLM-self-play
Users that are interested in LLM-self-play are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta☆13Nov 11, 2024Updated last year
- The official implementation of Self-Play Fine-Tuning (SPIN)☆1,240May 8, 2024Updated 2 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Sep 24, 2024Updated last year
- Control LLM generation format efficiently. A simple version of microsoft/aici in vllm and transformers☆14Jun 7, 2024Updated last year
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- a fast implementation of BM25☆10Sep 15, 2022Updated 3 years ago
- Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022☆10Jan 6, 2023Updated 3 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.☆10Feb 27, 2024Updated 2 years ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment☆20Apr 2, 2024Updated 2 years ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI☆1,409Apr 11, 2024Updated 2 years ago
- ☆11May 28, 2024Updated last year
- lime-ner: extending LIME for Named Entity Recognition☆10Aug 15, 2018Updated 7 years ago
- This repo is reproduction resources for linear alignment paper, still working☆18May 19, 2024Updated 2 years ago
- Lazy one's Flask application☆11Aug 13, 2016Updated 9 years ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- ☆20Dec 14, 2024Updated last year
- ☆14May 25, 2023Updated 3 years ago
- Machine learning project using federated learning for text generation☆11May 5, 2024Updated 2 years ago
- A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.☆17Oct 15, 2024Updated last year
- The official implementation of Preference Data Reward-Augmentation.☆18May 1, 2025Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Sep 26, 2023Updated 2 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆12Jan 26, 2025Updated last year
- ☆11Jan 6, 2024Updated 2 years ago
- ☆19Jun 10, 2024Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Unofficial Implementation of Evolutionary Model Merging☆42Mar 28, 2024Updated 2 years ago
- some mixture of experts architecture implementations☆27Mar 22, 2024Updated 2 years ago
- Federated Learning - PyTorch☆15Jun 27, 2021Updated 4 years ago
- Fun project to run your own LLM chat bot using llama.cpp☆11Jun 9, 2023Updated 2 years ago
- Minsk in VB☆11May 10, 2022Updated 4 years ago
- ☆11Jul 30, 2024Updated last year
- Code and data for paper named: Large language models for automatic equation discovery of nonlinear dynamics☆13Mar 6, 2025Updated last year
- ☆13Apr 17, 2024Updated 2 years ago
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm…☆17Dec 15, 2025Updated 5 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- It checks how secure the program you made is and shows how vulnerable your program is.☆20Apr 20, 2017Updated 9 years ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.☆12Jun 19, 2024Updated last year
- Google 공식 Rouge Implementation을 한국어에서 사용할 수 있도록 처리☆17Jan 3, 2024Updated 2 years ago
- forza-telemetry-kafka-producer☆10May 2, 2022Updated 4 years ago
- ☆23Oct 30, 2023Updated 2 years ago
- We conduct a preregistered experiment to investigate whether fact checks provided by a large language model can serve as an effective mis…☆13Dec 14, 2024Updated last year
- Automata Theory. Building a RegExp machine☆12May 10, 2019Updated 7 years ago