Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335)
☆29 · Mar 1, 2024 · Updated 2 years ago
Alternatives and similar repositories for LLM-self-play
Users that are interested in LLM-self-play are comparing it to the libraries listed below.
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta ☆13 · Nov 11, 2024 · Updated last year
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,234 · May 8, 2024 · Updated last year
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- Control LLM generation format efficiently. A simple version of microsoft/aici in vllm and transformers ☆14 · Jun 7, 2024 · Updated last year
- Open-source Human Feedback Library ☆11 · Oct 25, 2023 · Updated 2 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data. ☆10 · Feb 27, 2024 · Updated 2 years ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 · Apr 2, 2024 · Updated last year
- lime-ner: extending LIME for Named Entity Recognition ☆10 · Aug 15, 2018 · Updated 7 years ago
- Lazy one's Flask application ☆11 · Aug 13, 2016 · Updated 9 years ago
- ☆20 · Dec 14, 2024 · Updated last year
- Machine learning project using federated learning for text generation ☆11 · May 5, 2024 · Updated last year
- T-GD: Transferable GAN-generated Images Detection Framework. (ICML 2020) ☆18 · May 12, 2021 · Updated 4 years ago
- A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game. ☆17 · Oct 15, 2024 · Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 · Sep 26, 2023 · Updated 2 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 · Jan 26, 2025 · Updated last year
- ☆19 · Jun 10, 2024 · Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆181 · May 2, 2024 · Updated last year
- ☆12 · Apr 17, 2024 · Updated last year
- Fun project to run your own LLM chat bot using llama.cpp ☆11 · Jun 9, 2023 · Updated 2 years ago
- Minsk in VB ☆11 · May 10, 2022 · Updated 3 years ago
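- Knowledge graph construction for Chinese core statistics journals; Neo4j + LLM application implementation; construction of a statistics Q&A corpus and LoRA fine-tuning of an LLM ☆14 · Nov 22, 2024 · Updated last year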
- Dockerfile for AmigaOS Cross-Compiler Toolchain ☆11 · Mar 12, 2018 · Updated 8 years ago
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm… ☆16 · Dec 15, 2025 · Updated 3 months ago
- It checks how secure the program you made is and shows how vulnerable your program is. ☆20 · Apr 20, 2017 · Updated 8 years ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668. ☆12 · Jun 19, 2024 · Updated last year
- Adapts Google's official ROUGE implementation for use with Korean ☆18 · Jan 3, 2024 · Updated 2 years ago
- We conduct a preregistered experiment to investigate whether fact checks provided by a large language model can serve as an effective mis… ☆13 · Dec 14, 2024 · Updated last year
- forza-telemetry-kafka-producer ☆10 · May 2, 2022 · Updated 3 years ago
- Automata Theory. Building a RegExp machine ☆12 · May 10, 2019 · Updated 6 years ago
- S2APLER: S2 Agglomeration of Papers with Low Error Rate (it's for academic paper clustering) ☆21 · Nov 4, 2025 · Updated 4 months ago
- ☆17 · Feb 6, 2025 · Updated last year
- This is a comprehensive guide on how you can automate your feature engineering process. ☆11 · Jun 25, 2018 · Updated 7 years ago
- This repository provides the python implementation for the paper "Decentralized Multi-Agent Formation Control via Deep Reinforcement Lear… ☆20 · Jan 19, 2022 · Updated 4 years ago
- Building REPLs for Fun and Profit ☆13 · Mar 29, 2018 · Updated 7 years ago
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models ☆125 · Mar 6, 2026 · Updated 2 weeks ago
- Training a reward model for RLHF using RWKV. ☆15 · Jun 5, 2023 · Updated 2 years ago
- Demonstration of how to run multiple chains in LangChain asynchronously ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆17 · Oct 12, 2023 · Updated 2 years ago
- ☆19 · Nov 11, 2023 · Updated 2 years ago