Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335)
☆29 · Mar 1, 2024 · Updated 2 years ago
Alternatives and similar repositories for LLM-self-play
Users who are interested in LLM-self-play are comparing it to the libraries listed below.
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta ☆13 · Nov 11, 2024 · Updated last year
- Extended Wikilinks dataset description ☆15 · Apr 1, 2018 · Updated 8 years ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,237 · May 8, 2024 · Updated last year
- ☆17 · Oct 19, 2021 · Updated 4 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆41 · Sep 24, 2024 · Updated last year
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- Open-source Human Feedback Library ☆11 · Oct 25, 2023 · Updated 2 years ago
- Implementation of "Towards Understanding Mixture of Experts in Deep Learning", NeurIPS 2022 ☆10 · Jan 6, 2023 · Updated 3 years ago
- Reimplementation of https://github.com/montemac/algebraic_value_editing in pure PyTorch for efficiency on large models ☆11 · Jun 28, 2023 · Updated 2 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data. ☆10 · Feb 27, 2024 · Updated 2 years ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆20 · Apr 2, 2024 · Updated 2 years ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,407 · Apr 11, 2024 · Updated 2 years ago
- ☆11 · May 28, 2024 · Updated last year
- Reproduction resources for the linear alignment paper (work in progress) ☆18 · May 19, 2024 · Updated last year
- ☆20 · Dec 14, 2024 · Updated last year
- ☆13 · May 25, 2023 · Updated 2 years ago
- Machine learning project using federated learning for text generation ☆11 · May 5, 2024 · Updated last year
- The official implementation of Preference Data Reward-Augmentation. ☆18 · May 1, 2025 · Updated 11 months ago
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 · Sep 26, 2023 · Updated 2 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 · Jan 26, 2025 · Updated last year
- ☆19 · Jun 10, 2024 · Updated last year
- Unofficial Implementation of Evolutionary Model Merging ☆41 · Mar 28, 2024 · Updated 2 years ago
- some mixture of experts architecture implementations ☆27 · Mar 22, 2024 · Updated 2 years ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆181 · May 2, 2024 · Updated last year
- Federated Learning - PyTorch ☆15 · Jun 27, 2021 · Updated 4 years ago
- In this model I have created a basic AI chatbot Interface with External plugin abilities; with visual basic An Interface AI_Contracts en… ☆10 · May 2, 2021 · Updated 4 years ago
- Fun project to run your own LLM chat bot using llama.cpp ☆11 · Jun 9, 2023 · Updated 2 years ago
- Knowledge graph construction for core Chinese-language statistics journals; NEO4J + LLM application implementation; statistics Q&A corpus construction and LoRA fine-tuning of an LLM ☆14 · Nov 22, 2024 · Updated last year
- Dockerfile for AmigaOS Cross-Compiler Toolchain ☆11 · Mar 12, 2018 · Updated 8 years ago
- ☆13 · Apr 17, 2024 · Updated last year
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm… ☆16 · Dec 15, 2025 · Updated 4 months ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668. ☆12 · Jun 19, 2024 · Updated last year
- Adaptation of Google's official ROUGE implementation for use with Korean ☆18 · Jan 3, 2024 · Updated 2 years ago
- forza-telemetry-kafka-producer ☆10 · May 2, 2022 · Updated 3 years ago
- ☆23 · Oct 30, 2023 · Updated 2 years ago
- We conduct a preregistered experiment to investigate whether fact checks provided by a large language model can serve as an effective mis… ☆13 · Dec 14, 2024 · Updated last year
- Automata Theory. Building a RegExp machine ☆12 · May 10, 2019 · Updated 6 years ago
- ☆13 · May 18, 2024 · Updated last year
- Code for the paper "Deep FTRL-ORW: An Efficient Deep Reinforcement Learning Algorithm for Solving Imperfect Information Extensive-Form Ga… ☆11 · Dec 1, 2022 · Updated 3 years ago