☆26 · May 30, 2023 · Updated 2 years ago
Alternatives and similar repositories for reward_collapse
Users that are interested in reward_collapse are comparing it to the libraries listed below.
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- Domain-specific preference (DSP) data and customized RM fine-tuning. ☆25 · Mar 7, 2024 · Updated 2 years ago
- ☆15 · Jul 9, 2025 · Updated 8 months ago
- Code for Unsupervised multi-granular Chinese word segmentation and term discovery via graph partition [JBI] ☆16 · Jan 28, 2022 · Updated 4 years ago
- Code and data for "An Accurate Unsupervised Method for Joint Entity Alignment and Dangling Entity Detection". ☆15 · Mar 26, 2022 · Updated 4 years ago
- Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification [AI in Medicine Journal] ☆12 · May 20, 2022 · Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- ML Benchmarks in Algebraic Combinatorics ☆25 · Jan 15, 2026 · Updated 2 months ago
- Code for the AAAI 2020 oral paper - Dynamic Embedding on Textual Networks via a Gaussian Process. ☆12 · Mar 26, 2020 · Updated 6 years ago
- ☆18 · Mar 18, 2024 · Updated 2 years ago
- BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model [ACL-BioNLP 2022] ☆52 · Oct 26, 2022 · Updated 3 years ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO). ☆56 · Jun 3, 2024 · Updated last year
- Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding" ☆16 · Oct 24, 2022 · Updated 3 years ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆24 · Oct 10, 2023 · Updated 2 years ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆137 · Jul 8, 2024 · Updated last year
- ☆50 · Mar 14, 2024 · Updated 2 years ago
- Official repo for ACL 2023 paper Code4Struct: Code Generation for Few-Shot Structured Prediction from Natural Language. ☆43 · Jan 7, 2024 · Updated 2 years ago
- ☆13 · Jul 2, 2025 · Updated 8 months ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings. ☆31 · Dec 6, 2023 · Updated 2 years ago
- ☆77 · Apr 29, 2024 · Updated last year
- Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023) ☆64 · Nov 30, 2023 · Updated 2 years ago
- ☆30 · Jun 19, 2023 · Updated 2 years ago
- A large-scale, fine-grained, diverse preference dataset (and models). ☆364 · Dec 29, 2023 · Updated 2 years ago
- ☆25 · Jun 10, 2025 · Updated 9 months ago
- Calculating Expected Time for training LLM. ☆38 · Apr 17, 2023 · Updated 2 years ago
- LLMPerf is a library for validating and benchmarking LLMs ☆11 · Aug 13, 2024 · Updated last year
- Instruct-tuning LLaMA on consumer hardware with machine-translated data ☆19 · Apr 17, 2023 · Updated 2 years ago
- Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs ☆40 · Jan 30, 2024 · Updated 2 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆27 · Sep 10, 2024 · Updated last year
- Debian packaging for NNCP [archived], moved to https://salsa.debian.org/go-team/packages/nncp ☆14 · Feb 18, 2023 · Updated 3 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003) ☆23 · Oct 2, 2025 · Updated 5 months ago
- An official implementation of EHRDiff [TMLR] ☆33 · Jun 25, 2024 · Updated last year
- Official Code Repository for the paper "KALA: Knowledge-Augmented Language Model Adaptation" (NAACL 2022) ☆35 · Oct 17, 2023 · Updated 2 years ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task. ☆15 · Sep 4, 2024 · Updated last year
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆211 · Jul 31, 2023 · Updated 2 years ago
- Official PyTorch implementation of "CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning" @ ICCV 2023 ☆39 · Oct 16, 2025 · Updated 5 months ago
- [EMNLP 2024] Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction ☆17 · Nov 9, 2024 · Updated last year
- Multi-hop Evidence Retrieval for Cross-document Relation Extraction ☆11 · Sep 1, 2023 · Updated 2 years ago
- ☆42 · Mar 26, 2025 · Updated last year