☆119 · May 2, 2024 · Updated last year
Alternatives and similar repositories for opinions_qa
Users that are interested in opinions_qa are comparing it to the libraries listed below.
- Lightweight Adapting for Black-Box Large Language Models ☆25 · Feb 15, 2024 · Updated 2 years ago
- Code for the paper "CoS: Enhancing Personalization and Mitigating Bias with Context Steering" ☆20 · Dec 13, 2024 · Updated last year
- Code for the paper "Data Feedback Loops: Model-driven Amplification of Dataset Biases" ☆18 · Sep 9, 2022 · Updated 3 years ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆97 · May 25, 2023 · Updated 2 years ago
- ☆11 · Jul 7, 2023 · Updated 2 years ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆27 · Aug 21, 2024 · Updated last year
- Röttger et al. (2024): "IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance" ☆16 · Mar 6, 2026 · Updated 2 weeks ago
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆355 · Jun 18, 2023 · Updated 2 years ago
- [EMNLP 2024] "Revisiting Who's Harry Potter: Towards Targeted Unlearning from a Causal Intervention Perspective" ☆33 · Jul 22, 2024 · Updated last year
- ☆23 · Mar 8, 2024 · Updated 2 years ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆32 · Jun 16, 2024 · Updated last year
- Data and models for Misinfo Reaction Frames paper. ☆14 · Jun 9, 2024 · Updated last year
- YesBut - Multimodal Satire Comprehension Dataset ☆18 · Oct 23, 2024 · Updated last year
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models ☆10 · Oct 27, 2023 · Updated 2 years ago
- Data and code for APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations (COLING 2022)… ☆13 · Sep 8, 2022 · Updated 3 years ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆77 · Jan 16, 2026 · Updated 2 months ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 · Feb 11, 2025 · Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆16 · Dec 10, 2024 · Updated last year
- Emotion-Aware Dialogue Response Generation by Multi-Task Learning ☆13 · Jan 22, 2022 · Updated 4 years ago
- Code for "Goodtriever: Toxicity Mitigation with Retrieval-augmented Language Models" ☆25 · May 30, 2024 · Updated last year
- Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models" ☆25 · Dec 12, 2023 · Updated 2 years ago
- ☆15 · Oct 24, 2022 · Updated 3 years ago
- UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs (KDD'25) ☆26 · Jun 6, 2025 · Updated 9 months ago
- [NeurIPS 2023] Model-enhanced Vector Index ☆26 · May 9, 2024 · Updated last year
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆118 · Oct 23, 2023 · Updated 2 years ago
- DialOp: Decision-oriented dialogue environments for collaborative language agents ☆111 · Nov 15, 2024 · Updated last year
- Tools for understanding how transformer predictions are built layer-by-layer ☆576 · Aug 7, 2025 · Updated 7 months ago
- Röttger et al. (ACL 2021): "HateCheck: Functional Tests for Hate Speech Detection Models" - Data ☆59 · Oct 14, 2025 · Updated 5 months ago
- ☆12 · May 18, 2022 · Updated 3 years ago
- ☆19 · Jun 21, 2025 · Updated 9 months ago
- Python standalone tokenizer ☆15 · Nov 12, 2015 · Updated 10 years ago
- ☆18 · Oct 12, 2022 · Updated 3 years ago
- My Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks using Tensor… ☆12 · Mar 18, 2022 · Updated 4 years ago
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs" ☆67 · Jun 9, 2025 · Updated 9 months ago
- Reinforcement Learning via Regressing Relative Rewards ☆40 · Dec 12, 2024 · Updated last year
- ☆11 · Jul 21, 2022 · Updated 3 years ago
- ☆32 · Mar 13, 2025 · Updated last year
- Python library for argument and configuration management ☆56 · Feb 7, 2023 · Updated 3 years ago