BunsenFeng / model_swarm
☆14 Updated 5 months ago
Alternatives and similar repositories for model_swarm
Users who are interested in model_swarm compare it with the repositories listed below.
- Official Implementation of "Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning" at EMNLP 2024 Main Conf… ☆28 Updated 3 months ago
- ☆49 Updated last year
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models" ☆18 Updated 2 weeks ago
- Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples ☆33 Updated last month
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆110 Updated 8 months ago
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆76 Updated last month
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆22 Updated last month
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆25 Updated 3 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆120 Updated 8 months ago
- awesome SAE papers ☆27 Updated 2 months ago
- This is the official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable". ☆16 Updated 2 months ago
- ☆19 Updated last year
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆72 Updated 2 months ago
- ☆61 Updated last month
- code for EMNLP 2024 paper: Neuron-Level Knowledge Attribution in Large Language Models ☆32 Updated 6 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆40 Updated last month
- ☆23 Updated 5 months ago
- Implementation of the MATRIX framework (ICML 2024) ☆51 Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆22 Updated 6 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆57 Updated 5 months ago
- [COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning" ☆21 Updated 11 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆45 Updated this week
- Official code for paper Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation ☆20 Updated last year
- ☆106 Updated last week
- ☆13 Updated last year
- ☆24 Updated last month
- Direct preference optimization with f-divergences. ☆13 Updated 6 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆69 Updated 4 months ago
- GenRM-CoT: Data release for verification rationales ☆59 Updated 7 months ago
- Accepted LLM Papers in NeurIPS 2024 ☆37 Updated 7 months ago