BunsenFeng / model_swarm
☆16 · Updated 6 months ago
Alternatives and similar repositories for model_swarm
Users interested in model_swarm are comparing it to the repositories listed below.
- Official Implementation of "Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning" at EMNLP 2024 Main Conf… ☆29 · Updated 4 months ago
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models" ☆19 · Updated last week
- The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization" ☆10 · Updated 11 months ago
- Direct preference optimization with f-divergences. ☆13 · Updated 7 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆79 · Updated 9 months ago
- Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples ☆36 · Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆120 · Updated 8 months ago
- ☆19 · Updated last year
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆79 · Updated 2 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆69 · Updated 5 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆45 · Updated 7 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective" ☆20 · Updated last year
- GenRM-CoT: Data release for verification rationales ☆61 · Updated 7 months ago
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆25 · Updated 2 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆111 · Updated 8 months ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ☆36 · Updated 10 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆89 · Updated last week
- ☆49 · Updated last year
- Awesome-Efficient-Inference-for-LRMs is a collection of state-of-the-art, novel, exciting, token-efficient methods for Large Reasoning Mo… ☆65 · Updated this week
- Official code for the paper Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation ☆20 · Updated last year
- ☆13 · Updated last year
- [COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning" ☆21 · Updated 11 months ago
- LaTeX Drawing ☆11 · Updated 2 years ago
- A Sober Look at Language Model Reasoning ☆63 · Updated last week
- Repo of the paper "Free Process Rewards without Process Labels" ☆149 · Updated 2 months ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆29 · Updated 6 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆72 · Updated 2 months ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆28 · Updated 4 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆59 · Updated 6 months ago
- ☆16 · Updated 7 months ago