BunsenFeng / model_swarm

☆18

Alternatives and similar repositories for model_swarm

Users who are interested in model_swarm are comparing it to the libraries listed below.

TamSiuhin / OPPU
Official Implementation of "Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning" at EMNLP 2024 Main Conf…
☆32 Updated 5 months ago
glorgao / SelectiveDPO
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
☆38 Updated 3 months ago
alecwangcq / f-divergence-dpo
Direct preference optimization with f-divergences.
☆14 Updated 8 months ago
lyh6560new / P3Sum
The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization"
☆10 Updated last year
guyuntian / CoT_benchmark
Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"
☆20 Updated 2 years ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆77 Updated last month
kanishkg / cognitive-behaviors
☆202 Updated 3 months ago
Joshua-Ren / Learning_dynamics_LLM
☆147 Updated 2 months ago
Arthur-Heng / NLGraph
Official repository of "Can Language Models Solve Graph Problems in Natural Language?". NeurIPS 2023 (Spotlight)
☆131 Updated 10 months ago
EIT-NLP / Awesome-Latent-CoT
This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.
☆134 Updated this week
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆83 Updated 10 months ago
YangRui2015 / RiC
Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment"
☆73 Updated last month
zepingyu0512 / awesome-SAE
awesome SAE papers
☆39 Updated last month
OSU-NLP-Group / Deductive-Beam-Search
[COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning"
☆21 Updated last year
jianghoucheng / AlphaEdit
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
☆282 Updated last week
FeiSun / LaTeX-Drawing
LaTeX Drawing
☆13 Updated last month
TsinghuaC3I / MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
☆157 Updated last month
Guangxuan-Xiao / GSM8K-eval
☆44 Updated last year
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆251 Updated this week
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆123 Updated 10 months ago
Alsace08 / Chain-of-Embedding
[ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"
☆68 Updated 6 months ago
tmlr-group / landscape-of-thoughts
[ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"
☆30 Updated 2 weeks ago
abhishekpanigrahi1996 / Skill-Localization-by-grafting
☆49 Updated last year
QingyangZhang / Label-Free-RLVR
☆242 Updated last week
JLZhong23 / awesome-reward-models
☆85 Updated last month
ShuoTang123 / MATRIX
Implementation of the MATRIX framework (ICML 2024)
☆56 Updated last year
YuxiXie / MCTS-DPO
This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.
☆318 Updated 11 months ago
ChenmienTan / malmen
☆34 Updated last year
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆154 Updated 4 months ago
wang2226 / Awesome-LLM-Decoding
📜 Paper list on decoding methods for LLMs and LVLMs
☆52 Updated 2 weeks ago