Model souping for LLMs
☆72 · Nov 18, 2025 · Updated 4 months ago
Alternatives and similar repositories for llm_souping
Users that are interested in llm_souping are comparing it to the libraries listed below.
- A PyTorch implementation of the ICCV2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models ☆14 · Jul 15, 2021 · Updated 4 years ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… ☆23 · Oct 1, 2025 · Updated 5 months ago
- SGLang Kernel Wheel Index ☆17 · Updated this week
- An extension to the GaLore paper, to perform Natural Gradient Descent in low rank subspace ☆18 · Oct 21, 2024 · Updated last year
- MLX Implementation of Recursive Reasoning with Tiny Networks ☆79 · Oct 11, 2025 · Updated 5 months ago
- Standalone repo for our Atropos integration with Thinking Machines Tinker API (https://thinkingmachines.ai/tinker/) ☆20 · Updated this week
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization ☆38 · Sep 24, 2024 · Updated last year
- [ICCAD 2025] Squant ☆15 · Jul 3, 2025 · Updated 8 months ago
- Official repository for Activation-Informed Merging (AIM) of Large Language Models ☆22 · Feb 10, 2025 · Updated last year
- This repository contains code for the MicroAdam paper. ☆21 · Dec 14, 2024 · Updated last year
- LEMMA: Logical Engine for Multi-domain Mathematical Analysis ☆28 · Feb 14, 2026 · Updated last month
- Rethinking the Trust Region in LLM Reinforcement Learning ☆50 · Mar 2, 2026 · Updated 3 weeks ago
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents ☆67 · Mar 17, 2026 · Updated last week
- Code for our paper "Learning to Generate Unit Tests for Automated Debugging" ☆17 · Mar 7, 2025 · Updated last year
- Fast and memory-efficient exact attention ☆30 · Dec 2, 2024 · Updated last year
- REAP expert pruning for MoE LLMs on Apple Silicon via MLX ☆49 · Mar 16, 2026 · Updated last week
- ☆32 · Nov 18, 2025 · Updated 4 months ago
- Official PyTorch implementation for Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability [Neur… ☆15 · Jul 7, 2025 · Updated 8 months ago
- ☆13 · Jan 14, 2026 · Updated 2 months ago
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference" ☆22 · Feb 16, 2025 · Updated last year
- Official implementation of Categorical Flow Maps on text. ☆48 · Feb 16, 2026 · Updated last month
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement) ☆50 · Dec 15, 2023 · Updated 2 years ago
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization ☆24 · Mar 6, 2026 · Updated 2 weeks ago
- PyTorch implementation of Data2Vec self-supervised approach for vision use cases. ☆18 · Oct 7, 2022 · Updated 3 years ago
- [CVPR 2025] Official Implementation of LOCORE: Image Re-ranking with Long-Context ☆15 · Apr 15, 2025 · Updated 11 months ago
- A lightweight graphics library for the Elm programming language ☆15 · Jul 15, 2017 · Updated 8 years ago
- ☆41 · Feb 14, 2026 · Updated last month
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆119 · Dec 5, 2024 · Updated last year
- Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts" ☆11 · Nov 18, 2022 · Updated 3 years ago
- ☆93 · Oct 30, 2025 · Updated 4 months ago
- (Weekly Update) Python / Modern C++ Solutions of All 1643 LeetCode Problems ☆13 · Nov 3, 2020 · Updated 5 years ago
- [ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring … ☆19 · Jun 27, 2025 · Updated 9 months ago
- ☆14 · Oct 11, 2023 · Updated 2 years ago
- ☆18 · Mar 12, 2019 · Updated 7 years ago
- SWE-Exp: Experience-Driven Software Issue Resolution ☆38 · Oct 17, 2025 · Updated 5 months ago
- ☆17 · Aug 1, 2025 · Updated 7 months ago
- ☆15 · Dec 2, 2025 · Updated 3 months ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆19 · Feb 29, 2024 · Updated 2 years ago
- Code for the EACL 2024 paper: "Small Language Models Improve Giants by Rewriting Their Outputs" ☆12 · Apr 20, 2024 · Updated last year