MaxBelitsky / cache-steering
KV Cache Steering for Inducing Reasoning in Small Language Models
☆36 Updated 2 weeks ago
Alternatives and similar repositories for cache-steering
Users who are interested in cache-steering are comparing it to the libraries listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated 11 months ago
- ☆53 Updated 9 months ago
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation ☆33 Updated last week
- Verifiers for LLM Reinforcement Learning ☆69 Updated 3 months ago
- Lottery Ticket Adaptation ☆39 Updated 8 months ago
- The first dense retrieval model that can be prompted like an LM ☆82 Updated 3 months ago
- ☆49 Updated 5 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆64 Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated 2 weeks ago
- ReBase: Training Task Experts through Retrieval Based Distillation ☆29 Updated 6 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 10 months ago
- A repository for research on medium sized language models. ☆78 Updated last year
- ☆56 Updated 3 months ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆49 Updated 3 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆99 Updated 3 months ago
- ☆73 Updated 3 weeks ago
- ☆57 Updated 10 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated last year
- MEXMA: Token-level objectives improve sentence representations ☆41 Updated 7 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 6 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆59 Updated 8 months ago
- Aioli: A unified optimization framework for language model data mixing ☆27 Updated 6 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆81 Updated last week
- ☆75 Updated 3 months ago
- ☆48 Updated 11 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆60 Updated 8 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆86 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆34 Updated 5 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆49 Updated 9 months ago