socialfoundations / tttlm
Test-time-training on nearest neighbors for large language models
☆49 Updated last year
Alternatives and similar repositories for tttlm
Users who are interested in tttlm are comparing it to the libraries listed below:
- A Sober Look at Language Model Reasoning ☆92 Updated 2 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆124 Updated 10 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆64 Updated 5 months ago
- ☆51 Updated 2 years ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆195 Updated last year
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆152 Updated 6 months ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆58 Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 Updated last year
- ☆78 Updated last year
- ☆53 Updated 9 months ago
- ☆103 Updated 2 years ago
- Function Vectors in Large Language Models (ICLR 2024) ☆190 Updated 9 months ago
- ☆34 Updated 8 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆163 Updated 7 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆120 Updated last year
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆52 Updated 9 months ago
- ☆51 Updated 2 years ago
- ☆29 Updated last year
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024] ☆39 Updated last year
- RL with Experience Replay ☆54 Updated 6 months ago
- ☆41 Updated 2 years ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆110 Updated 3 months ago
- ☆51 Updated 2 years ago
- ☆145 Updated 4 months ago
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs" ☆29 Updated 3 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆147 Updated last year
- Learning adapter weights from task descriptions ☆19 Updated 2 years ago
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers ☆26 Updated 2 years ago
- Reinforcing General Reasoning without Verifiers ☆93 Updated 7 months ago
- ☆18 Updated last year