eth-sri / language-model-arithmetic
Controlled Text Generation via Language Model Arithmetic
☆211 Updated last month
Related projects
Alternatives and complementary repositories for language-model-arithmetic
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024 ☆96 Updated this week
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ☆447 Updated 7 months ago
- [ICLR 2023] Codebase for Copy-Generator model, including an implementation of kNN-LM ☆181 Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆137 Updated this week
- A simple unified framework for evaluating LLMs ☆138 Updated this week
- This is work done by the Oxen.ai Community, trying to reproduce the Self-Rewarding Language Model paper from MetaAI. ☆109 Updated 6 months ago
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆434 Updated 9 months ago
- ☆294 Updated 5 months ago
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆216 Updated 7 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆143 Updated 3 weeks ago
- visualizing attention for LLM users ☆162 Updated last year
- a curated list of data for reasoning ai ☆111 Updated 3 months ago
- Experiments on speculative sampling with Llama models ☆117 Updated last year
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆160 Updated last month
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆200 Updated 5 months ago
- Evaluating LLMs with fewer examples ☆134 Updated 7 months ago
- The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models" ☆49 Updated 4 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆194 Updated last week
- ☆126 Updated last year
- ☆102 Updated last month
- ☆124 Updated 6 months ago
- DSIR large-scale data selection framework for language model training ☆227 Updated 7 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆114 Updated this week
- Code repository for the c-BTM paper ☆105 Updated last year
- [NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation ☆261 Updated last week
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆190 Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆212 Updated last year
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆276 Updated 2 months ago
- Improving Alignment and Robustness with Circuit Breakers ☆152 Updated last month
- Experiments for efforts to train a new and improved t5 ☆76 Updated 6 months ago