Algorithms for approximate attention in LLMs
☆21 · Apr 14, 2025 · Updated 10 months ago
Alternatives and similar repositories for loki
Users who are interested in loki compare it to the repositories listed below.
- Learning to Skip the Middle Layers of Transformers ☆17 · Aug 7, 2025 · Updated 6 months ago
- Distributed Deep Learning Tutorial ☆16 · Nov 17, 2025 · Updated 3 months ago
- An official repository for GPTailor ☆17 · Jun 29, 2025 · Updated 8 months ago
- Damselfly Network Simulator ☆10 · Nov 19, 2020 · Updated 5 years ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 8 months ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) ☆33 · Sep 28, 2025 · Updated 5 months ago
- ☆27 · Jun 5, 2025 · Updated 8 months ago
- A suite of communication proxies for HPC applications ☆13 · Jul 7, 2023 · Updated 2 years ago
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders ☆18 · May 23, 2025 · Updated 9 months ago
- CS194-196 Course Project ☆14 · Feb 20, 2025 · Updated last year
- ☆18 · Oct 12, 2022 · Updated 3 years ago
- Training vision models with full-batch gradient descent and regularization ☆39 · Feb 14, 2023 · Updated 3 years ago
- ☆33 · Jul 9, 2025 · Updated 7 months ago
- The official implementation of Preference Data Reward-Augmentation. ☆18 · May 1, 2025 · Updated 10 months ago
- Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach. This repository includes the implementation of… ☆17 · Jun 1, 2024 · Updated last year
- ☆16 · Jul 23, 2024 · Updated last year
- [NeurIPS 2024] An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆22 · Oct 10, 2024 · Updated last year
- ☆20 · Nov 4, 2025 · Updated 3 months ago
- The code of GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution ☆26 · May 16, 2024 · Updated last year
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- ☆20 · Oct 25, 2022 · Updated 3 years ago
- Trace Replay and Network Simulation Framework ☆21 · Apr 14, 2021 · Updated 4 years ago
- Analyze parallel execution traces using pandas dataframes ☆25 · Oct 22, 2025 · Updated 4 months ago
- ☆29 · Nov 9, 2025 · Updated 3 months ago
- Resa: Transparent Reasoning Models via SAEs ☆47 · Sep 23, 2025 · Updated 5 months ago
- ☆25 · Apr 10, 2025 · Updated 10 months ago
- Pytorch Datasets for Easy-To-Hard ☆29 · Jan 9, 2025 · Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- Parallel framework for training and fine-tuning deep neural networks ☆71 · Nov 10, 2025 · Updated 3 months ago
- Source code of paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing" ☆31 · Oct 24, 2024 · Updated last year
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆46 · Jul 24, 2025 · Updated 7 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆91 · Jul 17, 2025 · Updated 7 months ago
- Bayes-Adaptive RL for LLM Reasoning ☆45 · May 28, 2025 · Updated 9 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- ☆28 · Sep 13, 2021 · Updated 4 years ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year
- Chrome extension that restores the Dim (dark blue) background theme on X/Twitter ☆36 · Feb 19, 2026 · Updated last week
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification ☆32 · Mar 30, 2025 · Updated 11 months ago