☆18Mar 11, 2025Updated 11 months ago
Alternatives and similar repositories for TokenSelect
Users that are interested in TokenSelect are comparing it to the libraries listed below
- ☆14Jun 4, 2024Updated last year
- Adaptation of titans-pytorch to llama models on HF☆26Mar 6, 2025Updated 11 months ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 5 months ago
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"☆55Jul 16, 2024Updated last year
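- A complete system that automatically checks grades on the Southwest Jiaotong University academic affairs site and sends email notifications of new grades; can be shared with friends☆21Sep 6, 2020Updated 5 years ago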
- Official Code Repository for the paper "Key-value memory in the brain"☆31Feb 25, 2025Updated last year
- [SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference☆82Dec 7, 2025Updated 2 months ago
- HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model☆44Feb 16, 2025Updated last year
- An experimentation platform for LLM inference optimisation☆36Sep 19, 2024Updated last year
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor☆29Jan 13, 2026Updated last month
- TSDG: An efficient index graph for graph-based nearest neighbor search☆10Jul 14, 2022Updated 3 years ago
- A course for Mao Yisheng College of SWJTU☆11Mar 28, 2020Updated 5 years ago
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference☆57Nov 20, 2024Updated last year
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆174Jul 10, 2024Updated last year
- [NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning☆87Nov 29, 2025Updated 3 months ago
- The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Mem…☆395Apr 20, 2024Updated last year
- ☆52Jul 18, 2024Updated last year
- Clustered Compositional Embeddings☆11Oct 25, 2023Updated 2 years ago
- trending repositories and news related to AI☆10Mar 22, 2019Updated 6 years ago
- SGEMM and DGEMM subroutines using AVX512F instructions.☆15May 22, 2022Updated 3 years ago
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 11 months ago
- This repo contains the official code release of the Neural Experts paper, published in NeurIPS 2024.☆13Dec 3, 2024Updated last year
- Residual vector quantization for KV cache compression in large language model☆11Oct 22, 2024Updated last year
- ☆10Apr 9, 2021Updated 4 years ago
- Reference implementation of models from Nyonic Model Factory☆12May 13, 2024Updated last year
- A helper package to get information of scholarly articles from DBLP using its public API☆15May 13, 2025Updated 9 months ago
- Controlled Online Optimization Learning (COOL): Finding the Ground State of Spin Hamiltonians with Reinforcement Learning (arXiv:2003.000…☆13Jun 18, 2020Updated 5 years ago
- [COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-…☆17Oct 4, 2025Updated 4 months ago
- EXL2 quantization generalized to other models.☆10Mar 17, 2024Updated last year
- [ICLR 2025 SynthData Workshop Spotlight] Empowering LLMs in Decision Games through Algorithmic Data Synthesis☆26Apr 27, 2025Updated 10 months ago
- Benchmarking Deepseek R1 API response speeds across different providers for performance comparison.☆10Feb 15, 2025Updated last year
- ☆10Jun 5, 2018Updated 7 years ago
- ☆13Mar 27, 2019Updated 6 years ago
- ☆17Feb 3, 2026Updated 3 weeks ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- Code accompanying the paper "A contrastive rule for meta-learning"☆13Oct 31, 2024Updated last year
- Toolkit to help you do better research☆11Apr 19, 2019Updated 6 years ago
- NTU Coursera: Machine Learning Foundations (Hsuan-Tien Lin)☆15Nov 23, 2018Updated 7 years ago
- Our EMNLP 2022 paper on VIP-Based Prompting for Parameter-Efficient Learning☆10Oct 22, 2022Updated 3 years ago