antgroup / OmniKV
Dynamic Context Selection for Efficient Long-Context LLMs
☆56 · May 20, 2025 · Updated 8 months ago
Alternatives and similar repositories for OmniKV
Users who are interested in OmniKV are comparing it to the libraries listed below.
- ☆28 · Jul 20, 2020 · Updated 5 years ago
- Experimental Vega Dataflow Visualization ☆21 · Jul 28, 2016 · Updated 9 years ago
- [NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone. ☆45 · Oct 29, 2025 · Updated 3 months ago
- ☆14 · Apr 7, 2017 · Updated 8 years ago
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ☆48 · Jul 29, 2025 · Updated 6 months ago
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆56 · Nov 20, 2024 · Updated last year
- ☆22 · Mar 7, 2025 · Updated 11 months ago
- Other than papers from big-name labs and universities, most AI research papers get less than 10 readers, even though there might be gems … ☆15 · Jul 20, 2018 · Updated 7 years ago
- ☆19 · Sep 24, 2025 · Updated 4 months ago
- ☆49 · Nov 25, 2024 · Updated last year
- Getting Started with NIMBUS-CORE ☆10 · Dec 16, 2023 · Updated 2 years ago
- Dataset for Visually Indicated Sound Generation by Perceptually Optimized Classification ☆22 · Apr 6, 2020 · Updated 5 years ago
- Distributed Bayesian Optimization ☆23 · Jun 29, 2020 · Updated 5 years ago
- A simple but well-performing "single-hop" visual attention model for the GQA dataset ☆20 · Aug 8, 2019 · Updated 6 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆13 · Jun 28, 2025 · Updated 7 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆121 · May 19, 2025 · Updated 8 months ago
- Website for TextVQA dataset. ☆28 · Apr 30, 2023 · Updated 2 years ago
- Comparison of Variational Autoencoders with Bayesian Neural Networks. Accuracy, Latent space, Reconstruction and White Noise filtering. ☆28 · Feb 16, 2018 · Updated 8 years ago
- [DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference" ☆101 · Dec 15, 2025 · Updated 2 months ago
- Supporting code for ReCEval paper ☆31 · Sep 14, 2024 · Updated last year
- ☆83 · Oct 9, 2024 · Updated last year
- [SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference ☆82 · Dec 7, 2025 · Updated 2 months ago
- Code for CVPR'18 "Grounding Referring Expressions in Images by Variational Context" ☆30 · Jul 4, 2018 · Updated 7 years ago
- commandline epub reader using python/curses ☆29 · Aug 26, 2020 · Updated 5 years ago
- ☆13 · Jan 28, 2026 · Updated 2 weeks ago
- ☆151 · Oct 9, 2024 · Updated last year
- ☆33 · Oct 8, 2020 · Updated 5 years ago
- BISON: Binary Image SelectiON ☆49 · Sep 15, 2021 · Updated 4 years ago
- An experimentation platform for LLM inference optimisation ☆35 · Sep 19, 2024 · Updated last year
- ☆12 · Apr 14, 2025 · Updated 10 months ago
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference ☆372 · Jul 10, 2025 · Updated 7 months ago
- ☆43 · Dec 28, 2025 · Updated last month
- Command line tool for common video manipulations ☆41 · Oct 25, 2023 · Updated 2 years ago
- ☆10 · Nov 17, 2022 · Updated 3 years ago
- ant design mobile components ☆10 · Jul 26, 2021 · Updated 4 years ago
- Disable YubiKey output on MacOS without a modifier key pressed ☆10 · Aug 10, 2022 · Updated 3 years ago
- Datacenter simulation toolkit for the OpenDC project ☆10 · Aug 24, 2020 · Updated 5 years ago
- ECG analysis to classify anterior myocardial infarction cases. ☆10 · May 17, 2017 · Updated 8 years ago
- Source code for SWIFT, an efficient reward model. ☆18 · Jan 13, 2026 · Updated last month