☆36Feb 12, 2025Updated last year
Alternatives and similar repositories for APE
Users that are interested in APE are comparing it to the libraries listed below
- ☆11Dec 20, 2024Updated last year
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch…☆28Jul 15, 2025Updated 8 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆31Oct 9, 2025Updated 5 months ago
- Official Implementation of APB (ACL 2025 main Oral) and Spava.☆35Jan 30, 2026Updated last month
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.☆13Nov 19, 2024Updated last year
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding"☆61Dec 26, 2025Updated 2 months ago
- ☆14Jan 20, 2025Updated last year
- ☆23Jan 27, 2014Updated 12 years ago
- ☆22Oct 25, 2024Updated last year
- An efficient implementation of the NSA (Native Sparse Attention) kernel☆131Jun 24, 2025Updated 8 months ago
- ☆20Jun 1, 2025Updated 9 months ago
- ☆10Aug 25, 2025Updated 6 months ago
- ☆33Oct 13, 2025Updated 5 months ago
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆21Jun 13, 2025Updated 9 months ago
- [ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs☆17May 21, 2025Updated 10 months ago
- [ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation☆251Dec 16, 2024Updated last year
- ☆11Aug 13, 2024Updated last year
- ☆14Apr 14, 2025Updated 11 months ago
- ☆13May 9, 2023Updated 2 years ago
- Mirror of YSmart☆14May 20, 2013Updated 12 years ago
- ☆13Aug 1, 2025Updated 7 months ago
- 🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…☆132Nov 10, 2025Updated 4 months ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆157Apr 7, 2025Updated 11 months ago
- ☆12Apr 25, 2025Updated 10 months ago
- ☆15Aug 15, 2025Updated 7 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- ☆34Oct 9, 2025Updated 5 months ago
- Test equality between a black-box LLM API and a reference distribution☆12Oct 29, 2024Updated last year
- ☆56May 19, 2025Updated 10 months ago
- GBDT-based model with efficient unlearning (SIGMOD 2023)☆10Sep 7, 2025Updated 6 months ago
- MSVBASE is a system that efficiently supports complex queries of both approximate similarity search and relational operators. It integrat…☆103Nov 19, 2024Updated last year
- ☆15Jul 24, 2022Updated 3 years ago
- ☆12Sep 1, 2023Updated 2 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- Corresponding code to "Improving Robustness of ML Classifiers against Realizable Evasion Attacks Using Conserved Features" @ USENIX Secur…☆11Aug 5, 2019Updated 6 years ago
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…☆25Jun 4, 2025Updated 9 months ago
- ☆29Mar 24, 2025Updated 11 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆55Mar 9, 2025Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago