Infini-AI-Lab/APE

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Infini-AI-Lab/APE)

Infini-AI-Lab / APE

☆37

Alternatives and similar repositories for APE

Users that are interested in APE are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

LighT-chenml / GPHash
View on GitHub
☆14Dec 20, 2024Updated last year
Infini-AI-Lab / vortex_torch
View on GitHub
Vortex: Programmable Sparse Attention for Agents as Algorithm Designers
☆68Jun 24, 2026Updated 3 weeks ago
thunlp / APB
View on GitHub
Official Implementation of APB (ACL 2025 main Oral) and Spava (ACL 2026 main).
☆37Apr 6, 2026Updated 3 months ago
TsinghuaC3I / FS-GEN
View on GitHub
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.
☆13Nov 19, 2024Updated last year
thu-nics / PM-KVQ
View on GitHub
The official code implementation for paper "PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs"
☆29May 24, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
SqueezeAILab / SqueezedAttention
View on GitHub
[ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference
☆58Nov 20, 2024Updated last year
imagination-research / LCSC
View on GitHub
[ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
☆16Feb 15, 2025Updated last year
gaurav16gupta / constrainedANN
View on GitHub
☆14Jan 20, 2025Updated last year
cudbg / sqltutor
View on GitHub
☆12Oct 5, 2022Updated 3 years ago
yliu-cs / PiTe
View on GitHub
[ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model
☆17Feb 13, 2025Updated last year
shinington / facesec
View on GitHub
Corresponding code to "FACESEC: A Fine-grained Robustness Evaluation Framework for Face Recognition Systems" @ CVPR 2021
☆13Jun 22, 2021Updated 5 years ago
Infini-AI-Lab / MagicPIG
View on GitHub
[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
☆255Dec 16, 2024Updated last year
MIRALab-USTC / LLM-AttentionPredictor
View on GitHub
The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch…
☆29Jul 15, 2025Updated last year
amodaresi / MemLLM
View on GitHub
☆13Aug 13, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ulab-uiuc / diagram-eval
View on GitHub
[EMNLP 2025] DiagramEval: Evaluating LLM-Generated Diagrams via Graphs
☆17Nov 1, 2025Updated 8 months ago
YSmart / YSmart
View on GitHub
Mirror of YSmart
☆14May 20, 2013Updated 13 years ago
SNU-ARC / WALTZ
View on GitHub
☆13May 9, 2023Updated 3 years ago
DavidFanzz / llm_decoding
View on GitHub
☆12Apr 25, 2025Updated last year
snu-mllab / KVzip
View on GitHub
[NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)
☆224Feb 11, 2026Updated 5 months ago
brightlaboratory / polydl
View on GitHub
☆11Jun 29, 2021Updated 5 years ago
lsds / MultiKernelBOSS
View on GitHub
☆15Aug 15, 2025Updated 11 months ago
Owen718 / LongPrompt-LLamaGen
View on GitHub
This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…
☆30Oct 21, 2024Updated last year
atlarge-research / Performance-Characterization-Storage-Stacks
View on GitHub
Zebin Ren and Animesh Trivedi. 2023. Performance Characterization of Modern Storage Stacks: POSIX I/O, libaio, SPDK, and io_uring. In Pro…
☆15Mar 30, 2023Updated 3 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
thu-coai / BARREL
View on GitHub
[ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
☆18May 21, 2025Updated last year
yhzhou-pds / ATC23-Calcspar
View on GitHub
☆15Jan 10, 2024Updated 2 years ago
Xtra-Computing / DeltaBoost
View on GitHub
GBDT-based model with efficient unlearning (SIGMOD 2023)
☆10Sep 7, 2025Updated 10 months ago
kkkevinkkkkk / situated_faithfulness
View on GitHub
☆14Oct 17, 2024Updated last year
whyNLP / LCKV
View on GitHub
Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…
☆157Apr 7, 2025Updated last year
adslabcuhk / elect
View on GitHub
☆13Aug 1, 2025Updated 11 months ago
tilde-research / nsa-release
View on GitHub
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆133Jun 24, 2025Updated last year
Thinklab-SJTU / WSGNN
View on GitHub
Official PyTorch implementation for the following KDD2022 paper: Variational Inference for Training Graph Neural Networks in Low-Data Re…
☆20Oct 20, 2022Updated 3 years ago
uwdb / vss
View on GitHub
VSS: A Storage System for Video Analytics
☆13Jul 9, 2021Updated 5 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
MadryLab / bias-transfer
View on GitHub
☆15Jul 24, 2022Updated 3 years ago
harvard-cns / Harvard-CNS-Seminar
View on GitHub
Reading seminar in Harvard Cloud Networking and Systems Group
☆16Aug 29, 2022Updated 3 years ago
TIGER-AI-Lab / VISTA
View on GitHub
The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]
☆20Feb 27, 2025Updated last year
Dominic789654 / LongGenBench
View on GitHub
Source code for the paper "LongGenBench: Long-context Generation Benchmark"
☆24Oct 8, 2024Updated last year
chenyaofo / CCA-Attention
View on GitHub
☆20Aug 14, 2025Updated 11 months ago
Aiden0526 / Aristotle
View on GitHub
Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework".
☆28Oct 3, 2025Updated 9 months ago
illinois-impact / klap
View on GitHub
A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches
☆15Jun 21, 2019Updated 7 years ago