This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation.
☆17 · Oct 25, 2024 · Updated last year
Alternatives and similar repositories for CoreInfer
Users who are interested in CoreInfer are comparing it to the libraries listed below.
- HippoMM: Hippocampal-inspired Multimodal Memory ☆15 · May 22, 2025 · Updated 9 months ago
- [ICML 2025] Official Repo for Stability-guided Adaptive Diffusion Acceleration. Accelerating off-the-shelf diffusion model with a uni… ☆39 · Jul 24, 2025 · Updated 7 months ago
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference ☆47 · Jun 4, 2024 · Updated last year
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs ☆52 · May 26, 2025 · Updated 9 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆40 · Feb 13, 2025 · Updated last year
- POSTECH: Compiler Construction (Spring 2022) ☆10 · Mar 10, 2023 · Updated 2 years ago
- Implement some method of LLM KV Cache Sparsity ☆41 · Jun 6, 2024 · Updated last year
- Repository for the DPP'23 course ☆11 · May 2, 2024 · Updated last year
- Generic library for neural collapse and several derivative works on the phenomenon. ☆18 · Apr 14, 2025 · Updated 10 months ago
- [NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning ☆87 · Nov 29, 2025 · Updated 3 months ago
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention ☆52 · Aug 6, 2025 · Updated 6 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)": ☆44 · Feb 18, 2026 · Updated last week
- Finetuning LLaMA with DeepSpeed ☆10 · Apr 14, 2023 · Updated 2 years ago
- Pytorch code of [CVPR 2023] "NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction". ☆11 · Mar 14, 2023 · Updated 2 years ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- Code for the paper "Knowledge-Aware Federated Active Learning with Non-IID Data", ICCV2023 ☆10 · Sep 8, 2023 · Updated 2 years ago
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning ☆13 · Sep 2, 2024 · Updated last year
- Source code of FedAttack. ☆11 · Feb 9, 2022 · Updated 4 years ago
- Website for CSE 234, Winter 2025 ☆13 · Mar 24, 2025 · Updated 11 months ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference ☆13 · Jun 7, 2025 · Updated 8 months ago
- Master thesis - reproducing state-of-the-art schema matching algorithms ☆14 · Jul 6, 2023 · Updated 2 years ago
- An Efficient Supply Chain Management System using Blockchain & Machine Learning. ☆10 · Nov 27, 2019 · Updated 6 years ago
- Tensorflow implementation of TrialAttack (Triple Adversarial Learning for Influence based Poisoning Attack in Recommender Systems. KDD 20… ☆12 · Sep 2, 2021 · Updated 4 years ago
- Hal Daume's hbc ☆20 · Jan 23, 2010 · Updated 16 years ago
- Smoothing video traffic to make it a friendlier internet neighbor ☆14 · Apr 23, 2024 · Updated last year
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆62 · Feb 21, 2025 · Updated last year
- An Attention Superoptimizer ☆22 · Jan 20, 2025 · Updated last year
- A Collection of Parallel Algorithms for Computational Geometry ☆12 · Mar 10, 2022 · Updated 3 years ago
- 삼각형의 실전! Triton (roughly, "Triangle in Practice! Triton") ☆16 · Feb 15, 2024 · Updated 2 years ago
- [CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression ☆15 · Jul 1, 2024 · Updated last year
- Satellite images classification ☆15 · Nov 30, 2019 · Updated 6 years ago
- Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe… ☆13 · Jan 29, 2025 · Updated last year