This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation.
☆17 · Oct 25, 2024 · Updated last year
Alternatives and similar repositories for CoreInfer
Users that are interested in CoreInfer are comparing it to the libraries listed below.
- HippoMM: Hippocampal-inspired Multimodal Memory ☆22 · May 22, 2025 · Updated last year
- [ICML 2025] Official Repo for Stability-guided Adaptive Diffusion Acceleration. Accelerating off-the-shelf diffusion model with a uni… ☆43 · Jul 24, 2025 · Updated 9 months ago
- "Knock, knock!" "Who's there?" "Dobi." ☆17 · Aug 11, 2025 · Updated 9 months ago
- [ICLR 2025] Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives ☆53 · Oct 19, 2025 · Updated 7 months ago
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs ☆56 · May 26, 2025 · Updated 11 months ago
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference ☆48 · Jun 4, 2024 · Updated last year
- Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe… ☆13 · Jan 29, 2025 · Updated last year
- [NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems ☆16 · Nov 1, 2025 · Updated 6 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆46 · Feb 28, 2026 · Updated 2 months ago
- An implementation of 5-stages RISC-V CPU ☆12 · Jul 22, 2022 · Updated 3 years ago
- Master thesis - reproducing state-of-the-art schema matching algorithms ☆14 · Jul 6, 2023 · Updated 2 years ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference ☆14 · Jun 7, 2025 · Updated 11 months ago
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention ☆53 · Aug 6, 2025 · Updated 9 months ago
- PyTorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training" ☆13 · Jun 7, 2023 · Updated 2 years ago
- Finetuning LLaMA with DeepSpeed ☆10 · Apr 14, 2023 · Updated 3 years ago
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning ☆13 · Sep 2, 2024 · Updated last year
- [NeurIPS'25 Spotlight] Adaptive Attention Sparsity with Hierarchical Top-p Pruning ☆100 · Apr 20, 2026 · Updated last month
- ☆15 · Jan 27, 2026 · Updated 3 months ago
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy ☆73 · Jan 22, 2025 · Updated last year
- ☆118 · Nov 4, 2025 · Updated 6 months ago
- ☆10 · May 30, 2020 · Updated 5 years ago
- [CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression ☆15 · Jul 1, 2024 · Updated last year
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- Final project of Object-Oriented-Programming: STL allocator + memory pool ☆10 · Jun 22, 2019 · Updated 6 years ago
- An Efficient Supply Chain Management System using Blockchain & Machine Learning. ☆10 · Nov 27, 2019 · Updated 6 years ago
- This is our Computer Graphics course project in ZJU ☆13 · Apr 14, 2020 · Updated 6 years ago
- ☆17 · May 2, 2024 · Updated 2 years ago
- ☆21 · Oct 23, 2024 · Updated last year
- Rookie's guide ☆12 · Aug 10, 2024 · Updated last year
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆167 · Mar 8, 2026 · Updated 2 months ago
- CCKS2023-PromptCBLUE: Code implementation for the TianChi competition ☆20 · Feb 27, 2024 · Updated 2 years ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- TensorFlow implementation of TrialAttack (Triple Adversarial Learning for Influence based Poisoning Attack in Recommender Systems. KDD 20… ☆12 · Sep 2, 2021 · Updated 4 years ago
- PyTorch code of [CVPR 2023] "NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction". ☆11 · Mar 14, 2023 · Updated 3 years ago
- Satellite images classification ☆15 · Nov 30, 2019 · Updated 6 years ago
- Source code of FedAttack. ☆10 · Feb 9, 2022 · Updated 4 years ago
- Parallel Self-Adjusting Computation ☆16 · Jul 5, 2021 · Updated 4 years ago
- [ICLR 2025] Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models ☆74 · Mar 29, 2025 · Updated last year
- ☆16 · Feb 17, 2019 · Updated 7 years ago