MIRALab-USTC / LLMReasoning-SpecSearch
This is the source code of our ICML25 paper, titled "Accelerating Large Language Model Reasoning via Speculative Search".
☆21 · Updated 7 months ago
Alternatives and similar repositories for LLMReasoning-SpecSearch
Users interested in LLMReasoning-SpecSearch are comparing it to the repositories listed below.
- This is the code for our ICLR 2025 paper, titled "Computing Circuits Optimization via Model-Based Circuit Genetic Evolution". ☆12 · Updated 7 months ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆16 · Updated 8 months ago
- [WSDM'24 Oral] The official implementation of the paper "DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting" ☆23 · Updated last year
- ☆33 · Updated 9 months ago
- Code Repository of Evaluating Quantized Large Language Models ☆136 · Updated last year
- Curated collection of papers in MoE model inference ☆329 · Updated 2 months ago
- Awesome list for LLM pruning. ☆280 · Updated 3 months ago
- Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25) ☆70 · Updated 8 months ago
- ☆26 · Updated last year
- ☆113 · Updated 2 years ago
- This repo contains the code for studying the interplay between quantization and sparsity methods ☆26 · Updated 10 months ago
- An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models. ☆23 · Updated 5 months ago
- Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity" ☆76 · Updated 6 months ago
- [ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆122 · Updated 6 months ago
- ☆217 · Updated 2 months ago
- Awesome LLM pruning papers: an all-in-one repository integrating useful resources and insights. ☆141 · Updated 5 months ago
- Code release for AdapMoE accepted by ICCAD 2024 ☆35 · Updated 8 months ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆178 · Updated last year
- Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes. ☆405 · Updated 10 months ago
- ☆17 · Updated last year
- Official code implementation for the ICLR 2025 paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives" ☆50 · Updated 2 months ago
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24) ☆24 · Updated last year
- ☆52 · Updated last year
- Awesome list for LLM quantization ☆378 · Updated 3 months ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co… ☆264 · Updated last month
- ☆26 · Updated last year
- [ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆87 · Updated 9 months ago
- some docs for rookies in nics-efc ☆22 · Updated 3 years ago
- [DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive La… ☆76 · Updated last year
- ☆33 · Updated 3 weeks ago