Jyk-122/D-LLM

[NeurIPS 2024] Implementation of paper - D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models

☆24

Alternatives and similar repositories for D-LLM

Users who are interested in D-LLM are comparing it to the repositories listed below.

michaelfeil / candle-flash-attn-v3
☆15 · Dec 21, 2025 · Updated 7 months ago
yifanlu0227 / LLaMA2-7B-on-laptop
Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.
☆18 · Dec 1, 2023 · Updated 2 years ago
machilusZ / FastGen
This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
☆44 · Aug 14, 2024 · Updated last year
dhcode-cpp / Engram-pytorch
PyTorch implementation of DeepSeek Engram
☆19 · Mar 24, 2026 · Updated 4 months ago
WHU-AISE / PBScaler
PBScaler: A Bottleneck-aware Autoscaling Framework for Microservice-based Applications
☆28 · Dec 2, 2024 · Updated last year
ranggihwang / Pregated_MoE
☆63 · May 4, 2024 · Updated 2 years ago
rtenlab / gcaps-super-repo
GCAPS: GPU Context-Aware Preemptive Scheduling Approach
☆16 · Mar 22, 2026 · Updated 4 months ago
Nota-NetsPresso / shortened-llm
Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]
☆90 · Sep 13, 2024 · Updated last year
cmd2001 / KVTuner
[ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
☆29 · Jan 27, 2026 · Updated 6 months ago
PKU-SEC-Lab / AdapMoE
Code release for AdapMoE accepted by ICCAD 2024
☆39 · Apr 28, 2025 · Updated last year
HaoKang-Timmy / torchanalyse
A PyTorch model profiler with information about MACs, energy, etc.
☆17 · Feb 24, 2024 · Updated 2 years ago
LeaperOvO / LUK
☆15 · May 14, 2025 · Updated last year
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
multimodal-art-projection / TreePO
☆65 · Mar 30, 2026 · Updated 3 months ago
EIT-NLP / SkipGPT
[ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". …
☆21 · Nov 17, 2025 · Updated 8 months ago
shoaibahmed / llm_depth_pruning
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15 · Jul 24, 2024 · Updated 2 years ago
pmem / pmem.github.io
The pmem.io Website
☆17 · Jan 20, 2026 · Updated 6 months ago
MIT-REALM / dcrl
Density Constrained Reinforcement Learning
☆12 · Mar 24, 2023 · Updated 3 years ago
runchu-tian / LongPiBench
The repository for the paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"
☆14 · Dec 16, 2024 · Updated last year
hegongshan / Storage-for-AI-Paper
Accelerating AI Training and Inference from Storage Perspective (Must-read Papers on Storage for AI)
☆64 · Jun 22, 2026 · Updated last month
bojieli / SocksDirect
SocksDirect code repository
☆20 · May 6, 2026 · Updated 2 months ago
apple / ml-compress-and-compare
Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments (VIS 2024)
☆27 · Jul 17, 2025 · Updated last year
FYYFU / HeadKV
[ICLR2025] Code and data for paper: Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasonin…
☆45 · Mar 10, 2025 · Updated last year
gogoczh / CoMT
code for "CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models"
☆19 · Mar 10, 2025 · Updated last year
Carrefour / carrefour-runtime
Carrefour runtime. Uses hardware counters to decide whether Carrefour needs to be run or not.
☆16 · Sep 29, 2015 · Updated 10 years ago
multifacet / Bypassd
Bypassd is a novel I/O architecture that provides low latency access to shared SSDs.
☆23 · May 14, 2025 · Updated last year
jlwu002 / BCL
[ICML 2022] Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum
☆12 · Jul 15, 2022 · Updated 4 years ago
zhougroup / IDAC
Implicit Distributional Actor Critic
☆11 · Dec 8, 2021 · Updated 4 years ago
YJHMITWEB / ExFlow
Explore Inter-layer Expert Affinity in MoE Model Inference
☆16 · May 6, 2024 · Updated 2 years ago
p-quic / ubpf
Implementation of the user-space eBPF VM based on the iovisor version (https://github.com/iovisor/ubpf)
☆13 · Apr 16, 2020 · Updated 6 years ago
dongwonjo / FastKV
[ACL Findings 2026] Official Implementation of "FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acc…
☆32 · Apr 14, 2026 · Updated 3 months ago
ZJUVAI / Freshman-Training
A training program for freshmen
☆15 · Jul 29, 2021 · Updated 5 years ago
rzezeski / libMicro
☆11 · May 26, 2020 · Updated 6 years ago
Carrefour / carrefour-module
This module collects per-page stats and decides for each page whether it should be migrated, replicated, or interleaved.
☆17 · Sep 29, 2015 · Updated 10 years ago
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆15 · Oct 31, 2025 · Updated 8 months ago
Infini-AI-Lab / Sirius
Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its…
☆21 · Sep 10, 2024 · Updated last year
AoiDragon / Awesome-Text-Diffusion-Models
[IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".
☆33 · Dec 21, 2023 · Updated 2 years ago
allenai / mosaic-leaderboard
Leaderboard implementations for datasets produced by the Mosaic Team.
☆20 · Jul 6, 2023 · Updated 3 years ago
SusCom-Lab / ZSMerge
☆23 · Sep 24, 2025 · Updated 10 months ago