JerryYin777 / PaperHelper
PaperHelper: Knowledge-Based LLM QA Paper Reading Assistant with Reliable References
☆9 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for PaperHelper
- An auxiliary project analyzing the characteristics of KV caches in DiT attention. ☆15 · Updated this week
- PyTorch implementation of our ICML 2024 paper -- CaM: Cache Merging for Memory-efficient LLMs Inference ☆26 · Updated 5 months ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆38 · Updated this week
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Updated last month
- Quantized Attention on GPU ☆30 · Updated 2 weeks ago
- ☆18 · Updated last week
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆24 · Updated 2 weeks ago
- A Sparse-tensor Communication Framework for Distributed Deep Learning ☆13 · Updated 3 years ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆34 · Updated 8 months ago
- Accepted LLM Papers in NeurIPS 2024 ☆22 · Updated last month
- Course project code for the graduate course "Theory and Applications of Network Big Data Management" ☆12 · Updated last year
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024) ☆20 · Updated 5 months ago
- ☆32 · Updated last week
- ☆42 · Updated 6 months ago
- Multi-Agent System for Science of Science ☆49 · Updated this week
- Official repository for "QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices" (IPDPS 2024) ☆19 · Updated 8 months ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Updated last year
- Source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024) ☆24 · Updated last month
- My learning notes/codes for ML SYS ☆135 · Updated this week
- GIFT: Generative Interpretable Fine-Tuning ☆18 · Updated last month
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆13 · Updated 4 months ago
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking" ☆41 · Updated 4 months ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆13 · Updated 10 months ago
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆32 · Updated 3 months ago
- Official implementation of SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction ☆38 · Updated last month
- ☆13 · Updated 7 months ago
- An Attention Superoptimizer ☆20 · Updated 6 months ago
- Odysseus: Playground of LLM Sequence Parallelism ☆57 · Updated 5 months ago
- Decoding Attention is specially optimized for multi-head attention (MHA) using CUDA cores for the decoding stage of LLM inference ☆25 · Updated 2 weeks ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …) ☆13 · Updated last year