JerryYin777 / PaperHelper
PaperHelper: Knowledge-Based LLM QA Paper Reading Assistant with Reliable References
☆14 · Updated 9 months ago
Alternatives and similar repositories for PaperHelper:
Users who are interested in PaperHelper are comparing it to the repositories listed below
- Code for the paper "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Updated 5 months ago
- Implementation for the paper "CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference" ☆16 · Updated 3 weeks ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆43 · Updated 4 months ago
- Paper list on efficient Mixture-of-Experts for LLMs ☆47 · Updated 3 months ago
- ☆34 · Updated 3 weeks ago
- Self-reproduction code for the paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL) ☆12 · Updated 10 months ago
- Survey of small language models ☆14 · Updated 8 months ago
- ☆36 · Updated last month
- ☆70 · Updated 2 weeks ago
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️ ☆35 · Updated 10 months ago
- The official implementation of "LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented…" ☆23 · Updated last month
- Official implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent" ☆49 · Updated 4 months ago
- Course materials for MIT 6.5940: TinyML and Efficient Deep Learning Computing ☆34 · Updated 2 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆63 · Updated last month
- ☆25 · Updated last month
- ZO2 (Zeroth-Order Offloading): Full-Parameter Fine-Tuning of 175B LLMs with 18GB GPU Memory ☆59 · Updated last week
- ☆14 · Updated last year
- [COLM 2024] SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning ☆30 · Updated 9 months ago
- ☆29 · Updated 4 months ago
- Inference code for the paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models" ☆41 · Updated 7 months ago
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed+Megatron+FlashAttention+CUDA fusion kernels+compiler] ☆36 · Updated last year
- Squeezed Attention: Accelerating Long-Prompt LLM Inference ☆45 · Updated 4 months ago
- PyTorch implementation of our ICML 2024 paper "CaM: Cache Merging for Memory-efficient LLMs Inference" ☆35 · Updated 9 months ago
- diagnosis_zero: reproducing R1-Zero on disease diagnosis ☆25 · Updated last month
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆42 · Updated last month
- Parameter-Efficient Fine-Tuning for Foundation Models ☆47 · Updated last month
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆18 · Updated last month
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆39 · Updated last year
- SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆23 · Updated 3 months ago
- Official repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆52 · Updated last month