keli-wen / AGI-Study
Blog posts, reading reports, and code examples for AGI/LLM-related knowledge.
☆28 · Updated last week
Alternatives and similar repositories for AGI-Study:
Users interested in AGI-Study are also comparing it to the repositories listed below.
- Multi-Candidate Speculative Decoding ☆33 · Updated 8 months ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆151 · Updated 7 months ago
- Related works and background techniques for OpenAI o1 ☆192 · Updated last week
- The official implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆54 · Updated 3 weeks ago
- Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes ☆183 · Updated last month
- The official code for the paper "Parallel Speculative Decoding with Adaptive Draft Length" ☆32 · Updated 4 months ago
- Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode; faster than ZeRO/ZeRO++/FSDP ☆92 · Updated 11 months ago
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open-source framework for evaluating foundation models ☆230 · Updated 2 months ago
- Awesome list for LLM quantization ☆156 · Updated 3 weeks ago
- A curated collection of noteworthy MLSys bloggers (algorithms/systems) ☆135 · Updated last week
- QAQ: Quality Adaptive Quantization for LLM KV Cache ☆44 · Updated 9 months ago
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs ☆237 · Updated last month
- A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of … ☆131 · Updated 6 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗) ☆256 · Updated this week
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙♂️ ☆35 · Updated 8 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆213 · Updated 2 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs) ☆236 · Updated 10 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆154 · Updated 7 months ago
- A flexible and efficient training framework for large-scale alignment tasks ☆272 · Updated this week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆127 · Updated 7 months ago
- Implementations of several LLM KV cache sparsity methods ☆30 · Updated 7 months ago