evanmiller / LLM-Reading-List
LLM papers I'm reading, mostly on inference and model compression
☆730 Updated last year
Alternatives and similar repositories for LLM-Reading-List
Users interested in LLM-Reading-List are comparing it to the repositories listed below.
- What would you do with 1000 H100s... ☆1,050 Updated last year
- A comprehensive deep dive into the world of tokens ☆224 Updated 11 months ago
- ☆536 Updated 9 months ago
- An ML Systems Onboarding list ☆794 Updated 4 months ago
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆890 Updated last year
- Puzzles for exploring transformers ☆348 Updated 2 years ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆795 Updated last month
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆713 Updated last year
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,251 Updated 3 months ago
- Puzzles for learning Triton ☆1,671 Updated 6 months ago
- GPU programming related news and material links ☆1,540 Updated 5 months ago
- A simple and effective LLM pruning approach. ☆756 Updated 9 months ago
- Llama from scratch, or How to implement a paper without crying ☆567 Updated last year
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆721 Updated last year
- An interactive exploration of Transformer programming. ☆264 Updated last year
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,387 Updated last year
- Serving multiple LoRA finetuned LLM as one ☆1,062 Updated last year
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,005 Updated 9 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆256 Updated last year
- Building blocks for foundation models. ☆502 Updated last year
- Official implementation of Half-Quadratic Quantization (HQQ) ☆818 Updated this week
- Best practices for distilling large language models. ☆547 Updated last year
- The repository for the code of the UltraFastBERT paper ☆516 Updated last year
- A bibliography and survey of the papers surrounding o1 ☆1,194 Updated 6 months ago
- Notes from the Latent Space paper club. Follow along or start your own! ☆234 Updated 10 months ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,495 Updated last year
- Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript ☆582 Updated 11 months ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆345 Updated 10 months ago
- A benchmark to evaluate language models on questions I've previously asked them to solve. ☆1,014 Updated last month
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,534 Updated last year