UmerHA / quanting-notes
I learn about and explain quantization
☆26Updated 10 months ago
Alternatives and similar repositories for quanting-notes:
Users who are interested in quanting-notes are comparing it to the libraries listed below
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 7 months ago
- An introduction to LLM Sampling☆76Updated 2 months ago
- Collection of autoregressive model implementations☆81Updated 3 weeks ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆35Updated 10 months ago
- ☆24Updated last year
- ☆48Updated last year
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.☆82Updated last year
- Repository containing awesome resources regarding Hugging Face tooling.☆46Updated last year
- ☆87Updated last year
- ☆84Updated 5 months ago
- Set of scripts to finetune LLMs☆36Updated 11 months ago
- ☆76Updated 9 months ago
- ☆48Updated 4 months ago
- ☆38Updated 7 months ago
- ☆19Updated 7 months ago
- QLoRA for Masked Language Modeling☆21Updated last year
- Cerule - A Tiny Mighty Vision Model☆67Updated 6 months ago
- Alternative way of calculating self-attention☆18Updated 9 months ago
- ☆63Updated 5 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆80Updated 9 months ago
- ☆34Updated last year
- ☆27Updated 3 months ago
- ☆18Updated 4 months ago
- [WIP] Transformer to embed Danbooru labelsets☆13Updated 11 months ago
- ML/DL Math and Method notes☆58Updated last year
- Highly commented implementations of Transformers in PyTorch☆132Updated last year
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆17Updated last month
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆198Updated 10 months ago
- Fast approximate inference on a single GPU with sparsity-aware offloading☆38Updated last year