PrunaAI / awesome-ai-efficiency
A curated list of materials on AI efficiency
☆52 · Updated this week
Alternatives and similar repositories for awesome-ai-efficiency
Users who are interested in awesome-ai-efficiency are comparing it to the repositories listed below.
- I learn about and explain quantization ☆26 · Updated last year
- An introduction to LLM Sampling ☆78 · Updated 5 months ago
- LoRA and DoRA from Scratch Implementations ☆204 · Updated last year
- Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead… ☆703 · Updated this week
- ☆66 · Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆47 · Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 · Updated this week
- Google TPU optimizations for transformers models ☆112 · Updated 4 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆34 · Updated 3 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 · Updated 7 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆52 · Updated 6 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 · Updated 7 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆100 · Updated 3 months ago
- Using modal.com to process FineWeb-edu data ☆20 · Updated 2 months ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 · Updated last year
- Fine-tune an LLM to perform batch inference and online serving. ☆111 · Updated last week
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆72 · Updated 2 weeks ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 · Updated 5 months ago
- ☆61 · Updated 6 months ago
- My personal site ☆75 · Updated 10 months ago
- Gradio UI for a Cog API ☆66 · Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆95 · Updated 5 months ago
- Working implementation of DeepSeek MLA ☆42 · Updated 4 months ago
- ☆179 · Updated this week
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆62 · Updated 7 months ago
- A CLI for generating synthetic data ☆41 · Updated 3 weeks ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆85 · Updated last month
- LLM training in simple, raw C/CUDA ☆14 · Updated 6 months ago
- ☆77 · Updated last year
- ☆36 · Updated 2 weeks ago