groq / groqflow
GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing those programs on GroqChip™ processors.
☆112 · Updated 2 months ago
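As a rough illustration of the workflow described above, the sketch below compiles a small PyTorch model with GroqFlow's `groqit()` entry point and runs the resulting Groq program. The `TinyModel` definition and input shapes are purely illustrative, and the exact API surface may vary by GroqFlow release.

```python
# Minimal sketch (assumed GroqFlow usage): build a small PyTorch model into a
# Groq program and execute it. The model and inputs here are hypothetical.
import torch
from groqflow import groqit  # GroqFlow's build entry point

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(128, 10)

    def forward(self, x):
        return self.fc(x)

pytorch_model = TinyModel()
inputs = {"x": torch.randn(1, 128)}

# groqit() compiles the workload into a Groq program; the returned model
# object is callable and executes on GroqChip processors.
groq_model = groqit(pytorch_model, inputs)
outputs = groq_model(**inputs)
print(outputs)
```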
Alternatives and similar repositories for groqflow
Users interested in groqflow are comparing it to the libraries listed below.
- Tutorial to get started with SkyPilot! ☆58 · Updated last year
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆40 · Updated 2 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆90 · Updated this week
- ☆89 · Updated last year
- ☆112 · Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆101 · Updated last year
- ☆123 · Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 · Updated last year
- ☆116 · Updated 9 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆223 · Updated last year
- A tree-based prefix cache library that allows rapid creation of looms: hierarchical branching pathways of LLM generations. ☆72 · Updated 7 months ago
- Transformer GPU VRAM estimator ☆66 · Updated last year
- ☆121 · Updated last year
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆58 · Updated last week
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆145 · Updated last year
- 🏥 Health monitor for a Petals swarm ☆39 · Updated last year
- Repository of model demos using TT-Buda ☆62 · Updated 6 months ago
- A Learning Journey: Micrograd in Mojo 🔥 ☆62 · Updated 11 months ago
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆79 · Updated 8 months ago
- LLM inference in C/C++ ☆102 · Updated last month
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP ☆126 · Updated 3 weeks ago
- ☆36 · Updated last year
- 1.58-bit LLaMa model ☆82 · Updated last year
- ☆141 · Updated last year
- ScalarLM - a unified training and inference stack ☆85 · Updated last week
- Fast parallel LLM inference for MLX ☆220 · Updated last year
- ☆67 · Updated last year
- Command line tool for the Deep Infra cloud ML inference service ☆33 · Updated last year
- ☆63 · Updated 9 months ago