groq / groqflow
GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing those programs on GroqChip™ processors.
☆108 Updated 3 weeks ago
Alternatives and similar repositories for groqflow:
Users who are interested in groqflow are comparing it to the repositories listed below.
- ☆89 Updated 5 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 10 months ago
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" ☆127 Updated 11 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 Updated 3 weeks ago
- 1.58-bit LLaMa model ☆82 Updated 11 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 Updated this week
- Distributed inference for MLX LLMs ☆87 Updated 7 months ago
- Fast parallel LLM inference for MLX ☆174 Updated 8 months ago
- AI Assistant running within your browser. ☆62 Updated 3 months ago
- LLM inference in C/C++ ☆66 Updated this week
- GRDN.AI app for garden optimization ☆70 Updated last year
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆59 Updated 10 months ago
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX ☆53 Updated last year
- A tree-based prefix cache library that allows rapid creation of looms: hierarchical branching pathways of LLM generations. ☆67 Updated last month
- Repository of model demos using TT-Buda ☆63 Updated last week
- Tutorial to get started with SkyPilot! ☆57 Updated 10 months ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆137 Updated last year
- Minimal, clean code implementation of RAG with mlx using gguf model weights ☆49 Updated 11 months ago
- ☆111 Updated 3 months ago
- look how they massacred my boy ☆63 Updated 5 months ago
- ☆66 Updated 10 months ago
- Transformer GPU VRAM estimator ☆58 Updated last year
- ☆35 Updated last year
- ☆41 Updated 10 months ago
- ☆61 Updated last month
- ☆113 Updated 7 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆34 Updated this week
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆29 Updated last month
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 10 months ago