groq / groqflow
GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing those programs on GroqChip™ processors.
☆106 Updated 2 months ago
Alternatives and similar repositories for groqflow:
Users who are interested in groqflow are comparing it to the libraries listed below:
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 Updated 2 months ago
- Distributed inference for MLX LLMs ☆82 Updated 6 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆88 Updated this week
- Fast parallel LLM inference for MLX ☆162 Updated 7 months ago
- look how they massacred my boy ☆63 Updated 4 months ago
- AI Assistant running within your browser. ☆59 Updated 2 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆46 Updated 4 months ago
- ☆86 Updated 4 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 8 months ago
- ☆65 Updated 8 months ago
- ☆123 Updated last week
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆30 Updated last year
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆23 Updated 7 months ago
- Routing on Random Forest (RoRF) ☆112 Updated 4 months ago
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX ☆51 Updated last year
- Run language models on consumer hardware. ☆25 Updated last year
- A simple experiment on letting two local LLMs have a conversation about anything! ☆104 Updated 7 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 8 months ago
- Inference code for mixtral-8x7b-32kseqlen ☆99 Updated last year
- ☆111 Updated last month
- ☆46 Updated 10 months ago
- ☆112 Updated 6 months ago
- Minimal, clean code implementation of RAG with MLX using GGUF model weights ☆48 Updated 9 months ago
- Transformer GPU VRAM estimator ☆49 Updated 10 months ago
- Docker image for NVIDIA GH200 machines, optimized for vLLM serving and HF Trainer fine-tuning ☆34 Updated this week
- ☆38 Updated 11 months ago
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- GRDN.AI app for garden optimization ☆70 Updated last year