groq / groqflow
GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing those programs on GroqChip™ processors.
☆99 · Updated last month
Related projects:
- ☆49 · Updated this week
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated 4 months ago
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" ☆103 · Updated 5 months ago
- One click templates for inferencing Language Models ☆97 · Updated last week
- 1.58-bit LLaMa model ☆77 · Updated 5 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆37 · Updated last month
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 · Updated this week
- Tutorial for building an LLM router ☆145 · Updated 2 months ago
- RAG example using DSPy, Gradio, FastAPI ☆57 · Updated 5 months ago
- Train your own small bitnet model ☆47 · Updated 3 months ago
- A lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆30 · Updated 3 months ago
- ☆64 · Updated 3 months ago
- ☆101 · Updated 6 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆97 · Updated 4 months ago
- Google TPU optimizations for transformers models ☆62 · Updated this week
- ☆26 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆48 · Updated 9 months ago
- Tutorial to get started with SkyPilot! ☆54 · Updated 4 months ago
- ☆32 · Updated 7 months ago
- ☆134 · Updated last week
- ☆50 · Updated 4 months ago
- Fast parallel LLM inference for MLX ☆118 · Updated 2 months ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆29 · Updated last year
- Repo hosting code and materials for speeding up LLM inference using token merging. ☆28 · Updated 4 months ago
- A toolkit for building AI agents that use devices ☆93 · Updated this week
- Inference code for mixtral-8x7b-32kseqlen ☆97 · Updated 9 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆53 · Updated this week
- GRDN.AI app for garden optimization ☆68 · Updated 7 months ago
- ☆95 · Updated this week
- 5X faster, 60% less memory QLoRA finetuning ☆21 · Updated 3 months ago