Picovoice / llm-compression-benchmark

LLM Compression Benchmark

☆22

Alternatives and similar repositories for llm-compression-benchmark

Users who are interested in llm-compression-benchmark are comparing it to the libraries listed below.

flawedmatrix / mamba-ssm
Implementation of Mamba in Rust
☆87 Updated last year
mgerstgrasser / tacheles
a lightweight, open-source blueprint for building powerful and scalable LLM chat applications
☆28 Updated last year
QuixiAI / grokadamw
☆134 Updated 11 months ago
teknium1 / LLM-Logbook
Public reports detailing responses to sets of prompts by Large Language Models.
☆30 Updated 6 months ago
abgulati / hf-waitress
Serving LLMs in the HF-Transformers format via a PyFlask API
☆71 Updated 10 months ago
kyegomez / OpenStrawberry
An open-source replication of the Strawberry method that leverages Monte Carlo Search with PPO and/or DPO
☆30 Updated this week
simonw / llm-command-r
Access the Cohere Command R family of models
☆37 Updated 3 months ago
monk1337 / auto-ollama
Run Ollama and GGUF models easily with a single command
☆52 Updated last year
fairydreaming / farel-bench
Testing LLM reasoning abilities with family relationship quizzes.
☆62 Updated 5 months ago
kyutai-labs / dactory
☆41 Updated 2 months ago
deployradiant / pychatml
Chat Markup Language conversation library
☆55 Updated last year
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆116 Updated 5 months ago
chigkim / Ollama-MMLU-Pro
☆95 Updated 6 months ago
zby / LLMEasyTools
Tools for LLM agents.
☆63 Updated 7 months ago
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
shobrook / weightgain
Train an adapter for any embedding model in under a minute
☆106 Updated 3 months ago
ritabratamaiti / AnyModal
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
☆100 Updated 6 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆80 Updated 2 months ago
nicholasyager / llama-cpp-guidance
A guidance compatibility layer for llama-cpp-python
☆35 Updated last year
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆81 Updated 2 months ago
distantmagic / llmops-handbook
Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu…
☆70 Updated 11 months ago
QuixiAI / kraken
☆66 Updated last year
ai-in-pm / CAG-Cache-Augmented-Generation
This project implements a demonstrator agent that compares the Cache-Augmented Generation (CAG) Framework with traditional Retrieval-Augm…
☆33 Updated 6 months ago
willccbb / mlx_parallm
Fast parallel LLM inference for MLX
☆198 Updated last year
inferless / triton-co-pilot
Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments
☆20 Updated last year
facebookresearch / fastgen
Simple high-throughput inference library
☆120 Updated 2 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆184 Updated 5 months ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆92 Updated 8 months ago
Nero10578 / LLM-Inference-Benchmark
☆14 Updated 10 months ago
abeleinin / mlx-xLSTM
MLX implementation of xLSTM model by Beck et al. (2024)
☆28 Updated last year