Picovoice / llm-compression-benchmark
LLM Compression Benchmark
☆21Updated 8 months ago
Alternatives and similar repositories for llm-compression-benchmark:
Users who are interested in llm-compression-benchmark are comparing it to the libraries listed below.
- This project implements a demonstrator agent that compares the Cache-Augmented Generation (CAG) Framework with traditional Retrieval-Augm…☆24Updated last month
- Training hybrid models for dummies.☆18Updated 2 weeks ago
- ☆33Updated last year
- Iterate fast on your RAG pipelines☆22Updated last month
- Simple LLM inference server☆20Updated 7 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆38Updated 8 months ago
- ☆51Updated 2 months ago
- look how they massacred my boy☆63Updated 3 months ago
- A flexible, adaptive classification system for dynamic text classification☆53Updated last week
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆69Updated last year
- Chat Markup Language conversation library☆55Updated last year
- ☆65Updated 8 months ago
- ☆27Updated 5 months ago
- ☆82Updated last week
- Public reports detailing responses to sets of prompts by Large Language Models.☆29Updated 3 weeks ago
- ☆122Updated 5 months ago
- PageRank for LLMs☆35Updated this week
- ☆21Updated 7 months ago
- A guidance compatibility layer for llama-cpp-python☆34Updated last year
- Complex RAG backend☆28Updated 10 months ago
- Conduct in-depth research with AI-driven insights: DeepDive is a command-line tool that leverages web searches and AI models to generate…☆36Updated 5 months ago
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX☆51Updated 11 months ago
- Implementation of Mamba in Rust☆75Updated 10 months ago
- Function Calling Benchmark & Testing☆79Updated 6 months ago
- Using modal.com to process FineWeb-edu data☆19Updated last month
- A lightweight, open-source blueprint for building powerful and scalable LLM chat applications☆30Updated 7 months ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.☆41Updated 5 months ago
- Google TPU optimizations for transformers models☆90Updated last week
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX.☆70Updated last month
- MLX implementation of the xLSTM model by Beck et al. (2024)☆26Updated 7 months ago