Picovoice / llm-compression-benchmark
LLM Compression Benchmark
☆21 Updated 2 months ago
Alternatives and similar repositories for llm-compression-benchmark:
Users who are interested in llm-compression-benchmark compare it to the libraries listed below.
- Implementation of Mamba in Rust ☆85 Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 3 months ago
- ☆39 Updated last year
- Code from our practical dive into using Mamba for information extraction ☆54 Updated last year
- Fast parallel LLM inference for MLX ☆184 Updated 9 months ago
- ☆52 Updated 2 months ago
- ☆129 Updated 8 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 Updated 3 months ago
- Chat Markup Language conversation library ☆55 Updated last year
- A guidance compatibility layer for llama-cpp-python ☆34 Updated last year
- ☆66 Updated 11 months ago
- Jupyter notebooks for cloud-based usage ☆10 Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 7 months ago
- ☆27 Updated 8 months ago
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX ☆53 Updated last year
- ☆12 Updated 7 months ago
- High-level library for batched embeddings generation, blazingly fast web-based RAG, and quantized index processing ⚡ ☆66 Updated 5 months ago
- A lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆28 Updated 10 months ago
- A pipeline that uses API calls to convert unstructured data into structured training data, agnostic of the API provider ☆30 Updated 7 months ago
- ☆32 Updated last year
- Train your own small BitNet model ☆67 Updated 6 months ago
- Run Ollama and GGUF models easily with a single command ☆50 Updated 11 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 Updated 7 months ago
- Training hybrid models for dummies. ☆20 Updated 3 months ago
- Tools for formatting large language model prompts. ☆13 Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆28 Updated 3 months ago
- o1lama: Use Ollama with Llama 3.2 3B and other models locally to create reasoning chains that are similar in appearance to OpenAI's o1. ☆23 Updated 6 months ago
- MLX implementation of the xLSTM model by Beck et al. (2024) ☆27 Updated 10 months ago
- Some simple scripts that I use day-to-day when working with LLMs and the Hugging Face Hub ☆160 Updated last year
- ☆48 Updated 5 months ago