tairov / lamatune
LLama implementations benchmarking framework
☆12 · Updated last year
Alternatives and similar repositories for lamatune:
Users who are interested in lamatune are comparing it to the libraries listed below.
- ANE accelerated embedding models! ☆17 · Updated 3 months ago
- Proof of concept for a generative AI application framework powered by WebAssembly and Extism ☆14 · Updated last year
- Light WebUI for lm.rs ☆23 · Updated 5 months ago
- Tensor library for Zig ☆11 · Updated 4 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆25 · Updated 4 months ago
- iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh ☆48 · Updated last year
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆73 · Updated 3 months ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 · Updated last year
- ☆31 · Updated last year
- ☆15 · Updated last year
- A python command-line tool to download & manage MLX AI models from Hugging Face. ☆17 · Updated 6 months ago
- ☆25 · Updated 3 months ago
- First token cutoff sampling inference example ☆29 · Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆28 · Updated 2 months ago
- Using modal.com to process FineWeb-edu data ☆20 · Updated 2 weeks ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 · Updated 3 months ago
- Port of Facebook's LLaMA model in C/C++ ☆32 · Updated last year
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)Dora fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆37 · Updated 3 weeks ago
- A Learning Journey: Micrograd in Mojo 🔥 ☆61 · Updated 5 months ago
- ☆22 · Updated 9 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆56 · Updated last year
- ☆38 · Updated last year
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ☆44 · Updated 10 months ago
- ☆22 · Updated 5 months ago