tairov / lamatune
LLama implementations benchmarking framework
☆12 Updated last year
Alternatives and similar repositories for lamatune:
Users who are interested in lamatune compare it to the libraries listed below.
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆24 Updated 2 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated last month
- ANE accelerated embedding models! ☆15 Updated last month
- Binary vector search example using Unum's USearch engine and pre-computed Wikipedia embeddings from Co:here and MixedBread ☆18 Updated 9 months ago
- Light WebUI for lm.rs ☆23 Updated 3 months ago
- Proof of concept for a generative AI application framework powered by WebAssembly and Extism ☆14 Updated last year
- ☆15 Updated last year
- Rust Implementation of micrograd ☆51 Updated 6 months ago
- Tensor library for Zig ☆10 Updated 2 months ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆66 Updated last year
- A Python command-line tool to download & manage MLX AI models from Hugging Face. ☆17 Updated 5 months ago
- Training hybrid models for dummies. ☆18 Updated 2 weeks ago
- ☆38 Updated 10 months ago
- Iterate quickly with llama.cpp hot reloading. Use the llama.cpp bindings with bun.sh ☆47 Updated last year
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆70 Updated last month
- ☆25 Updated last month
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs ☆40 Updated 2 weeks ago
- Alternative way of calculating self-attention ☆18 Updated 8 months ago
- Very minimal (and stateless) agent framework ☆41 Updated 2 weeks ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆12 Updated 5 months ago
- Rust implementation of Surya ☆56 Updated 3 weeks ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 Updated last year
- Latent Large Language Models ☆17 Updated 5 months ago
- Using modal.com to process FineWeb-edu data ☆19 Updated last month
- The official evaluation suite and dynamic data release for MixEval. ☆10 Updated 4 months ago
- First token cutoff sampling inference example ☆29 Updated last year
- The Swarm Ecosystem ☆19 Updated 5 months ago
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp ☆43 Updated 8 months ago