Everything you need to know about LLM inference
☆268Mar 9, 2026Updated this week
Alternatives and similar repositories for llm-inference-handbook
Users that are interested in llm-inference-handbook are comparing it to the libraries listed below
- vyai – A lightweight CLI tool to interact with the Gemini API from the terminal.☆11Dec 8, 2025Updated 3 months ago
- Postgres extension that speeds up analytics queries by up to 90%☆52Jun 8, 2024Updated last year
- A parser to get the product, OS, device, cpu, and engine information from a user agent, inspired by https://github.com/faisalman/ua-parse…☆20Nov 24, 2025Updated 3 months ago
- VectorDB using dispersion models. Provides graph analysis, vector search, and energy-distribution stats for your vectors in one package.☆33Mar 2, 2026Updated last week
- Personal Site☆20Jan 11, 2026Updated last month
- The docs repository for Pulsar2, AXera's 2nd-generation AI toolchain for SoCs such as the AX650A and AX650N☆17Feb 12, 2026Updated 3 weeks ago
- Benchmark and optimize LLM inference across frameworks with ease☆169Sep 12, 2025Updated 5 months ago
- Monorepo☆31Aug 13, 2025Updated 6 months ago
- Rust Vector for large amounts of data, that does not copy when growing, by using full `mmap`'d pages.☆22Mar 15, 2024Updated last year
- A free audio player for your favorite music on VKontakte (http://vk.com).☆11Dec 5, 2014Updated 11 years ago
- Open-source LLM Prompt-Injection and Jailbreaking Playground☆30Jul 19, 2025Updated 7 months ago
- Token Downsampling optimization for stable-diffusion-webui☆26Apr 22, 2024Updated last year
- Your AI Chief of Staff — a personal operating system starter kit that adapts to your role. No coding required.☆128Updated this week
- Rust SDK for Threema Gateway.☆26Mar 2, 2026Updated last week
- High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized NumPy for deterministic, scalable generati…☆51Feb 15, 2026Updated 3 weeks ago
- Simple Agents Made Easy☆616Nov 5, 2025Updated 4 months ago
- Node-Based Robotics Framework Written in Rust☆71Oct 6, 2024Updated last year
- A Model Context Protocol (MCP) server that enables AI assistants to generate images, text, and audio through the Pollinations APIs. Suppo…☆39Feb 13, 2026Updated 3 weeks ago
- license is a small helper to add licenses to your work☆24May 6, 2016Updated 9 years ago
- Mobile adapter for iOS and Android for mobile LLM agents☆46Nov 24, 2024Updated last year
- SnapDocs - A Modern, Open-Source Document Workspace☆25Sep 7, 2025Updated 6 months ago
- Curated list of resources and tools to implement your PARA Method workflow.☆29Jan 4, 2024Updated 2 years ago
- Command line tool for Deep Infra cloud ML inference service☆34Jun 10, 2024Updated last year
- AutoGenBook is a Python-based tool that automatically generates books using LLMs. It creates chapters, sections, and subsections recurs…☆26Nov 3, 2024Updated last year
- Implement llama3 inference step by step: grasp the core concepts, master the derivation of the process, and write the code.☆625Feb 24, 2025Updated last year
- Your AI research assistant☆79Mar 31, 2025Updated 11 months ago
- Text Behind Video. Enjoy it; it is completely free.☆31Feb 15, 2025Updated last year
- just code snippets☆19Oct 16, 2016Updated 9 years ago
- Sutracli is an AI-powered code manager for coding agents. It spawns agents for multiple projects, connects repos through cross-indexing, …☆28Nov 7, 2025Updated 4 months ago
- Model Context Protocol Server for Apache OpenDAL™☆34Apr 10, 2025Updated 10 months ago
- A Python implementation for stitching images.☆231Oct 3, 2024Updated last year
- This project utilizes Generative Adversarial Networks (GANs) to tackle the problem of credit card fraud detection. GANs are a powerful de…☆12Oct 11, 2023Updated 2 years ago
- AI Agent Orchestration Kanban Board — Route tasks to Claude Code, Codex CLI, and Gemini CLI with role-based auto-assignment and real-time…☆39Feb 15, 2026Updated 3 weeks ago
- MCP as a Judge is a behavioral MCP that strengthens AI coding assistants by requiring explicit LLM evaluations☆16Dec 15, 2025Updated 2 months ago
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework☆342Nov 26, 2024Updated last year
- A transparent cognitive sandbox disguised as a digital pet squid with a neural network you can see thinking☆277Updated this week
- AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards☆752Oct 3, 2025Updated 5 months ago
- Fully neural approach for text chunking☆406Oct 23, 2025Updated 4 months ago
- Assembler for a 1 bit processor made around a ROM chip☆40Jan 21, 2024Updated 2 years ago