leoheuler / flashtensors
☆440 Updated 2 months ago
Alternatives and similar repositories for flashtensors
Users who are interested in flashtensors are comparing it to the libraries listed below.
- Plug-and-play memory for LLMs in 3 lines of code. Add persistent, intelligent, human-like memory and recall to any model in minutes. ☆252 Updated 2 weeks ago
- A command-line interface tool for serving LLM using vLLM. ☆468 Updated last week
- Enhancing LLMs with LoRA ☆206 Updated 3 months ago
- Sparse Inferencing for transformer based LLMs ☆218 Updated 5 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆388 Updated last week
- llmbasedos — Local-First OS Where Your AI Agents Wake Up and Work ☆282 Updated 3 weeks ago
- Multi-agent autonomous research system using LangGraph and LangChain. Generates citation-backed reports with credibility scoring and web … ☆124 Updated last month
- [ICLR2026] Test-Time Scaling with Reflective Generative Model ☆302 Updated last week
- ☆205 Updated 4 months ago
- ~950 line, minimal, extensible LLM inference engine built from scratch. ☆405 Updated 3 weeks ago
- ☆178 Updated 5 months ago
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python. ☆646 Updated this week
- Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model … ☆590 Updated 2 weeks ago
- A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes… ☆351 Updated this week
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆279 Updated 6 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆569 Updated 2 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆459 Updated 5 months ago
- InferX: Inference as a Service Platform ☆154 Updated this week
- Open-source CLI toolkit for low-RAM finetuning, quantization, and deployment of LLMs ☆92 Updated 6 months ago
- From-scratch implementation of OpenAI's GPT-OSS model in Python. No Torch, No GPUs. ☆108 Updated 3 months ago
- Open Source Local Data Analysis Assistant. ☆187 Updated 3 months ago
- Laddr is a python framework for building multi-agent systems where agents communicate, delegate tasks, and execute work in parallel. Thin… ☆337 Updated 2 months ago
- VLLM Port of the Chatterbox TTS model ☆364 Updated 3 months ago
- A tool to use the Ai2 Open Coding Agents Soft-Verified Efficient Repository Agents (SERA) model with Claude Code ☆108 Updated last week
- BUDDIE is the first full-stack open-source AI voice interaction solution, providing a complete end-to-end system from hardware design to … ☆286 Updated 5 months ago
- An open-source implementation of Whisper ☆477 Updated 3 months ago
- It takes a village to raise a child: Google DeepThink 🧠 but in LangGraph and free - an original algorithm for collaborative agents using… ☆135 Updated 2 weeks ago
- ☆162 Updated 3 months ago
- ☆385 Updated 3 months ago
- LLMRouter: An Open-Source Library for LLM Routing ☆1,209 Updated this week