AMD-AGI / Instella · Links
Fully Open Language Models with Stellar Performance
☆246 · Updated 2 months ago
Alternatives and similar repositories for Instella
Users interested in Instella are comparing it to the libraries listed below.
- ☆189 · Updated last year
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, and Kokoro-TTS over OpenAI-compatible endpoints. ☆211 · Updated this week
- Docs for GGUF quantization (unofficial) ☆275 · Updated 2 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆117 · Updated 3 weeks ago
- ☆53 · Updated last week
- Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model … ☆424 · Updated this week
- LocalScore is an open benchmark which helps you understand how well your computer can handle local AI tasks. ☆60 · Updated last month
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆181 · Updated last week
- Pivotal Token Search ☆126 · Updated 2 months ago
- Sparse Inferencing for transformer-based LLMs ☆201 · Updated last month
- A platform to self-host AI on easy mode ☆170 · Updated this week
- 1.58 Bit LLM on Apple Silicon using MLX ☆223 · Updated last year
- No-code CLI designed for accelerating ONNX workflows ☆215 · Updated 3 months ago
- ☆440 · Updated last month
- PyTorch implementation of models from the Zamba2 series. ☆185 · Updated 8 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆296 · Updated last month
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆244 · Updated 4 months ago
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. ☆79 · Updated this week
- A companion toolkit to pico-train for quantifying, comparing, and visualizing how language models evolve during training. ☆108 · Updated 6 months ago
- ☆232 · Updated 3 months ago
- ☆196 · Updated 5 months ago
- Everything you need to know about LLM inference ☆237 · Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆322 · Updated 11 months ago
- Let LLMs control embedded devices via the Model Context Protocol. ☆146 · Updated 3 months ago
- Massive Multimodal Open RAG & Extraction: a scalable multimodal pipeline for processing, indexing, and querying multimodal documents. Eve… ☆141 · Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated last week
- InferX: Inference as a Service Platform ☆136 · Updated this week
- Live-bending a foundation model’s output at neural network level. ☆265 · Updated 6 months ago
- Run LLM Agents on Ryzen AI PCs in Minutes ☆639 · Updated last week
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆50 · Updated 4 months ago