AMD-AGI / Instella
Fully Open Language Models with Stellar Performance
☆248 Updated 2 months ago
Alternatives and similar repositories for Instella
Users who are interested in Instella are comparing it to the repositories listed below.
- ☆189 Updated last year
- ☆93 Updated 3 weeks ago
- Massive Multimodal Open RAG & Extraction A scalable multimodal pipeline for processing, indexing, and querying multimodal documents Eve… ☆155 Updated this week
- Benchmark and optimize LLM inference across frameworks with ease ☆124 Updated last month
- Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model … ☆555 Updated this week
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints. ☆226 Updated this week
- Sparse Inferencing for transformer based LLMs ☆201 Updated 2 months ago
- Pivotal Token Search ☆130 Updated 3 months ago
- InferX: Inference as a Service Platform ☆137 Updated this week
- Docs for GGUF quantization (unofficial) ☆293 Updated 3 months ago
- MiniMax-M2, a Mini model built for Max coding & agentic workflows. ☆656 Updated this week
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. ☆86 Updated this week
- No-code CLI designed for accelerating ONNX workflows ☆215 Updated 4 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆297 Updated 2 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆225 Updated last year
- A platform to self-host AI on easy mode ☆171 Updated last week
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆140 Updated 3 months ago
- ☆449 Updated this week
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆202 Updated last week
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆95 Updated 5 months ago
- GRadient-INformed MoE ☆264 Updated last year
- ☆300 Updated 2 months ago
- Live-bending a foundation model’s output at neural network level. ☆269 Updated 6 months ago
- Source code for Intel's Polite Guard NLP project ☆37 Updated 2 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆325 Updated last year
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆249 Updated 5 months ago
- Everything you need to know about LLM inference ☆237 Updated this week
- LocalScore is an open benchmark which helps you understand how well your computer can handle local AI tasks. ☆65 Updated last month
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆81 Updated last week
- PyTorch implementation of models from the Zamba2 series. ☆185 Updated 9 months ago