AMD-AGI / Instella
Fully Open Language Models with Stellar Performance
☆318 · Updated 2 months ago
Alternatives and similar repositories for Instella
Users interested in Instella are comparing it to the libraries listed below.
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆100 · Updated this week
- Pivotal Token Search ☆145 · Updated last month
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆259 · Updated last month
- ☆170 · Updated 2 months ago
- Code for Bolmo: Byteifying the Next Generation of Language Models ☆117 · Updated last month
- Simple & Scalable Pretraining for Neural Architecture Research ☆308 · Updated 2 months ago
- ☆191 · Updated last year
- Welcome to the official repository of SINQ, a novel, fast, and high-quality quantization method designed to make any Large Language Model … ☆595 · Updated this week
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆263 · Updated 8 months ago
- 1.58-bit LLM on Apple Silicon using MLX ☆243 · Updated last year
- A tool to use the Ai2 Open Coding Agents Soft-Verified Efficient Repository Agents (SERA) model with Claude Code ☆220 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆288 · Updated this week
- ☆1,289 · Updated 2 months ago
- ☆724 · Updated 2 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆166 · Updated 5 months ago
- ☆219 · Updated last year
- Docs for GGUF quantization (unofficial) ☆366 · Updated 6 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆573 · Updated 2 months ago
- PyTorch implementation of models from the Zamba2 series ☆186 · Updated last year
- Train, tune, and run inference with the Bamba model ☆137 · Updated 8 months ago
- ☆867 · Updated 4 months ago
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆112 · Updated 8 months ago
- ☆238 · Updated 2 months ago
- Sparse inferencing for transformer-based LLMs ☆217 · Updated 6 months ago
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆604 · Updated 3 weeks ago
- ☆466 · Updated 2 months ago
- ☆270 · Updated 7 months ago
- GRadient-INformed MoE ☆264 · Updated last year
- 👷 Build compute kernels ☆215 · Updated 2 weeks ago
- Massive Multimodal Open RAG & Extraction: a scalable multimodal pipeline for processing, indexing, and querying multimodal documents. Eve… ☆186 · Updated 2 weeks ago