Vision-Empower / Kimi-K2-Mini
A miniaturized version of the Kimi-K2 model optimized for deployment on single H100 GPUs.
☆36 Updated 6 months ago
Alternatives and similar repositories for Kimi-K2-Mini
Users who are interested in Kimi-K2-Mini are comparing it to the repositories listed below.
- Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression" ☆561 Updated 3 months ago
- ☆34 Updated 10 months ago
- Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://hugging… ☆188 Updated last year
- [DAI 2025] Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing ☆201 Updated 2 months ago
- ☆45 Updated last year
- Pivotal Token Search ☆144 Updated last month
- ☆131 Updated 9 months ago
- Fused Qwen3 MoE layer for faster training, compatible with Transformers, LoRA, bnb 4-bit quant, Unsloth. Also possible to train LoRA over… ☆231 Updated last week
- Official code repository for Sketch-of-Thought (SoT) ☆135 Updated 9 months ago
- GRadient-INformed MoE ☆264 Updated last year
- Running Microsoft's BitNet via Electron, React & Astro ☆52 Updated 4 months ago
- Clue inspired puzzles for testing LLM deduction abilities ☆45 Updated 10 months ago
- Multi-Granularity LLM Debugger [ICSE2026] ☆96 Updated 7 months ago
- REAP: Router-weighted Expert Activation Pruning for SMoE compression ☆232 Updated 2 months ago
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement ☆158 Updated 4 months ago
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆49 Updated 3 months ago
- Easy-to-use, high-performance knowledge distillation for LLMs ☆97 Updated 9 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆259 Updated last month
- OpenTinker is an RL-as-a-Service infrastructure for foundation models ☆625 Updated 2 weeks ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆140 Updated 5 months ago
- Coding problems used in aider's polyglot benchmark ☆199 Updated last year
- [ICLR'26] The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models" ☆341 Updated last week
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆85 Updated 2 months ago
- ☆159 Updated last year
- Prompt-to-Leaderboard ☆271 Updated 9 months ago
- ☆100 Updated last week
- A tool to use the Ai2 Open Coding Agents Soft-Verified Efficient Repository Agents (SERA) model with Claude Code ☆204 Updated last week
- OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis ☆256 Updated this week
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling ☆469 Updated 8 months ago
- A clean, modular SDK for building AI agents with OpenHands V1. ☆491 Updated this week