MoonshotAI / K2-Vendor-Verifier
Verify the precision of all Kimi K2 API vendors
☆340 Updated 2 weeks ago
Alternatives and similar repositories for K2-Vendor-Verifier
Users who are interested in K2-Vendor-Verifier are comparing it to the libraries listed below.
- The building blocks of AI agents. ☆47 Updated this week
- ☆283 Updated last week
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆212 Updated this week
- Train Large Language Models on MLX. ☆205 Updated last month
- Coding problems used in aider's polyglot benchmark ☆187 Updated 10 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆242 Updated 3 weeks ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆195 Updated 2 weeks ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆352 Updated last week
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆190 Updated 6 months ago
- ☆135 Updated 6 months ago
- LLMProc: Unix-inspired runtime that treats LLMs as processes. ☆34 Updated 3 months ago
- Prompt-to-Leaderboard ☆260 Updated 6 months ago
- Run AI generated code in isolated sandboxes ☆117 Updated 9 months ago
- ☆453 Updated 2 weeks ago
- Routing on Random Forest (RoRF) ☆218 Updated last year
- ☆68 Updated 5 months ago
- ☆231 Updated 4 months ago
- CursorCore: Assist Programming through Aligning Anything ☆131 Updated 8 months ago
- Pivotal Token Search ☆131 Updated 3 months ago
- ☆107 Updated last week
- Train your own SOTA deductive reasoning model ☆109 Updated 8 months ago
- ☆158 Updated 6 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆130 Updated 2 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆230 Updated 3 months ago
- llmbasedos: Local-First OS Where Your AI Agents Wake Up and Work ☆277 Updated 2 months ago
- ☆176 Updated 2 months ago
- Letting Claude Code develop his own MCP tools :) ☆123 Updated 8 months ago
- Distributed Inference for MLX LLM ☆97 Updated last year
- AI benchmark runtime framework that allows you to integrate and evaluate AI tasks using Docker-based benchmarks. ☆164 Updated 5 months ago
- Sparse Inferencing for transformer based LLMs ☆201 Updated 3 months ago