davidhershey / ClaudePlaysPokemonStarter
☆107 · Updated last month
Alternatives and similar repositories for ClaudePlaysPokemonStarter
Users who are interested in ClaudePlaysPokemonStarter are comparing it to the repositories listed below
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 · Updated 2 months ago
- smol models are fun too ☆92 · Updated 6 months ago
- smolLM with Entropix sampler on pytorch ☆151 · Updated 6 months ago
- Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words ☆85 · Updated last week
- Claude Deep Research config for Claude Code. ☆174 · Updated 2 months ago
- Verdict is a library for scaling judge-time compute. ☆211 · Updated 2 weeks ago
- ☆97 · Updated 7 months ago
- explore token trajectory trees on instruct and base models ☆106 · Updated this week
- A preprint version of our recent research on the capability of frontier AI systems to do self-replication ☆59 · Updated 5 months ago
- Coding problems used in aider's polyglot benchmark ☆115 · Updated 4 months ago
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆49 · Updated last week
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆75 · Updated 2 weeks ago
- Train your own SOTA deductive reasoning model ☆92 · Updated 2 months ago
- entropix style sampling + GUI ☆26 · Updated 6 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆64 · Updated 6 months ago
- ☆111 · Updated 4 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆97 · Updated 2 weeks ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆357 · Updated this week
- A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API ☆32 · Updated 4 months ago
- ☆148 · Updated 2 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 · Updated 4 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers. ☆151 · Updated last week
- ☆84 · Updated 2 weeks ago
- Draw more samples ☆189 · Updated 10 months ago
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆50 · Updated 6 months ago
- Letting Claude Code develop his own MCP tools :) ☆100 · Updated 2 months ago
- ☆64 · Updated last month
- Worker to orchestrate and manage running an arbitrary number of LLM-generated builds concurrently using containerized Minecraft Servers. ☆169 · Updated 5 months ago
- Open source interpretability artefacts for R1. ☆109 · Updated 3 weeks ago
- look how they massacred my boy ☆63 · Updated 7 months ago