WhitzardIndex / self-replication-research
A preprint version of our recent research on the capability of frontier AI systems to self-replicate
☆59Updated 4 months ago
Alternatives and similar repositories for self-replication-research:
Users who are interested in self-replication-research compare it to the repositories listed below
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆90Updated 3 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆28Updated 3 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation)☆64Updated 5 months ago
- Conduct in-depth research with AI-driven insights: DeepDive is a command-line tool that leverages web searches and AI models to generate…☆42Updated 7 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.☆25Updated 10 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆49Updated 2 months ago
- entropix style sampling + GUI☆25Updated 5 months ago
- MCP Server to run python code locally☆51Updated 4 months ago
- LLM reads a paper and produces a working prototype☆52Updated last week
- A tree-based prefix cache library that allows rapid creation of looms: hierarchal branching pathways of LLM generations.☆68Updated 2 months ago
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth…☆32Updated last month
- A Python library to orchestrate LLMs in a neural network-inspired structure☆46Updated 6 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration☆45Updated 2 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆22Updated 2 weeks ago
- ☆97Updated 6 months ago
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm…☆44Updated this week
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon.☆35Updated 2 months ago
- ☆29Updated 4 months ago
- ☆50Updated 5 months ago
- The original BabyAGI, updated with LiteLLM and no vector database reliance (csv instead)☆21Updated 6 months ago
- Letting Claude Code develop his own MCP tools :)☆99Updated last month
- ☆85Updated 7 months ago
- An automated tool for discovering insights from research paper corpora☆138Updated 10 months ago
- Clue-inspired puzzles for testing LLM deduction abilities☆33Updated last month
- ☆38Updated 8 months ago
- OpenPipe Reinforcement Learning Experiments☆22Updated last month
- Simple demo showing how to use the Forge API by Nous Research☆11Updated 5 months ago
- Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words☆80Updated this week
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆52Updated 2 weeks ago
- look how they massacred my boy☆63Updated 6 months ago