Just-Curieous / Curie
❓Curie: Automated and Rigorous Scientific Experimentation with AI Agents
☆152Updated this week
Alternatives and similar repositories for Curie
Users who are interested in Curie are comparing it to the libraries listed below.
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆89Updated 3 weeks ago
- ☆67Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆173Updated 2 months ago
- A lightweight framework for building research agents designed for developers☆94Updated this week
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆105Updated last month
- Scaling Data for SWE-agents☆220Updated this week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆60Updated last week
- Accompanying material for the sleep-time compute paper☆90Updated last month
- Simple extension on vLLM to help you speed up reasoning models without training.☆152Updated this week
- A benchmark for LLMs on complicated tasks in the terminal☆141Updated this week
- Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower exe…☆236Updated 3 weeks ago
- ☆47Updated 3 weeks ago
- [ACL'25 Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.☆29Updated 3 weeks ago
- LLM reads a paper and produces a working prototype☆57Updated last month
- Samples of good AI generated CUDA kernels☆65Updated last week
- Challenges for general-purpose web-browsing AI agents☆58Updated last week
- Tina: Tiny Reasoning Models via LoRA☆245Updated last week
- Train your own SOTA deductive reasoning model☆92Updated 3 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆207Updated last month
- ☆191Updated 2 weeks ago
- ☆59Updated 2 weeks ago
- ☆99Updated last week
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.☆119Updated this week
- Repository for Zochi's Research☆150Updated last week
- ☆145Updated last month
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆90Updated 2 months ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more.☆204Updated last week
- Lightweight toolkit package to train and fine-tune 1.58-bit language models☆69Updated 2 weeks ago
- ☆93Updated last week
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)☆105Updated this week