☆123Jan 19, 2026Updated 4 months ago
Alternatives and similar repositories for public-tasks
Users that are interested in public-tasks are comparing it to the libraries listed below.
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research.☆137May 18, 2026Updated last week
- METR Task Standard☆180Feb 3, 2025Updated last year
- ☆34Jun 4, 2025Updated 11 months ago
- ☆137Oct 16, 2025Updated 7 months ago
- ☆14Jul 12, 2024Updated last year
- ☆73May 19, 2026Updated last week
- Work in progress! I don't recommend looking at the code right now.☆24May 18, 2026Updated last week
- Situational Awareness Dataset☆50Dec 14, 2024Updated last year
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons☆13Feb 13, 2023Updated 3 years ago
- 📚📚📚📚📚📚📚📚📚 Reading everything☆16Mar 11, 2026Updated 2 months ago
- ☆150Jul 23, 2025Updated 10 months ago
- Inspect: A framework for large language model evaluations☆2,096Updated this week
- Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact and break out of shell environments using the Over…☆13Aug 21, 2023Updated 2 years ago
- Repo for the paper on Escalation Risks of AI systems☆44Apr 12, 2024Updated 2 years ago
- smolLM with Entropix sampler on pytorch☆149Oct 31, 2024Updated last year
- Training GPTs to solve interaction nets☆18Aug 14, 2024Updated last year
- ☆22Sep 9, 2021Updated 4 years ago
- Inverse Constitutional AI [ICLR 2025]: compressing pairwise preference data into a short constitution of principles.☆41May 6, 2026Updated 2 weeks ago
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- Collection of evals for Inspect AI☆498May 18, 2026Updated last week
- 🧠 Inspecting complexity and goal-directedness of imagination in an fNIRS BCI system.☆11Aug 26, 2023Updated 2 years ago
- ☆389Jul 2, 2024Updated last year
- Experimental LLM interface exploring new ways to use AI to improve human thinking☆20Apr 13, 2026Updated last month
- Karpathy's llama2.c transpiled to MLX for Apple Silicon☆14Dec 28, 2023Updated 2 years ago
- ☆25May 23, 2025Updated last year
- A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API☆31Dec 27, 2024Updated last year
- Mamba support for transformer lens☆20Sep 17, 2024Updated last year
- ☆47Dec 2, 2025Updated 5 months ago
- ☆18Feb 25, 2025Updated last year
- Sparse Autoencoder Training Library☆57May 1, 2025Updated last year
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.☆244Aug 11, 2025Updated 9 months ago
- Mechanistic Interpretability Visualizations using React☆348Apr 30, 2026Updated 3 weeks ago
- Replicating O1 inference-time scaling laws☆93Dec 1, 2024Updated last year
- (Model-written) LLM evals library☆18Dec 13, 2024Updated last year
- A Primer for Decentralized Identifiers☆10Nov 11, 2021Updated 4 years ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.☆20Sep 24, 2025Updated 8 months ago
- Aidan Bench attempts to measure <big_model_smell> in LLMs.☆319Jun 26, 2025Updated 11 months ago
- Shaping capabilities with token-level pretraining data filtering☆93Jan 28, 2026Updated 3 months ago
- Representation Engineering: A Top-Down Approach to AI Transparency☆994Aug 14, 2024Updated last year