gautierdag/plancraft

Plancraft is a Minecraft environment and agent suite for testing the planning capabilities of LLMs.

☆30

Alternatives and similar repositories for plancraft

Users who are interested in plancraft are comparing it to the libraries listed below.

assistive-autonomy / assistax
Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics
☆17 · Jul 16, 2026 · Updated last week
franciellevargas / FactNews
FactNews is the first dataset to predict sentence-level factuality of news reporting. Furthermore, we provide baseline results for sentenc…
☆12 · Jun 12, 2025 · Updated last year
Tinycompany-AI / tokenadapt
0-Shot Tokenizer Transplant
☆14 · May 16, 2025 · Updated last year
Tinycompany-AI / SuperTokenizer
Multi-Word Probabilistic based supertokenizer
☆15 · May 15, 2025 · Updated last year
zhourunlong / Reflect-RL
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
☆18 · Jul 19, 2025 · Updated last year
gautierdag / tokenizer-bench
Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"
☆22 · Feb 14, 2024 · Updated 2 years ago
psoulos / role-decomposition
☆11 · Feb 11, 2020 · Updated 6 years ago
bryanoliveira / sliding-puzzles-gym
A scalable benchmark for state representation learning in visual reinforcement learning.
☆17 · Jun 23, 2025 · Updated last year
Aaron617 / text2world
[ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
☆29 · Feb 25, 2025 · Updated last year
openreview / openreview
The most recent documentation of OpenReview
☆34 · Updated this week
Aleph-Alpha-Research / trigrams
☆60 · Nov 18, 2025 · Updated 8 months ago
safety-research / SHADE-Arena
☆26 · Jun 22, 2025 · Updated last year
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆40 · Nov 11, 2024 · Updated last year
cxcscmu / General-AgentBench
Benchmark Test-Time Scaling of General LLM Agents
☆20 · Apr 14, 2026 · Updated 3 months ago
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆18 · Dec 19, 2024 · Updated last year
eduagarcia / lm-evaluation-harness-pt
The evaluation suite for the 🚀 Open Portuguese LLM Leaderboard
☆25 · Aug 31, 2025 · Updated 10 months ago
ServiceNow / drbench
An enterprise deep research benchmark
☆40 · Apr 22, 2026 · Updated 3 months ago
stellalisy / PrefPalette
☆21 · Apr 3, 2026 · Updated 3 months ago
shoaibahmed / llm_depth_pruning
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15 · Jul 24, 2024 · Updated 2 years ago
WangWenhao0716 / PDF-Embedding
[NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
☆18 · Oct 1, 2024 · Updated last year
Harry67Hu / CORY
Official implementation of the NeurIPS 2024 paper CORY
☆33 · Mar 4, 2026 · Updated 4 months ago
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 · Aug 16, 2023 · Updated 2 years ago
aldro61 / PaperAtlas
☆24 · Dec 21, 2025 · Updated 7 months ago
neulab / ToM-Language-Acquisition
Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind".
☆15 · Apr 27, 2023 · Updated 3 years ago
hkust-nlp / model-task-align-rl
[ICLR 2026] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆18 · Feb 9, 2026 · Updated 5 months ago
HKUST-KnowComp / ActPlan-1K
☆11 · Oct 7, 2024 · Updated last year
michaelsaxon / CoCoCroLa
The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models
☆12 · Oct 28, 2024 · Updated last year
mathllm / MathCoder2
☆71 · Oct 16, 2024 · Updated last year
linkedin / ControlLLM
Control LLM
☆23 · Apr 6, 2025 · Updated last year
CPF-NLPR / ULGN4DocEFI
☆10 · Nov 14, 2021 · Updated 4 years ago
eliasgoldsztejn95 / PTDRL
Hospital simulator with pedestrians and robot
☆15 · Oct 20, 2024 · Updated last year
zjunlp / OneEdit
OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System.
☆20 · Oct 14, 2024 · Updated last year
liushulinle / MarsRL
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
☆18 · Nov 18, 2025 · Updated 8 months ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆30 · Aug 9, 2025 · Updated 11 months ago
zjunlp / WorfBench
[ICLR 2025] Benchmarking Agentic Workflow Generation
☆155 · Feb 19, 2025 · Updated last year
NikTechT / drone_control
Hand Gesture Controlled Tello Drone using Python and OpenCV 2021
☆12 · Jun 6, 2022 · Updated 4 years ago
liuzuxin / Bullet-Safety-Gym
An open-source framework to benchmark and assess safety specifications of Reinforcement Learning problems.
☆14 · Aug 25, 2023 · Updated 2 years ago
Qichuzyy / POA
Official implementation of ECCV24 paper: POA
☆24 · Aug 8, 2024 · Updated last year
YJiangcm / BMC
[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
☆12 · Jan 26, 2025 · Updated last year