allenai / codenav
CodeNav is an LLM agent that navigates and leverages previously unseen code repositories to solve user queries.
☆43 Updated 8 months ago
Alternatives and similar repositories for codenav:
Users who are interested in codenav are comparing it to the repositories listed below:
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆54 Updated 5 months ago
- ☆40 Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 8 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆20 Updated last month
- The first dense retrieval model that can be prompted like an LM ☆71 Updated 7 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 6 months ago
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆47 Updated 9 months ago
- ☆18 Updated 7 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 Updated 4 months ago
- ☆48 Updated 6 months ago
- ☆13 Updated 4 months ago
- ☆61 Updated 9 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 Updated 4 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆29 Updated last month
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆90 Updated 3 months ago
- ☆26 Updated last month
- ☆25 Updated 7 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 5 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆53 Updated last month
- Official repo for Learning to Reason for Long-Form Story Generation ☆44 Updated 2 weeks ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆35 Updated last week
- ☆37 Updated 2 years ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆22 Updated 7 months ago
- Lego for GRPO ☆27 Updated last month
- ☆50 Updated 5 months ago
- ☆82 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆77 Updated 7 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆31 Updated this week