agiresearch / Formal-LLM
Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
☆122 Updated 9 months ago
Alternatives and similar repositories for Formal-LLM:
Users who are interested in Formal-LLM are comparing it to the repositories listed below.
- ☆154 Updated 7 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆167 Updated last month
- ☆115 Updated 8 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆102 Updated 3 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆300 Updated 10 months ago
- AWM: Agent Workflow Memory ☆253 Updated 2 months ago
- ☆81 Updated last year
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆137 Updated last year
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models" ☆75 Updated last year
- My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models" ☆98 Updated last year
- r2e: turn any github repository into a programming agent environment ☆108 Updated last month
- An implementation of Everything of Thoughts (XoT). ☆141 Updated last year
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆83 Updated 2 weeks ago
- ☆177 Updated 2 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 Updated last year
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization ☆135 Updated 10 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆84 Updated last week
- Open Implementations of LLM Analyses ☆102 Updated 6 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 10 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆212 Updated 11 months ago
- ☆39 Updated 8 months ago
- Functional Benchmarks and the Reasoning Gap ☆84 Updated 6 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆107 Updated 5 months ago
- ☆121 Updated 10 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆64 Updated 7 months ago
- ☆81 Updated last month
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆108 Updated last month
- From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging ☆72 Updated 2 weeks ago
- Enhancing AI Software Engineering with Repository-level Code Graph ☆151 Updated last week
- ☆111 Updated last month