microsoft / CoML
Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.
☆74Updated 4 months ago
Related projects:
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents☆102Updated 3 months ago
- Codebase accompanying the Summary of a Haystack paper.☆65Updated 2 months ago
- Official repo of Respond-and-Respond: data, code, and evaluation☆92Updated last month
- [ICML 2024 Oral] A framework for society simulation that supports complex setups such as multi-scene simulation.☆39Updated last month
- EcoAssistant: using LLM assistant more affordably and accurately☆127Updated 2 months ago
- The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?"☆30Updated last month
- A benchmark for evaluating learning agents based on just language feedback☆50Updated last month
- A re-implementation of Meta-Prompt in LangChain for building self-improving agents.☆57Updated last year
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.☆28Updated this week
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆82Updated 2 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆55Updated last week
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆73Updated 2 months ago
- [ICLR 2024] Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding☆138Updated 6 months ago
- Flow of Reasoning: Efficient Training of LLM Policy with Diverse Thinking☆25Updated this week
- Official implementation of paper "Meta Prompting for AI Systems" (https://arxiv.org/abs/2311.11482)☆75Updated this week
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆41Updated 2 months ago
- A compilation of the best multi-agent papers☆169Updated this week
- Beating the GAIA benchmark with Transformers Agents. 🚀☆56Updated 2 weeks ago
- An Analytical Evaluation Board of Multi-turn LLM Agents☆227Updated 4 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆70Updated last month
- Gentopia Agent Zoo and Agent Benchmark☆27Updated last year
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆81Updated last month
- Fun project to run your own LLM chat bot using llama.cpp☆11Updated last year
- Evaluation and analysis code for LLM360☆75Updated 3 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆84Updated 11 months ago
- Benchmarks, environments, and toolkits for general computer agents☆154Updated this week
- Evaluating LLMs with CommonGen-Lite☆83Updated 6 months ago