jamesmurdza / agenteval
Automated testing and benchmarking for code generation agents.
☆17Updated last year
Related projects:
- Using multiple LLMs for ensemble forecasting☆17Updated 8 months ago
- Track the progress of LLM context utilisation☆53Updated 2 months ago
- A version of BabyAGI using DSPy and typed predictors☆17Updated 6 months ago
- Writing Blog Posts with Generative Feedback Loops!☆41Updated 6 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆20Updated 7 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems.☆48Updated 3 weeks ago
- Official homepage for "Self-Harmonized Chain of Thought"☆45Updated this week
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r…☆53Updated 2 months ago
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain☆44Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆58Updated 2 weeks ago
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci…☆23Updated 10 months ago
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil…☆26Updated 9 months ago
- Evaluating LLMs with CommonGen-Lite☆83Updated 6 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments☆49Updated 3 weeks ago
- Simple Graph Memory for AI applications☆76Updated last month
- Small and Efficient Mathematical Reasoning LLMs☆69Updated 7 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎☆24Updated 6 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆33Updated 6 months ago
- An LLM reads a paper and produces a working prototype☆19Updated this week
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆62Updated 2 months ago
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths☆35Updated 9 months ago
- A streamlit app for visualizing LLM evals.☆38Updated 8 months ago