garg-ankush / scipe
SCIPE is a powerful tool for evaluating and diagnosing LLM (Large Language Model) graphs or chains.
☆16 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for scipe
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ☆58 Updated 7 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆57 Updated 4 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆43 Updated 8 months ago
- Example code using the DSPy framework. ☆18 Updated 5 months ago
- A specification for OpenInference, a semantic mapping of ML inferences ☆45 Updated 7 months ago
- Self-host LLMs with vLLM and BentoML ☆74 Updated last week
- Dynamic Metadata based RAG Framework ☆71 Updated 3 months ago
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation ☆94 Updated this week
- ☆18 Updated this week
- Creating Generative AI Apps which work ☆16 Updated 4 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆25 Updated 10 months ago
- Leverage your LangChain trace data for fine tuning ☆38 Updated 3 months ago
- A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and Llama. ☆111 Updated last month
- ☆57 Updated last year
- Natural Language Interfaces Powered by LLMs ☆91 Updated 3 months ago
- Generate Tools and Toolkits from any Python SDK -- no extra code required ☆49 Updated 2 weeks ago
- ☆31 Updated 8 months ago
- Using various instructor clients to evaluate the quality and capabilities of extractions and reasoning. ☆47 Updated last month
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆63 Updated last month
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 4 months ago
- Simple Graph Memory for AI applications ☆79 Updated 3 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆28 Updated 9 months ago
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆50 Updated last month
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker. ☆47 Updated 10 months ago
- Verbosity control for AI agents ☆59 Updated 5 months ago
- ☆37 Updated 11 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆74 Updated 2 months ago
- A framework for evaluating function calls made by LLMs ☆35 Updated 3 months ago
- Doing simple retrieval from LLMs at various context lengths to measure accuracy ☆97 Updated 7 months ago