Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆99 · Updated 11 months ago
Alternatives and similar repositories for LLMTest_NeedleInAHaystack:
Users who are interested in LLMTest_NeedleInAHaystack are comparing it to the repositories listed below:
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 7 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 · Updated last year
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full finetunes. ☆82 · Updated last year
- A framework for evaluating function calls made by LLMs ☆37 · Updated 8 months ago
- ☆76 · Updated 9 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆102 · Updated 3 months ago
- Track the progress of LLM context utilisation ☆53 · Updated 8 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 · Updated last year
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆224 · Updated 10 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated last year
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆98 · Updated 7 months ago
- ☆150 · Updated 3 months ago
- ☆48 · Updated last year
- ☆92 · Updated last year
- Simple examples using Argilla tools to build AI ☆53 · Updated 4 months ago
- Logging and caching superpowers for the OpenAI SDK ☆103 · Updated last year
- Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query. ☆166 · Updated 10 months ago
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker. ☆48 · Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆75 · Updated 5 months ago
- Using various instructor clients to evaluate the quality and capabilities of extractions and reasoning. ☆50 · Updated 5 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆59 · Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆75 · Updated 6 months ago
- ☆73 · Updated last year
- ☆142 · Updated 8 months ago
- Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB ☆119 · Updated last year
- ☆87 · Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆229 · Updated last month
- Function Calling Benchmark & Testing ☆84 · Updated 8 months ago
- ☆77 · Updated 9 months ago