aymeric-roucher / benchmark_agents
☆27Updated last year
Alternatives and similar repositories for benchmark_agents
Users that are interested in benchmark_agents are comparing it to the libraries listed below
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆77Updated 7 months ago
- ☆54Updated 3 months ago
- ☆38Updated 10 months ago
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker.☆48Updated last year
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…☆104Updated last year
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.☆65Updated 5 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎☆40Updated last year
- ☆16Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆78Updated 8 months ago
- Various installation guides for Large Language Models☆69Updated last month
- Reward Model framework for LLM RLHF☆61Updated last year
- LLM reads a paper and produces a working prototype☆57Updated last month
- GenAI Experimentation☆57Updated last month
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆97Updated 7 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.☆110Updated 8 months ago
- ☆20Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization☆63Updated last year
- ☆46Updated 8 months ago
- ☆92Updated 2 months ago
- Writing Blog Posts with Generative Feedback Loops!☆48Updated last year
- Synthetic Text Dataset Generation for LLM projects☆28Updated last week
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆43Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated 3 weeks ago
- ☆45Updated last year
- PyTorch implementation for MRL☆18Updated last year
- ☆42Updated last year
- Chunk your text using gpt4o-mini more accurately☆44Updated 10 months ago
- ☆49Updated 7 months ago
- Examples of RAG using LangChain with local LLMs - Mixtral 8x7B, Llama 2, Mistral 7B, Orca 2, Phi-2, Neural 7B☆39Updated last year