aymeric-roucher / benchmark_agents
☆27 · Updated last year
Alternatives and similar repositories for benchmark_agents:
Users who are interested in benchmark_agents are comparing it to the libraries listed below:
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute … ☆49 · Updated 9 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆76 · Updated 6 months ago
- ☆48 · Updated 5 months ago
- ☆16 · Updated last year
- ☆24 · Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 · Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆61 · Updated last year
- PyTorch implementation for MRL ☆18 · Updated last year
- ☆52 · Updated 2 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆63 · Updated 8 months ago
- Reward Model framework for LLM RLHF ☆61 · Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆42 · Updated last year
- Question Answer Generation App using Mistral 7B, Langchain, and FastAPI. ☆65 · Updated last year
- ☆20 · Updated last year
- ☆84 · Updated last year
- GenAI Experimentation ☆58 · Updated this week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 · Updated 5 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 · Updated last year
- Examples of RAG using LangChain with local LLMs - Mixtral 8x7B, Llama 2, Mistral 7B, Orca 2, Phi-2, Neural 7B ☆38 · Updated last year
- Testing speed and accuracy of RAG with and without Cross Encoder Reranker. ☆48 · Updated last year
- ☆75 · Updated last year
- Chunk your text more accurately using gpt4o-mini ☆44 · Updated 8 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 · Updated 4 months ago
- Scripts, notebooks, and articles about data science in general. ☆47 · Updated last year
- ☆20 · Updated 3 years ago
- A collection of hands-on notebooks for LLM practitioners ☆47 · Updated 3 months ago
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆101 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 · Updated 4 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 · Updated last year
- Set of scripts to finetune LLMs ☆37 · Updated last year