google / lmeval
☆234 · Updated 4 months ago
Alternatives and similar repositories for lmeval
Users who are interested in lmeval are comparing it to the libraries listed below.
- Simple UI for debugging correlations of text embeddings ☆299 · Updated 5 months ago
- Ranking LLMs on agentic tasks ☆199 · Updated 2 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆259 · Updated this week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 · Updated 7 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆131 · Updated 2 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆248 · Updated 3 weeks ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆477 · Updated 2 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆138 · Updated 8 months ago
- ☆158 · Updated 6 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆410 · Updated last month
- ☆146 · Updated last year
- Tutorial for building LLM router ☆235 · Updated last year
- ☆79 · Updated last month
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆447 · Updated 2 months ago
- ☆84 · Updated last year
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆136 · Updated 2 months ago
- An Automatic Prompt Optimization Framework for Large Language Models ☆137 · Updated 3 months ago
- ☆79 · Updated last week
- Train your own SOTA deductive reasoning model ☆108 · Updated 8 months ago
- Routing on Random Forest (RoRF) ☆220 · Updated last year
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆249 · Updated 3 months ago
- A Lightweight Library for AI Observability ☆251 · Updated 8 months ago
- Training setup for Langchain's Open Deep Research ☆70 · Updated 2 months ago
- Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120) ☆178 · Updated 4 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆105 · Updated 6 months ago
- Codebase for FinePDFs ☆135 · Updated last week
- ☆138 · Updated 2 months ago
- Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems" ☆53 · Updated 3 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆488 · Updated this week
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆200 · Updated 2 months ago