google / lmeval
☆222 Updated last month
Alternatives and similar repositories for lmeval
Users who are interested in lmeval are comparing it to the libraries listed below.
- Ranking LLMs on agentic tasks ☆176 Updated 2 weeks ago
- A Lightweight Library for AI Observability ☆249 Updated 5 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆217 Updated this week
- Simple UI for debugging correlations of text embeddings ☆288 Updated 2 months ago
- Tutorial for building LLM router ☆221 Updated last year
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆367 Updated this week
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆131 Updated 5 months ago
- ☆77 Updated 6 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 Updated 3 months ago
- ☆155 Updated 3 months ago
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆152 Updated 3 months ago
- Build datasets using natural language ☆505 Updated 2 months ago
- Train your own SOTA deductive reasoning model ☆103 Updated 4 months ago
- ☆73 Updated 5 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆222 Updated 3 weeks ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆133 Updated last month
- Solving data for LLMs - Create quality synthetic datasets! ☆150 Updated 6 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆99 Updated 3 months ago
- ☆78 Updated 9 months ago
- Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120) ☆148 Updated last month
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated last year
- ☆118 Updated 11 months ago
- ☆93 Updated 4 months ago
- Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems" ☆46 Updated 2 weeks ago
- ☆145 Updated last year
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆105 Updated 7 months ago
- ☆128 Updated 3 months ago
- ☆96 Updated 10 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆87 Updated last month
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆345 Updated last year