google / lmeval
☆235Updated 3 weeks ago
Alternatives and similar repositories for lmeval
Users that are interested in lmeval are comparing it to the libraries listed below
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆246Updated this week
- Simple UI for debugging correlations of text embeddings☆305Updated 6 months ago
- Ranking LLMs on agentic tasks☆204Updated last month
- Benchmark and optimize LLM inference across frameworks with ease☆150Updated 3 months ago
- A Lightweight Library for AI Observability☆252Updated 10 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research☆272Updated last week
- A clean, modular SDK for building AI agents with OpenHands V1.☆360Updated this week
- A simple tool that lets you explore different possible paths that an LLM might sample.☆196Updated 7 months ago
- Routing on Random Forest (RoRF)☆235Updated last year
- Training setup for Langchain's Open Deep Research☆73Updated 3 months ago
- Tutorial for building LLM router☆239Updated last year
- Super basic implementation (gist-like) of RLMs with REPL environments.☆286Updated 2 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models☆485Updated 4 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244)☆451Updated 3 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 8 months ago
- Source code for the collaborative reasoner research project at Meta FAIR.☆111Updated 8 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆419Updated last week
- ☆159Updated 8 months ago
- ☆79Updated 2 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM.☆139Updated 4 months ago
- Codebase for FinePDFs☆156Updated last month
- ☆686Updated this week
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.☆254Updated 4 months ago
- Real-Time Detection of Hallucinated Entities in Long-Form Generation☆273Updated last month
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling☆504Updated 2 weeks ago
- ☆81Updated last month
- Train your own SOTA deductive reasoning model☆107Updated 9 months ago
- Complex Function Calling Benchmark.☆159Updated 11 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models☆276Updated 5 months ago
- ☆57Updated 10 months ago