google / lmeval
☆231 Updated 2 months ago
Alternatives and similar repositories for lmeval
Users who are interested in lmeval are comparing it to the libraries listed below.
- Ranking LLMs on agentic tasks ☆182 Updated last week
- Simple UI for debugging correlations of text embeddings ☆291 Updated 3 months ago
- A Lightweight Library for AI Observability ☆251 Updated 6 months ago
- Tutorial for building LLM router ☆226 Updated last year
- Routing on Random Forest (RoRF) ☆203 Updated 11 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆391 Updated last week
- ☆155 Updated 4 months ago
- Train your own SOTA deductive reasoning model ☆106 Updated 6 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 Updated 5 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆225 Updated this week
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆135 Updated 3 weeks ago
- ☆81 Updated 10 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆136 Updated 6 months ago
- ☆76 Updated 8 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆102 Updated 4 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆437 Updated 3 weeks ago
- Build datasets using natural language ☆527 Updated 4 months ago
- ☆76 Updated 6 months ago
- Together Open Deep Research ☆344 Updated 4 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆370 Updated 2 weeks ago
- ☆262 Updated 2 months ago
- Training setup for Langchain's Open Deep Research ☆46 Updated 2 weeks ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆78 Updated 10 months ago
- Code that accompanies the public release of the paper Lost in Conversation (https://arxiv.org/abs/2505.06120) ☆158 Updated 2 months ago
- Train embedding and reranker models for retrieval tasks on Apple Silicon with MLX ☆153 Updated this week
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆116 Updated last month
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆236 Updated last month
- ☆135 Updated 3 weeks ago
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆186 Updated 4 months ago
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆163 Updated 4 months ago