huggingface / yourbench
🤗 Benchmark Large Language Models Reliably On Your Data
(⭐ 426, updated last month)
Alternatives and similar repositories for yourbench
Users who are interested in yourbench are comparing it to the libraries listed below.
- Build datasets using natural language (⭐ 566, updated 4 months ago)
- awesome synthetic (text) datasets (⭐ 321, updated 3 weeks ago)
- Simple UI for debugging correlations of text embeddings (⭐ 305, updated 8 months ago)
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 (⭐ 352, updated 8 months ago)
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models (⭐ 496, updated 5 months ago)
- Banishing LLM Hallucinations Requires Rethinking Generalization (⭐ 277, updated last year)
- An Open Source Toolkit For LLM Distillation (⭐ 859, updated last month)
- Attribute (or cite) statements generated by LLMs back to in-context information. (⭐ 319, updated last year)
- Automatically evaluate your LLMs in Google Colab (⭐ 685, updated last year)
- (⭐ 162, updated last year)
- A Lightweight Library for AI Observability (⭐ 255, updated 11 months ago)
- (⭐ 141, updated 5 months ago)
- A small library of LLM judges (⭐ 319, updated 6 months ago)
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. (⭐ 259, updated 2 weeks ago)
- (⭐ 237, updated 2 months ago)
- An open-source tool for LLM prompt optimization. (⭐ 759, updated last week)
- (⭐ 120, updated last year)
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… (⭐ 249, updated last year)
- Late Interaction Models Training & Retrieval (⭐ 694, updated 3 weeks ago)
- Fast Multimodal Semantic Deduplication & Filtering (⭐ 882, updated 2 weeks ago)
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… (⭐ 184, updated last year)
- Let's build better datasets, together! (⭐ 269, updated last year)
- (⭐ 696, updated 9 months ago)
- Automatically annotate papers using LLMs (⭐ 401, updated 2 months ago)
- Code for explaining and evaluating late chunking (chunked pooling) (⭐ 487, updated last year)
- A compact LLM pretrained in 9 days by using high quality data (⭐ 339, updated 9 months ago)
- Toolkit for attaching, training, saving and loading of new heads for transformer models (⭐ 294, updated 11 months ago)
- (⭐ 159, updated 9 months ago)
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs (⭐ 314, updated 6 months ago)
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle (⭐ 302, updated last month)