chroma-core / generative-benchmarkingLinks

☆23

Alternatives and similar repositories for generative-benchmarking

Users that are interested in generative-benchmarking are comparing it to the libraries listed below

Sorting:

writer / writing-in-the-margins
☆118Updated 9 months ago
cognitivecomputations / spectrum
☆121Updated last month
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆63Updated last year
MadryLab / context-cite
Attribute (or cite) statements generated by LLMs back to in-context information.
☆235Updated 7 months ago
MoritzLaurer / synthetic-data-blog
This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data
☆68Updated last year
Knowledgator / GLiClass
Generalist and Lightweight Model for Text Classification
☆128Updated 2 weeks ago
allenai / infinigram-api
☆58Updated 3 weeks ago
zetaalphavector / RAGElo
RAGElo is a set of tools that helps you selecting the best RAG-based LLM agents by using an Elo ranker
☆111Updated 2 weeks ago
huggingface / yourbench
🤗 Benchmark Large Language Models Reliably On Your Data
☆318Updated last week
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆139Updated 3 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆53Updated last month
LAGoM-NLP / transtokenizer
☆45Updated 4 months ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆100Updated last year
facebookresearch / collaborative-reasoner
Source code for the collaborative reasoner research project at Meta FAIR.
☆87Updated last month
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆256Updated last week
Mihaiii / llm_steer
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto…
☆239Updated 3 months ago
orionw / promptriever
The first dense retrieval model that can be prompted like an LM
☆73Updated 3 weeks ago
tonywu71 / colpali-cookbooks
Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳
☆292Updated last week
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆80Updated last year
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆104Updated 5 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆93Updated 3 months ago
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆49Updated 10 months ago
redotvideo / pluto
Synthetic Data for LLM Fine-Tuning
☆118Updated last year
apple / ml-superposition-prompting
☆143Updated 10 months ago
menloresearch / ReZero
☆145Updated last month
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆37Updated last year
Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆99Updated last year
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66Updated 7 months ago
bethgelab / CiteME
CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts.
☆45Updated 7 months ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆172Updated 4 months ago