dmahan93 / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆16 Updated 2 years ago
Alternatives and similar repositories for lm-evaluation-harness
Users who are interested in lm-evaluation-harness are comparing it to the libraries listed below.
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆50 Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ ☆63 Updated last year
- Track the progress of LLM context utilisation ☆54 Updated 7 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆106 Updated last month
- ☆33 Updated 2 years ago
- ☆73 Updated last year
- ☆20 Updated last year
- ☆86 Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆17 Updated 2 years ago
- ☆57 Updated 2 years ago
- Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA ☆103 Updated 5 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆46 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆109 Updated 11 months ago
- Conduct consumer interviews with synthetic focus groups using LLMs and LangChain ☆43 Updated 2 years ago
- Text to Python Objects via a LLM Function Call ☆58 Updated last year
- HuggingChat like UI in Gradio ☆70 Updated 2 years ago
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths ☆35 Updated last year
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆45 Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆48 Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆50 Updated last year
- ☆23 Updated 2 years ago
- The Next Generation Multi-Modality Superintelligence ☆69 Updated last year
- auto fine tune of models with synthetic data ☆75 Updated last year
- Simple examples using Argilla tools to build AI ☆56 Updated 11 months ago
- Data Questionnaire Agent Chatbot ☆69 Updated last month
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡