LeonEricsson / llmjudge
Exploring limitations of LLM-as-a-judge
☆14 Updated 3 months ago
Related projects
Alternatives and complementary repositories for llmjudge
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated 8 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 Updated 8 months ago
- ☆32 Updated last year
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths ☆35 Updated 11 months ago
- PyTorch implementation for MRL ☆18 Updated 9 months ago
- QLoRA for Masked Language Modeling ☆20 Updated last year
- Embedding Recycling for Language models ☆38 Updated last year
- MEXMA: Token-level objectives improve sentence representations ☆34 Updated 2 weeks ago
- ☆27 Updated 5 months ago
- ☆24 Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ☆27 Updated 4 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆20 Updated 9 months ago
- Using short models to classify long texts ☆20 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 Updated last month
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆24 Updated 3 weeks ago
- Code for NeurIPS LLM Efficiency Challenge ☆54 Updated 7 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆45 Updated last year
- ☆41 Updated 2 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 4 months ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te… ☆42 Updated 10 months ago
- ☆46 Updated this week
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆37 Updated 7 months ago
- ☆37 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 Updated 8 months ago
- Using multiple LLMs for ensemble forecasting ☆16 Updated 10 months ago
- DPO, but faster 🚀 ☆23 Updated 3 weeks ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Updated this week