target-benchmark / target
TARGET is a benchmark for evaluating Table Retrieval for Generative Tasks such as Fact Verification and Text-to-SQL
☆12 · Updated this week
Related projects
Alternatives and complementary repositories for target
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 · Updated 4 months ago
- Data preparation code for CrystalCoder 7B LLM ☆42 · Updated 6 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆76 · Updated 7 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 · Updated 8 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 · Updated 2 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 · Updated 3 weeks ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆22 · Updated 8 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆20 · Updated 9 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆84 · Updated 3 months ago
- Genetics for Language Models ☆11 · Updated 4 months ago
- code for training & evaluating Contextual Document Embedding models ☆93 · Updated this week
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆45 · Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆72 · Updated last month
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 · Updated 8 months ago
- The repository contains generative AI analytics platform application code. ☆22 · Updated 2 weeks ago
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆44 · Updated this week
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆87 · Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆119 · Updated 3 weeks ago
- Set of scripts to finetune LLMs ☆36 · Updated 7 months ago
- Using multiple LLMs for ensemble Forecasting ☆16 · Updated 9 months ago
- Simple examples using Argilla tools to build AI ☆38 · Updated last week
- Functional Benchmarks and the Reasoning Gap ☆78 · Updated last month
- ReBase: Training Task Experts through Retrieval Based Distillation ☆27 · Updated 3 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆53 · Updated 2 weeks ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆29 · Updated 6 months ago