allenai / catwalk

This project studies the performance and robustness of language models and task-adaptation methods.

☆150

Alternatives and similar repositories for catwalk

Users who are interested in catwalk are comparing it to the libraries listed below.

allenai / peS2o
Pretraining Efficiently on S2ORC!
☆165 Updated 9 months ago
allenai / wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
☆222 Updated 8 months ago
bigscience-workshop / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆105 Updated 2 years ago
facebookresearch / dpr-scale
Scalable training for dense retrieval models.
☆299 Updated last month
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers
☆131 Updated last year
nyu-mll / quality
☆138 Updated 6 months ago
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆146 Updated 9 months ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated last month
salesforce / factualNLG
Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"
☆59 Updated 6 months ago
sileod / tasksource
Datasets collection and preprocessings framework for NLP extreme multitask learning
☆185 Updated 3 weeks ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆338 Updated last month
dwzhu-pku / LongEmbed
LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)
☆140 Updated 8 months ago
google-research / true
Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".
☆81 Updated 2 weeks ago
allenai / bff
☆39 Updated last year
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆219 Updated last year
p-lambda / dsir
DSIR large-scale data selection framework for language model training
☆257 Updated last year
yizhongw / Tk-Instruct
Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions.
☆181 Updated 2 years ago
allenai / Lila
A unified benchmark for math reasoning
☆88 Updated 2 years ago
kaistAI / CoT-Collection
[EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
☆245 Updated last year
DaoD / INTERS
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
☆204 Updated 7 months ago
shayne-longpre / a-pretrainers-guide
☆72 Updated 2 years ago
veronica320 / Faithful-COT
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
☆162 Updated last year
facebookresearch / tart
Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.
☆163 Updated last year
allenai / WildBench
Benchmarking LLMs with Challenging Tasks from Real Users
☆233 Updated 8 months ago
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆208 Updated last year
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆162 Updated 2 months ago
nkandpa2 / long_tail_knowledge
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆77 Updated 2 years ago
jakespringer / echo-embeddings
☆152 Updated last year
seonghyeonye / TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
☆79 Updated 10 months ago
orhonovich / unnatural-instructions
☆180 Updated 2 years ago