iNeil77 / vllm-code-harness

Run code inference-only benchmarks quickly using vLLM

☆10

Alternatives and similar repositories for vllm-code-harness

Users who are interested in vllm-code-harness are comparing it to the repositories listed below

allenai / bff
☆38 Updated last year
machelreid / m2d2
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54 Updated 2 years ago
LEYADEV / Vocabulary-Transfer
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer" https://arxiv.org/pdf/2112.14569.pdf
☆20 Updated 3 years ago
anthonywchen / MOCHA
Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".
☆16 Updated 3 years ago
martiansideofthemoon / longeval-summarization
Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…
☆44 Updated 10 months ago
google-research / dialog-inpainting
☆97 Updated 2 years ago
neubig / coderx
A highly sophisticated sequence-to-sequence model for code generation
☆40 Updated 3 years ago
suzgunmirac / crowd-sampling
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
☆18 Updated 2 years ago
babylm / evaluation-pipeline-2024
The evaluation pipeline for the 2024 BabyLM Challenge.
☆31 Updated 7 months ago
EleutherAI / semantic-memorization
☆44 Updated 7 months ago
bigscience-workshop / multilingual-modeling
BLOOM+1: Adapting BLOOM model to support a new unseen language
☆72 Updated last year
Cohere-Labs-Community / language-confusion
Repository for the "Understanding and Mitigating Language Confusion in LLMs" paper
☆26 Updated last year
google-deepmind / streamingqa
☆48 Updated last year
bminixhofer / zett
Code for Zero-Shot Tokenizer Transfer
☆133 Updated 5 months ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated 9 months ago
SLAB-NLP / Multi-Prompt-LLM-Evaluation
State of What Art? A Call for Multi-Prompt LLM Evaluation
☆15 Updated 11 months ago
GEM-benchmark / GEM-metrics
Automatic metrics for GEM tasks
☆66 Updated 2 years ago
tau-nlp / scrolls
The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences".
☆70 Updated last year
inspired-cognition / critique-apps
Apps built using Inspired Cognition's Critique.
☆58 Updated 2 years ago
Kaleidophon / awesome-experimental-standards-deep-learning
Repository collecting resources and best practices to improve experimental rigour in deep learning research.
☆27 Updated 2 years ago
nyu-mll / SQuALITY
Query-focused summarization data
☆42 Updated 2 years ago
shayne-longpre / a-pretrainers-guide
☆72 Updated 2 years ago
aviaefrat / lmentry
☆12 Updated last year
adapter-hub / hgiyt
Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"
☆27 Updated 3 years ago
xnancy / russ
☆16 Updated 4 years ago
ekinakyurek / influence
Code for "Tracing Knowledge in Language Models Back to the Training Data"
☆38 Updated 2 years ago
hitz-zentroa / lm-contamination
The LM Contamination Index is a manually created database of contamination evidences for LMs.
☆78 Updated last year
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆58 Updated 2 years ago
kernelmachine / demix
DEMix Layers for Modular Language Modeling
☆53 Updated 3 years ago
AlexWan0 / infini-gram
An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024)
☆33 Updated last year