Shark-NLP / CABLinks

☆31

Alternatives and similar repositories for CAB

Users that are interested in CAB are comparing it to the libraries listed below

Sorting:

suzgunmirac / crowd-sampling
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
☆18Updated 2 years ago
princeton-nlp / LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
☆78Updated last year
sustcsonglin / disco-pointer
Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …
☆13Updated last year
Hunter-DDM / stablemoe
Code for the ACL-2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts"
☆48Updated 3 years ago
tau-nlp / scrolls
The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences".
☆70Updated last year
kernelmachine / demix
DEMix Layers for Modular Language Modeling
☆53Updated 3 years ago
swj0419 / in-context-pretraining
☆53Updated last year
McGill-NLP / polytropon
☆54Updated 2 years ago
RUCAIBox / ELMER
This repository is the official implementation of our EMNLP 2022 paper ELMER: A Non-Autoregressive Pre-trained Language Model for Efficie…
☆26Updated 2 years ago
emorynlp / seq2seq-corenlp
☆13Updated 2 years ago
hemingkx / SpecDec
Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆42Updated last year
princeton-nlp / DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
☆114Updated 2 years ago
gmftbyGMFTBY / Rep-Dropout
[NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
☆33Updated last year
chijames / KERPLE
☆19Updated 2 years ago
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆11Updated 2 years ago
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆94Updated 3 years ago
nkandpa2 / long_tail_knowledge
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
☆77Updated 2 years ago
ptlmasking / maskbert
☆20Updated 4 years ago
rosewang2008 / language_modeling_via_stochastic_processes
Language modeling via stochastic processes. Oral @ ICLR 2022.
☆138Updated 2 years ago
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆46Updated last year
jzbjyb / ReAtt
Retrieval as Attention
☆83Updated 2 years ago
ictnlp / FA-DAT
Official Implementation for the ICLR2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"
☆13Updated 2 years ago
HazyResearch / prefix-linear-attention
☆55Updated last year
sail-sg / symbolic-instruction-tuning
The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning".
☆65Updated 2 years ago
violet-zct / swarm-distillation-zero-shot
☆22Updated 2 years ago
berlino / btg-seq2seq
☆12Updated 2 years ago
chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆16Updated last year
KwanWaiChung / M4LE
Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models
☆23Updated last year
llyx97 / TAMT
[NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe…
☆15Updated 2 years ago
bigscience-workshop / architecture-objective
☆97Updated 2 years ago