r2llab / wrangl

Parallel data preprocessing for NLP and ML.

☆34

Alternatives and similar repositories for wrangl

Users who are interested in wrangl are comparing it to the libraries listed below.

allenai / EmbeddingRecycling
Embedding Recycling for Language models
☆39 Updated 2 years ago
krandiash / quinine
A library to create and manage configuration files, especially for machine learning projects.
☆79 Updated 3 years ago
zphang / minimal-opt
☆67 Updated 2 years ago
HendrikStrobelt / LMdiff
A diff tool for language models
☆43 Updated last year
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset.
☆93 Updated 2 years ago
lf1-io / padl
Functional deep learning
☆108 Updated 2 years ago
peterbhase / SLAG-Belief-Updating
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"
☆28 Updated 3 years ago
JunShern / few-shot-adaptation
Exploring Few-Shot Adaptation of Language Models with Tables
☆24 Updated 2 years ago
RobertCsordas / transformer_generalization
The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…
☆67 Updated 2 years ago
allenai / smashed
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…
☆33 Updated last year
google-research / jestimator
Amos optimizer with JEstimator lib.
☆82 Updated last year
AI21Labs / lm-evaluation
Evaluation suite for large-scale language models.
☆127 Updated 3 years ago
hadasah / btm
☆75 Updated last year
HomebrewML / HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
☆68 Updated 3 years ago
JoaoLages / RATransformers
RATransformers 🐭- Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware!
☆41 Updated 2 years ago
guy-dar / embedding-space
☆54 Updated 2 years ago
bloomberg / minilmv2.bb
Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188)
☆61 Updated 2 years ago
facebookresearch / CCQA
CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training
☆32 Updated 3 years ago
naver / disco
A Toolkit for Distributional Control of Generative Models
☆73 Updated this week
nreimers / se-pytorch-xla
☆21 Updated 3 years ago
neubig / coderx
A highly sophisticated sequence-to-sequence model for code generation
☆40 Updated 4 years ago
huggingface / tune
☆87 Updated 3 years ago
srush / transformers-bet
☆12 Updated 3 years ago
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆58 Updated 2 years ago
stas00 / porting
Helper scripts and notes that were used while porting various NLP models
☆45 Updated 3 years ago
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training!
☆112 Updated 2 years ago
google-research / precondition
☆31 Updated last month
google-research / t5x_retrieval
☆100 Updated 2 years ago
nreimers / flax-sentence-embeddings
Shared code for training sentence embeddings with Flax / JAX
☆27 Updated 4 years ago
yifding / hetseq
HetSeq: Distributed GPU Training on Heterogeneous Infrastructure
☆106 Updated 2 years ago