r2llab / wrangl
Parallel data preprocessing for NLP and ML.
☆33 · Updated last week
Related projects
Alternatives and complementary repositories for wrangl
- A library to create and manage configuration files, especially for machine learning projects. ☆77 · Updated 2 years ago
- Embedding Recycling for Language models ☆38 · Updated last year
- Exploring Few-Shot Adaptation of Language Models with Tables ☆23 · Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should work with any Hugging Face text dataset. ☆92 · Updated last year
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆66 · Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆31 · Updated 5 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 · Updated last year
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor… ☆47 · Updated last year
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?" ☆34 · Updated 7 months ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆27 · Updated this week
- A diff tool for language models ☆42 · Updated 10 months ago
- Amos optimizer with JEstimator lib. ☆80 · Updated 5 months ago
- Functional deep learning ☆106 · Updated last year
- Few-shot Learning with Auxiliary Data ☆26 · Updated 11 months ago
- This repository contains example code to build models on TPUs ☆30 · Updated last year
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆79 · Updated this week
- Ludwig benchmark ☆19 · Updated 2 years ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆23 · Updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 2 years ago
- Ranking of fine-tuned HF models as base models. ☆35 · Updated last year
- Code for the paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 · Updated 2 years ago
- Interactive explorer for language models ☆9 · Updated 5 years ago