KamWithK / PyParquetLoaders
Easy, efficient and Pythonic data loading of Parquet files for PyTorch-based libraries
☆22 Updated 3 years ago
Related projects:
- ☆32 Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆42 Updated 10 months ago
- Bi-encoder entity linking architecture ☆40 Updated last week
- ☆92 Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆35 Updated 2 years ago
- Transformers at any scale ☆39 Updated 8 months ago
- ☆20 Updated 3 years ago
- ☆18 Updated 2 years ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch ☆45 Updated 3 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆33 Updated last year
- Multi-task modelling extensions for huggingface transformers ☆17 Updated last year
- RATransformers 🐭 - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! ☆41 Updated last year
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated last year
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆77 Updated 6 months ago
- Index of URLs to pdf files all over the internet and scripts ☆20 Updated last year
- Repository for Multimodal AutoML Benchmark ☆60 Updated 2 years ago
- ☆91 Updated this week
- Using FlexAttention to compute attention with different masking patterns ☆28 Updated last week
- This is the official PyTorch repo for "UNIREX: A Unified Learning Framework for Language Model Rationale Extraction" (ICML 2022). ☆23 Updated last year
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆58 Updated 2 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆50 Updated 2 years ago
- Utilities for Training Very Large Models ☆56 Updated last week
- Code for NeurIPS LLM Efficiency Challenge ☆52 Updated 5 months ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx… ☆136 Updated last year
- High performance pytorch modules ☆18 Updated last year
- Retrieval as Attention ☆77 Updated last year
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆52 Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆91 Updated last year
- ☆37 Updated last month
- Truly flash T5 realization! ☆48 Updated 4 months ago