leogao2 / commoncrawl_downloader
☆33 Updated last year
Alternatives and similar repositories for commoncrawl_downloader:
Users who are interested in commoncrawl_downloader are comparing it to the repositories listed below
- ☆89 Updated 2 years ago
- ☆77 Updated last year
- ☆50 Updated 2 years ago
- ☆148 Updated 3 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 Updated 3 years ago
- Evaluation suite for large-scale language models. ☆124 Updated 3 years ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆32 Updated last year
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch ☆75 Updated 4 years ago
- Helper scripts and notes that were used while porting various nlp models ☆45 Updated 2 years ago
- Script for downloading GitHub. ☆90 Updated 7 months ago
- ☆93 Updated 2 months ago
- ☆97 Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆28 Updated 2 years ago
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated 2 years ago
- A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+ ☆37 Updated 3 years ago
- ☆16 Updated 2 years ago
- ☆44 Updated 3 months ago
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆81 Updated last year
- QED: A Framework and Dataset for Explanations in Question Answering ☆115 Updated 3 years ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆113 Updated 5 months ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ☆37 Updated 3 years ago
- ☆110 Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆46 Updated last year
- Prompt tuning toolkit for GPT-2 and GPT-Neo ☆88 Updated 3 years ago
- Create soft prompts for fairseq 13B dense, GPT-J-6B and GPT-Neo-2.7B for free in a Google Colab TPU instance ☆27 Updated last year
- One stop shop for all things carp ☆59 Updated 2 years ago
- Training a model without a dataset for natural language inference (NLI) ☆25 Updated 4 years ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 Updated 9 months ago
- ☆45 Updated last year
- Tools for managing datasets for governance and training. ☆82 Updated 2 weeks ago