cardiffnlp / timelms
TimeLMs: Diachronic Language Models from Twitter
★107 · Updated 11 months ago
Alternatives and similar repositories for timelms:
Users who are interested in timelms are comparing it to the libraries listed below.
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗 Transformers ★48 · Updated last year
- A Python package for benchmarking interpretability techniques on Transformers. ★213 · Updated 4 months ago
- ★74 · Updated 3 years ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ★86 · Updated 11 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ★93 · Updated 2 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ★74 · Updated 2 years ago
- Collection of NLP model explanations and accompanying analysis tools ★145 · Updated last year
- A Python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ★26 · Updated 5 months ago
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ★34 · Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ★80 · Updated 2 years ago
- Multi-task modelling extensions for huggingface transformers ★20 · Updated last year
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ★187 · Updated 3 years ago
- ★155 · Updated 7 months ago
- Creating class-based TF-IDF matrices ★82 · Updated 2 years ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ★80 · Updated 10 months ago
- Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al.; Findings of ACL 2022) ★21 · Updated last year
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ★48 · Updated 3 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ★40 · Updated 3 years ago
- Contrastive Fact Verification ★71 · Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ★28 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★151 · Updated 8 months ago
- Apps built using Inspired Cognition's Critique. ★58 · Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ★156 · Updated 2 years ago
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ★71 · Updated 6 months ago
- Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in … ★27 · Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ★70 · Updated 11 months ago
- [DEPRECATED] Adapt Transformer-based language models to new text domains ★86 · Updated 11 months ago
- An instruction-based benchmark for text improvements. ★140 · Updated 2 years ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ★37 · Updated 3 years ago
- A multilingual version of MS MARCO passage ranking dataset ★143 · Updated last year