AYLIEN / news-signals-datasets
Creating time-indexed datasets with clusters of texts as inputs and timeseries as targets.
☆16 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for news-signals-datasets
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆23 Updated 3 months ago
- A Python library aimed at dissecting and augmenting NER training data. ☆57 Updated last year
- KeypartX is a graph-based approach to represent perception (text in general) by key parts of speech. ☆0 Updated last year
- Vespa application making an index of the CORD-19 dataset. ☆39 Updated this week
- Generalist and Lightweight Model for Text Classification ☆51 Updated last week
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆43 Updated 6 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆42 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 9 months ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆65 Updated 3 months ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 Updated last year
- Data Programming by Demonstration (DPBD) for Document Classification ☆35 Updated 3 years ago
- ZS4IE: A Toolkit for Zero-Shot Information Extraction with Simple Verbalizations ☆26 Updated 2 years ago
- A set of methods for finding an appropriate number of topics in a text collection ☆14 Updated 3 months ago
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. ☆22 Updated last year
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆27 Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 Updated last month
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆103 Updated 6 months ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆88 Updated last year
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni… ☆27 Updated last year
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23) ☆13 Updated 5 months ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆31 Updated this week
- Semantically Structured Sentence Embeddings ☆67 Updated last month
- Code release for Type-Aware Bi-Encoders for Open-Domain Entity Retrieval ☆19 Updated 2 years ago
- Explainable Zero-Shot Topic Extraction ☆61 Updated 3 months ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ☆36 Updated 2 years ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆103 Updated 7 months ago