AYLIEN / news-signals-datasets
Creating time-indexed datasets with clusters of texts as inputs and timeseries as targets.
☆19 Updated this week
Alternatives and similar repositories for news-signals-datasets:
Users who are interested in news-signals-datasets are comparing it to the libraries listed below.
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- Generalist and Lightweight Model for Text Classification ☆110 Updated this week
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆28 Updated 3 months ago
- Pre-train Static Word Embeddings ☆51 Updated 3 weeks ago
- Robust and fast topic models with sentence-transformers. ☆48 Updated 2 weeks ago
- StAtutory Reasoning Assessment ☆13 Updated 2 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆90 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 2 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆107 Updated 10 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆45 Updated last year
- XAI based human-in-the-loop framework for automatic rule-learning. ☆48 Updated 8 months ago
- KeypartX is a graph-based approach to represent perception (text in general) by key parts of speech. Updated last year
- A RAG that can scale 🧑🏻‍💻 ☆11 Updated 10 months ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction ☆68 Updated 8 months ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆66 Updated last month
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 3 years ago
- The NewSHead dataset is a multi-doc headline dataset used in NHNet for training a headline summarization model. ☆37 Updated 3 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 Updated last year
- ☆43 Updated last year
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆16 Updated 3 years ago
- LEMON: Explainable Entity Matching ☆18 Updated 2 years ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 Updated 2 years ago
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- A simple library for training named entity recognition model from partially annotated data ☆23 Updated last year
- Vespa application making an index of the CORD-19 dataset. ☆39 Updated 2 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆47 Updated last year
- Efficient few-shot learning with cross-encoders. ☆50 Updated last year
- 🕸️ A graph-augmented dense statute retriever. (EACL 2023) ☆21 Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago