bigscience-workshop / data_sourcing
This directory gathers the tools developed by the Data Sourcing Working Group
☆31Updated 4 years ago
Alternatives and similar repositories for data_sourcing
Users who are interested in data_sourcing are comparing it to the libraries listed below
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.☆127Updated 4 years ago
- TimeLMs: Diachronic Language Models from Twitter☆111Updated last year
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.☆120Updated last week
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).☆75Updated last year
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models☆31Updated 4 years ago
- Scripts to convert datasets from various sources to Hugging Face Datasets.☆57Updated 3 years ago
- Viewer for the 🤗 datasets library.☆85Updated 4 years ago
- ☆87Updated 3 years ago
- ☆113Updated last week
- Sentence transformers models for spaCy☆107Updated 2 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs…☆32Updated 4 years ago
- A library to synthesize text datasets using Large Language Models (LLM)☆151Updated 2 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️☆35Updated 3 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆156Updated last year
- Explainable Zero-Shot Topic Extraction☆63Updated last year
- Dataset for Emotion Recognition Research☆216Updated 2 years ago
- A minimal template for creating a PyPI package☆49Updated 4 years ago
- A PyPI package for easy text annotation in a Jupyter Notebook.☆28Updated 4 years ago
- Information extraction from English and German texts based on predicate logic☆139Updated 2 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆80Updated 3 years ago
- A tiny BERT for low-resource monolingual models☆31Updated 3 weeks ago
- A comprehensive tool for linguistic analysis of communities☆49Updated 4 years ago
- 💫 SpaCy wrapper for ConceptNet 💫☆95Updated 2 years ago
- Few-shot Named Entity Recognition☆123Updated 3 years ago
- This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP).☆33Updated 5 years ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models…☆37Updated 3 years ago
- Comprehensive NLP Evaluation System☆187Updated last year
- Execute arbitrary SQL queries on 🤗 Datasets☆32Updated last year
- Hinglish Text Classification☆30Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆99Updated 2 years ago