bigscience-workshop / data_sourcing
This directory gathers the tools developed by the Data Sourcing Working Group
☆31 · Updated 3 years ago
Alternatives and similar repositories for data_sourcing
Users who are interested in data_sourcing are comparing it to the libraries listed below.
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆57 · Updated 2 years ago
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆28 · Updated 3 years ago
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 · Updated 2 months ago
- 🤗 Push your spaCy pipelines to the Hugging Face Hub ☆44 · Updated last year
- Just another sentiment wrapper. ☆17 · Updated 3 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Updated 2 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆32 · Updated 4 years ago
- BERT and ELECTRA models trained on Europeana Newspapers ☆38 · Updated 3 years ago
- A minimal template for creating a PyPI package ☆49 · Updated 4 years ago
- fastai ulmfit - Pretraining the Language Model, Fine-Tuning and training a Classifier ☆33 · Updated 3 years ago
- Hinglish Text Classification ☆30 · Updated last year
- A Python library aimed at dissecting and augmenting NER training data. ☆58 · Updated 2 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆93 · Updated last year
- Visualise, evaluate, and manage annotated data ☆33 · Updated 2 years ago
- Bag of, not words, but tricks! ☆68 · Updated last year
- Generate reports for spaCy models. ☆29 · Updated 3 years ago
- This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP). ☆33 · Updated 4 years ago
- A spaCy custom component that extracts and normalizes temporal expressions ☆54 · Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 · Updated 11 months ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated 2 years ago
- REMERGE - Multi-Word Expression discovery algorithm ☆14 · Updated 2 years ago
- Explainable Zero-Shot Topic Extraction ☆62 · Updated 9 months ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated last year
- A gold-standard dataset of software mentions in research publications. ☆36 · Updated last year