bigscience-workshop / data_sourcing
This directory gathers the tools developed by the Data Sourcing Working Group.
(★31, updated 3 years ago)
Alternatives and similar repositories for data_sourcing:
Users who are interested in data_sourcing compare it to the repositories listed below:
- Scripts to convert datasets from various sources to Hugging Face Datasets. (★57, updated 2 years ago)
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ (★36, updated 2 years ago)
- TorchServe+Streamlit for easily serving your HuggingFace NER models (★33, updated 2 years ago)
- A PyPI package for easy text annotation in a Jupyter Notebook. (★28, updated 3 years ago)
- (★28, updated last year)
- This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP). (★33, updated 4 years ago)
- Using short models to classify long texts (★21, updated 2 years ago)
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. (★86, updated 2 months ago)
- (★87, updated 2 years ago)
- Training a model without a dataset for natural language inference (NLI) (★25, updated 4 years ago)
- NLP Examples using the 🤗 libraries (★41, updated 4 years ago)
- fastai ulmfit - Pretraining the Language Model, Fine-Tuning and training a Classifier (★33, updated 3 years ago)
- Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022). (★8, updated 2 years ago)
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations (★14, updated 2 years ago)
- Tools for managing datasets for governance and training. (★83, updated last month)
- (★30, updated 3 years ago)
- Explainable Zero-Shot Topic Extraction (★62, updated 7 months ago)
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy (★63, updated last year)
- Minimal code to train ELMo models in recent versions of TensorFlow (★14, updated last year)
- Build fast gradio demos of fastai learners (★35, updated 3 years ago)
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline (★32, updated 2 years ago)
- spaCy match and replace, maintaining conjugation (★35, updated 2 years ago)
- 🤗 Push your spaCy pipelines to the Hugging Face Hub (★43, updated 10 months ago)
- All my experiments with the various transformers and various transformer frameworks available (★14, updated 3 years ago)
- Augmenty is an augmentation library based on spaCy for augmenting texts. (★153, updated 10 months ago)
- NLP tool to extract emotional phrase from tweets 🤩 (★40, updated 3 years ago)
- (★21, updated 2 months ago)
- MoodCat😼 classifies the mood of English sentences. (★14, updated 2 years ago)
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… (★15, updated last year)
- Download and load spaCy models on-the-fly (★15, updated 2 years ago)