BramVanroy / bicorpus-preprocessing

☆9

Alternatives and similar repositories for bicorpus-preprocessing

Users who are interested in bicorpus-preprocessing are comparing it to the repositories listed below.

babylonhealth / hmrb
☆70 Updated 2 years ago
pmbaumgartner / spacy-setfit-textcat
☆30 Updated 3 years ago
LightTag / ALMa
ALMa (Active Learning Manager) Keeps track of labeled and unlabeled data for active learning
☆41 Updated 5 years ago
BramVanroy / spacy-extreme
An example of how to use spaCy for extremely large files without running into memory issues
☆36 Updated 2 years ago
oscar-project / goclassy
An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.
☆86 Updated 4 years ago
writer / replaCy
spaCy match and replace, maintaining conjugation
☆35 Updated 2 years ago
thomasthiebaud / spacy-fastlang
Language detection using Spacy and Fasttext
☆57 Updated last year
argilla-io / biome-text
Custom Natural Language Processing with big and small models 🌲🌱
☆68 Updated 3 years ago
SiphuLangeni / tortus
A PyPI package for easy text annotation in a Jupyter Notebook.
☆28 Updated 3 years ago
Abhijit-2592 / spacy-langdetect
A fully customisable language detection pipeline for spaCy
☆93 Updated 6 years ago
gnes-ai / hub
GNES Hub: ship AI/ML models as Docker containers and use Docker containers as plugins.
☆34 Updated 5 years ago
huggingface / neuralcoref-viz
✨ Web interface for NeuralCoref coreference resolution
☆35 Updated 2 years ago
denocris / MHPC-Natural-Language-Processing-Lectures
This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP).
☆33 Updated 4 years ago
MilaNLProc / bertlang
A web interface to understand language-specific BERT-models
☆18 Updated last year
litus-ai / classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
☆87 Updated 3 months ago
koursaros-ai / microservices
Neural Elastic Inference and Search
☆19 Updated 5 years ago
bjascob / pyInflect
A python module for word inflections designed for use with spaCy.
☆92 Updated 5 years ago
HazyResearch / reef
Automatically labeling training data
☆107 Updated 6 years ago
HazyResearch / fonduer-tutorials
A collection of simple tutorials for using Fonduer
☆100 Updated 4 years ago
stephantul / unitoken
Tokenization across languages. Useful as preprocessing for subword tokenization.
☆22 Updated 2 years ago
indix / whatthelang
Lightning Fast Language Prediction 🚀
☆167 Updated 6 years ago
koaning / spacy-report
Generate reports for spaCy models.
☆29 Updated 3 years ago
joelgrus / streamlit-allennlp
allennlp + streamlit demo
☆22 Updated 5 years ago
autosoft-dev / code-bert
Automatically check mismatch between code and comments using AI and ML
☆53 Updated 4 years ago
koaning / tokenwiser
Bag of, not words, but tricks!
☆68 Updated last year
falcony-io / ml-annotate
Use ML-Annotate to label data for machine learning purposes
☆109 Updated 5 years ago
jobergum / dense-vector-ranking-performance
Performance evaluation of nearest neighbor search using Vespa, Elasticsearch and Open Distro for Elasticsearch K-NN
☆117 Updated 4 years ago
kabirkhan / recon
Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality …
☆106 Updated last year
kororo / excelcy
Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.
☆105 Updated 2 years ago
explosion / vscode-prodigy
🧬 A VS Code extension for annotating data with Prodigy
☆30 Updated 3 years ago