BramVanroy / bicorpus-preprocessing
☆9Updated 4 years ago
Alternatives and similar repositories for bicorpus-preprocessing:
Users who are interested in bicorpus-preprocessing are comparing it to the libraries listed below:
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.☆86Updated 4 years ago
- ☆70Updated 2 years ago
- ☆13Updated 3 years ago
- ALMa (Active Learning Manager): keeps track of labeled and unlabeled data for active learning☆41Updated 4 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- 🔎 A Prodigy plugin for evaluating spaCy pipelines☆13Updated last year
- spaCy match and replace, maintaining conjugation☆35Updated 2 years ago
- A web interface to understand language-specific BERT-models☆17Updated last year
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f…☆24Updated 4 years ago
- Neural Elastic Inference and Search☆19Updated 5 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 2 years ago
- ☆43Updated 2 years ago
- ☆30Updated 2 years ago
- allennlp + streamlit demo☆22Updated 5 years ago
- Tooling to play around with multilingual machine translation for Indian Languages.☆22Updated 3 years ago
- This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP).☆33Updated 4 years ago
- ☆13Updated 5 years ago
- Extracting narrative timelines (i.e. order and timing of events) from text☆20Updated 6 years ago
- DEPRECATED--all functionality moved to nbdev☆15Updated 2 years ago
- Topic Inference with Zeroshot models☆61Updated last year
- Transformer based Trigram Blocking implementation in Tensorflow☆11Updated 5 years ago
- Bag of, not words, but tricks!☆68Updated last year
- Generic Environment for Context-Aware Correction of Orthography☆22Updated 2 years ago
- ELECTRA MODEL NLP☆13Updated 5 years ago
- 🧬 A VS Code extension for annotating data with Prodigy☆30Updated 3 years ago
- Polyglot skipgram embeddings, and their many health benefits☆12Updated 5 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python.☆21Updated last year
- Code and data accompanying the paper "Approaching nested named entity recognition with parallel LSTM-CRFs."☆26Updated 2 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP.☆87Updated 2 weeks ago
- Automatically check mismatch between code and comments using AI and ML☆53Updated 3 years ago