YugantM / textcleaner
text-data pre-processing utility
☆13Updated 2 years ago
Related projects
Alternatives and complementary repositories for textcleaner
- Text processing library for sentiment analysis and related tasks☆27Updated 6 years ago
- Experimental library for sampling and validating scikit-learn parameters☆10Updated 5 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Updated last year
- Text classification automl☆21Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 2 years ago
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot…☆11Updated 3 years ago
- Aho-Corasick string replacement utility☆23Updated 4 years ago
- sequence tagging with spaCy and crfsuite☆18Updated last year
- Text preprocessing tools in Python.☆26Updated 6 years ago
- BERT Probe: A python package for probing attention based robustness to character and word based adversarial evaluation. Also, with recipe…☆18Updated 2 years ago
- Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages☆9Updated last year
- This is a custom library for data processing, visualization and machine learning tools.☆13Updated 8 months ago
- Deploy PyTorch models to production via panini☆10Updated 5 years ago
- A streamlit component to embed Disqus in your applications.☆11Updated 3 years ago
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations☆14Updated 2 years ago
- Language Modelling, CMI vs Perplexity☆11Updated 6 years ago
- How to do data science with Optimus, Spark and Python.☆18Updated 5 years ago
- 🧬 A VS Code extension for annotating data with Prodigy☆30Updated 2 years ago
- Named entity recognition for the legal domain☆40Updated 3 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Updated 2 years ago
- A Python package to get useful information from documents using TopicRank Algorithm.☆16Updated last year
- Use pretrained BERT model to automatically generate grammar multiple choice questions (MCQ) from any news article or story.☆13Updated 5 years ago
- Generates the most important key-phrase/key-words from a document based on a corpus☆11Updated 5 months ago
- Language detection using spaCy and fastText☆54Updated 11 months ago