Grasia / wiki-scripts
Miscellaneous scripts to gather and process data of wikis.
☆20 Updated 2 years ago
Alternatives and similar repositories for wiki-scripts
Users who are interested in wiki-scripts are comparing it to the libraries listed below.
- TopicScan: Visualization and validation interface for NMF Topic Modeling ☆23 Updated 5 years ago
- Experiments to help discussion on Wikipedia talk pages ☆68 Updated this week
- Calculate readability scores ☆43 Updated 6 years ago
- Notebooks and data associated to constructing and exploring a map of subreddits. ☆55 Updated 8 years ago
- German lemmatization with IWNLP as extension for spaCy ☆26 Updated 2 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 3 years ago
- Tokenizer for Twitter and Reddit data ☆45 Updated 6 years ago
- A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python. ☆83 Updated last year
- Cleans Reddit Text Data ☆84 Updated 5 years ago
- A clean and easy interface for performing nearest-neighbor lookups ☆50 Updated 6 years ago
- Regex like pattern tree matching but on sentence's tree instead of Strings ☆42 Updated 7 years ago
- Bag of, not words, but tricks! ☆68 Updated 2 years ago
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 Updated 2 years ago
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆41 Updated 6 years ago
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆29 Updated 4 years ago
- This is the text partitioner project for Python. ☆21 Updated 7 years ago
- A thin wrapper around the DBpedia Spotlight HTTP API ☆25 Updated 8 years ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 Updated 4 years ago
- Notebooks configured to be run with Binder, usually found on my blog. ☆42 Updated 2 years ago
- Easy-to-use text representations extraction library based on the Transformers library. ☆32 Updated 3 years ago
- Dataframe Integration with spaCy. ☆103 Updated 4 years ago
- Training Temporal Word Embeddings with a Compass ☆65 Updated 5 months ago
- Python package for stylometry ☆64 Updated 4 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆183 Updated 2 years ago
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings ☆16 Updated 6 years ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation ☆14 Updated 7 years ago
- ☆70 Updated 3 years ago
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python ☆112 Updated last month
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- Interpretable data visualizations for understanding how texts differ at the word level ☆286 Updated 11 months ago