dohliam / more-stoplists
stoplists for African languages generated from the ASP corpus
☆14 Updated 9 years ago
Alternatives and similar repositories for more-stoplists
Users who are interested in more-stoplists are comparing it to the repositories listed below.
- SerendipSlim is a visualization tool for exploring topic models built on large collections of text documents. ☆39 Updated 7 years ago
- List of (possible) English hedge words ☆49 Updated 3 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆116 Updated 9 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 Updated 8 years ago
- Formula to find the grade level according to the (revised) Dale–Chall Readability Formula (1995) ☆31 Updated 3 years ago
- An offline/online field database which adapts to its user's terminology and I-Language. http://fielddb.github.io ☆80 Updated this week
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆47 Updated this week
- An implementation of latent Dirichlet allocation in JavaScript ☆185 Updated 3 years ago
- Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf sc… ☆40 Updated 8 years ago
- Wikidata embedding ☆51 Updated last year
- System for building, visualizing, and working with LDA topic models ☆97 Updated this week
- command-line tool to extract taxonomies from Wikidata ☆129 Updated 6 years ago
- Wiktionary parser tool for many language editions. ☆54 Updated 3 years ago
- linguistics backend ☆42 Updated 2 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 Updated 10 months ago
- bigram / trigram analysis of Wikipedia; mainly mutual info ☆22 Updated 13 years ago
- Python package for stylometry ☆64 Updated 4 years ago
- A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text. ☆98 Updated 3 years ago
- The Art of Literary Text Analysis ☆168 Updated 6 years ago
- a Python package for cleaning Gutenberg books and dataset ☆34 Updated 7 months ago
- Topic Modeling Workflow in Python ☆16 Updated 2 years ago
- Web hub based on Wikidata ☆38 Updated last week
- This repository contains a tool and a collections dataset for detecting off-topic pages from Web archived collections. ☆18 Updated 10 years ago
- An online annotation platform for teaching and learning in the humanities. ☆108 Updated 3 weeks ago
- List of easy American-English words: The New Dale-Chall (1995) ☆32 Updated 3 years ago
- A visual timeline authoring tool that extracts temporal information from freeform text ☆65 Updated 2 years ago
- A simple interface to the Project Gutenberg corpus. ☆17 Updated 9 years ago
- Manifests of the public domain images uploaded to Flickr Commons, with descriptive information about the books they were taken from. ☆75 Updated 11 years ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 Updated 8 years ago
- Tools for tracking stories on news homepages ☆48 Updated 6 years ago