TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions
☆12 May 18, 2017 Updated 8 years ago
Alternatives and similar repositories for tweetcat
Users who are interested in tweetcat are comparing it to the repositories listed below
- Basic dataset for the linguistic data collection. ☆15 Feb 13, 2017 Updated 9 years ago
- generate rules from lists of words ☆16 Jul 9, 2021 Updated 4 years ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆18 Jan 13, 2026 Updated last month
- A tool for text normalisation via character-level machine translation ☆13 Jun 12, 2020 Updated 5 years ago
- Part of eMOP: the Recursive Text Alignment Tool compares OCR text results to groundtruth by character and computes a score. ☆22 Sep 24, 2015 Updated 10 years ago
- This repository ☆30 Nov 13, 2022 Updated 3 years ago
- What happens when you connect all the ZIP/postal codes in a country in ascending order? ☆13 Updated this week
- Gazetteer of the Ancient Near East Data ☆10 Aug 1, 2013 Updated 12 years ago
- A PHP library for comparing two or more Sanskrit TEI XML files and generating an apparatus with variants ☆14 Feb 16, 2026 Updated 3 weeks ago
- Comparable documents miner: Arabic-English morphological analysis, text processing, n-gram features extraction, POS tagging, dictionary t… ☆35 Apr 24, 2017 Updated 8 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Oct 14, 2022 Updated 3 years ago
- TEI-encoded contents of the Egyptian Gazette ☆15 Jun 11, 2024 Updated last year
- A repository for the SRN documents database API ☆14 Feb 24, 2025 Updated last year
- Statistical discontinuous constituent parsing ☆11 Feb 15, 2018 Updated 8 years ago
- Public Comment Analysis Project for the Federal Chief Data Officer Council. The Comment Analysis pilot has shown that a toolset leveragin… ☆13 Sep 17, 2021 Updated 4 years ago
- Arabic Word-Embedding (Word2vec) model training from Wikipedia articles ☆11 Dec 13, 2018 Updated 7 years ago
- Simple CORPORA list crawler ☆10 Dec 2, 2016 Updated 9 years ago
- Oracc GUI ☆12 Jun 27, 2025 Updated 8 months ago
- Project to digitize avant-garde periodicals ☆12 May 13, 2022 Updated 3 years ago
- A Simple Sudoku Solver ☆23 Nov 26, 2012 Updated 13 years ago
- simple geocoder comparison tool ☆13 Jun 19, 2014 Updated 11 years ago
- Library for HTTP request signing (Ruby implementation) ☆12 Jul 23, 2025 Updated 7 months ago
- Web-based page layout editor created for EMOP (Early Modern OCR Project). ☆11 May 21, 2021 Updated 4 years ago
- ☆11 Oct 13, 2019 Updated 6 years ago
- PyAnnotation is a Python Library to access and manipulate linguistically annotated corpus files. ☆17 Sep 4, 2012 Updated 13 years ago
- Tool for interacting with the reMarkable lines format and API ☆13 Jul 1, 2021 Updated 4 years ago
- A Go package to download Tiktok videos and profile picture of tiktokers. ☆10 Mar 17, 2022 Updated 3 years ago
- natural language processing in the browser - i18n ☆10 Aug 6, 2015 Updated 10 years ago
- Automatically constructed lexical database for Bangla inspired from Wordnet ☆11 Jul 12, 2012 Updated 13 years ago
- An implementation of Horton hash tables ☆10 Aug 24, 2016 Updated 9 years ago
- Implementation in pure Ruby for Microsoft RDP protocol ☆10 Sep 4, 2016 Updated 9 years ago
- ☆10 Oct 2, 2017 Updated 8 years ago
- ☆10 Sep 4, 2015 Updated 10 years ago
- Automatic text comparison with an extendable variance classifier ☆13 Sep 11, 2023 Updated 2 years ago
- Building the epistemic web ☆13 Jul 16, 2024 Updated last year
- Working with SAML assertions from Python ☆14 Sep 1, 2014 Updated 11 years ago
- JavaScript Sequence Alignment Viewer ☆11 Mar 25, 2022 Updated 3 years ago
- An implementation of the MinHash algorithm in ruby using Murmur Hash ☆26 May 8, 2009 Updated 16 years ago
- Drizzle ORM adapter for AdminJS ☆13 Aug 12, 2025 Updated 6 months ago