puntonim / gutenberg-bulk-downloader
Bulk downloader for free ebooks hosted at Project Gutenberg
☆19 Updated 3 years ago
Alternatives and similar repositories for gutenberg-bulk-downloader:
Users who are interested in gutenberg-bulk-downloader are comparing it to the libraries listed below.
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- Basic dataset for the linguistic data collection. ☆15 Updated 8 years ago
- python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. Wi… ☆18 Updated 3 months ago
- Some convenient natural language tools that build on NLTK. ☆85 Updated 10 years ago
- Recipes for training OpenNMT systems ☆14 Updated 7 years ago
- A tool for analyzing the word histories of a text. ☆34 Updated 5 months ago
- Pipeline for distributed Natural Language Processing, made in Python ☆64 Updated 8 years ago
- Hierarchical phrase-based machine translation system ☆32 Updated 10 years ago
- A web-based, token-level annotation tool for non-standard language data ☆10 Updated 4 years ago
- Command-line corpus tools ☆9 Updated 7 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆68 Updated 2 months ago
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling… ☆15 Updated 2 years ago
- A command-line program to download text corpora. ☆34 Updated 7 years ago
- Python package for harvesting records from OAI-PMH provider(s). ☆62 Updated 2 years ago
- ☆30 Updated 8 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆115 Updated 8 years ago
- Simple CORPORA list crawler ☆10 Updated 8 years ago
- Multilingual Language Modeling Toolkit ☆11 Updated 7 years ago
- Python package for stylometry ☆63 Updated 4 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 Updated 8 years ago
- Language data store and linguistic query API ☆39 Updated last month
- A queue-controlled browser automation tool for improving web crawl quality ☆61 Updated last month
- General Architecture for Text Engineering ☆49 Updated 9 years ago
- 2016 Presidential Campaign Speeches ☆15 Updated 8 years ago
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description… ☆11 Updated 2 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆60 Updated 7 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 Updated 5 years ago
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter. ☆86 Updated last year
- WordNet-LMF formats ☆21 Updated 2 months ago
- A tool for automatic spelling normalization ☆20 Updated 4 years ago