puntonim / gutenberg-bulk-downloader
Bulk downloader for free ebooks hosted at Project Gutenberg
☆17 Updated 2 years ago
Related projects
Alternatives and complementary repositories for gutenberg-bulk-downloader
- python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. Wi… ☆18 Updated 2 weeks ago
- Recipes for training OpenNMT systems ☆14 Updated 7 years ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 Updated 6 years ago
- Basic dataset for the linguistic data collection. ☆15 Updated 7 years ago
- ☆31 Updated 3 years ago
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- Polyglot is a language identifier for detecting text documents containing text written in more than one language, and for identifying the… ☆32 Updated 8 years ago
- Program used to split text into segments ☆25 Updated 2 weeks ago
- Multilingual toolkit for NLP: dependency parser, PoS tagger, NERC, multiword extractor, sentiment analysis, etc. ☆64 Updated 8 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆60 Updated 5 months ago
- Finds linguistic patterns effortlessly ☆33 Updated last year
- Python bindings to the Dutch NLP tool Frog (PoS tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆47 Updated 2 weeks ago
- Simple CORPORA list crawler ☆10 Updated 7 years ago
- eXternally configurable REference and Non Named Entity Recognizer ☆17 Updated 4 months ago
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆68 Updated last year
- This is a REST Server endpoint built using Flask and Python. ☆24 Updated last year
- A web-based, token-level annotation tool for non-standard language data ☆10 Updated 4 years ago
- A Named-Entity Recogniser based on Grobid. ☆49 Updated last month
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more. ☆97 Updated 4 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo… ☆46 Updated last year
- Generic Environment for Context-Aware Correction of Orthography ☆22 Updated 2 years ago
- Multi Tier Annotation Search ☆26 Updated 3 years ago
- A tool for analyzing the word histories of a text. ☆34 Updated 3 months ago
- Interactive visualization of Wiktionary words and etymologies. ☆90 Updated last week
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆65 Updated last month
- This repository contains code behind the visualization of the Wikimedia tool etytree at http://tools.wmflabs.org/etytree/ ☆50 Updated 5 years ago
- Linguistic search for large annotated text corpora, based on Apache Lucene ☆106 Updated this week
- API for WOLF, a free French WordNet ☆13 Updated 6 years ago
- A workflow system for Natural Language Processing. ☆21 Updated 5 years ago
- ☆27 Updated 7 years ago