kevinxiong / epub2txt
Convert EPUB files to TXT
☆85 Updated 4 years ago
Alternatives and similar repositories for epub2txt:
Users who are interested in epub2txt are comparing it to the libraries listed below.
- Python module that identifies Chinese text as being Simplified or Traditional ☆91 Updated 4 months ago
- List of English synonyms and antonyms parsed from the public-domain book by James C. Fernald, 1896 ☆43 Updated 6 years ago
- Python-based software to unpack kindlegen-generated ebooks ☆62 Updated 2 years ago
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆52 Updated last year
- Multi-class text categorization using state-of-the-art pre-trained contextualized language models, e.g. BERT ☆21 Updated 2 years ago
- The New York Times English-Chinese parallel corpus ☆16 Updated 3 years ago
- Multilingual sentence alignment using sentence embeddings ☆113 Updated 4 months ago
- Tokenizer, POS-tagger, and dependency parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages ☆50 Updated 2 months ago
- Offline bilingual dictionaries made using data from Wiktionary ☆53 Updated 9 years ago
- Convert epub file to txt ☆31 Updated last year
- A simple command-line utility for Linux that extracts text from EPUB documents. ☆214 Updated 3 weeks ago
- Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆241 Updated 2 years ago
- A natural language date parser. (Python version of chrono.js) ☆25 Updated 10 months ago
- Download YouTube subtitles (closed captions, CC) as txt or json; supports translation and proxy. Available on pip 🐍. Try it online at goo… ☆70 Updated last year
- Converts between traditional and simplified Chinese ☆30 Updated 6 months ago
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- A Python truecasing utility that restores case information for texts ☆88 Updated 2 years ago
- arXiv plain text extraction ☆41 Updated 2 years ago
- Corpus of classical Chinese poetry ☆26 Updated 8 years ago
- máobĭ (毛笔) is an Anki add-on to create cards with writing quizzes for Hanzi (Chinese characters) ☆53 Updated 4 months ago
- Boilerplate Removal using Deep Learning ☆82 Updated 3 years ago
- Extract dates from text ☆64 Updated 4 years ago
- An Android dictionary application with support for mdx format. ☆10 Updated 2 years ago
- Python library for CJK (Chinese, Japanese, and Korean) language dictionary ☆89 Updated last week
- Export UNIHAN's database to CSV, JSON, or YAML ☆55 Updated last week
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆61 Updated 4 years ago
- Scorer for grammatical error correction systems. ☆14 Updated 9 years ago
- Bilingual dictionary extractor from parallel corpora ☆22 Updated 10 years ago
- Chinese character to Wubi (五笔) conversion tool ☆33 Updated 6 years ago
- A mini Anki web server based on Flask, works with anki-sync-server. ☆36 Updated last year