shunk031 / TedScraper
Scraper for TED Talks in Python. Get talk title, transcript, talk topics and so on.
☆15 Updated 7 years ago
Alternatives and similar repositories for TedScraper:
Users who are interested in TedScraper are comparing it to the libraries listed below:
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 Updated 8 months ago
- Scrapes some Finnish word definitions from English Wiktionary. ☆7 Updated last year
- bilingual dictionary extractor from parallel corpora ☆22 Updated 10 years ago
- clone of https://code.google.com/p/splitta/ so it can be a git submodule ☆34 Updated 11 years ago
- Stylometric framework in Python ☆13 Updated 9 years ago
- A pair of scripts to download videos and subtitles for the TED Talks (http://www.ted.com) ☆42 Updated 10 years ago
- A workflow system for Natural Language Processing. ☆21 Updated 5 years ago
- Text conversion tool (from e.g. Word, HTML, txt) to corpus formats (TEI or FoLiA) ☆23 Updated 2 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated 10 months ago
- Recipes for training OpenNMT systems ☆14 Updated 7 years ago
- A selection of test lines of several early printed books as well as the corresponding individual OCRopus models and mixed models.