shunk031 / TedScraper
Scraper for TED Talks in Python. Get talk title, transcript, talk topics and so on.
☆15 · Updated 7 years ago
Related projects
Alternatives and complementary repositories for TedScraper
- Finds linguistic patterns effortlessly ☆33 · Updated last year
- A PDFMiner wrapper to ease the text extraction from pdf files. ☆25 · Updated 11 years ago
- Some convenient natural language tools that build on NLTK. ☆85 · Updated 10 years ago
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated 9 months ago
- Command-line corpus tools ☆9 · Updated 7 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 · Updated 6 months ago
- clone of https://code.google.com/p/splitta/ so it can be a git submodule ☆34 · Updated 11 years ago
- Parse a text corpus and generate sentences in the same style using context-free grammar combined with a Markov chain. ☆34 · Updated 5 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 · Updated 5 years ago
- Multilingual Language Modeling Toolkit ☆11 · Updated 7 years ago
- python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. Wi… ☆18 · Updated 3 weeks ago
- A natural language date parser. (Python version of chrono.js) ☆25 · Updated 5 months ago
- A programmable relation extraction tool ☆31 · Updated last year
- Python API for KB data-services ☆18 · Updated 4 years ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Updated 2 years ago
- A simple utility to extract text from EPUB documents and, optionally, format it ☆48 · Updated 4 years ago
- A re-implementation of redpony/cdec's tokenize-anything.pl script in python ☆8 · Updated 8 years ago
- API server for NLTK ☆23 · Updated 7 years ago
- A repository for web crawling, information extraction, and knowledge graph construction. ☆33 · Updated 6 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing. ☆19 · Updated 2 years ago
- A proofreading tool using Google's N-gram corpus. ☆11 · Updated 2 years ago
- ☆31 · Updated 3 years ago
- ADS Project ☆14 · Updated 8 years ago
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?) ☆36 · Updated 10 years ago
- Easily identify and label sentence intervals using various taggers. ☆16 · Updated 7 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 · Updated last year
- Post-processing OCR errors with seq2seq models ☆28 · Updated 4 years ago
- MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations. ☆15 · Updated 5 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 · Updated 7 years ago