shunk031 / TedScraper
Scraper for TED Talks in Python. Get talk title, transcript, talk topics and so on.
☆15 Updated 7 years ago
Alternatives and similar repositories for TedScraper
Users who are interested in TedScraper are comparing it to the libraries listed below.
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 Updated last year
- Automatically exported from code.google.com/p/guess-language ☆53 Updated last year
- Python binding to libpoppler with focus on text extraction ☆97 Updated 3 years ago
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more. ☆98 Updated 4 years ago
- All TED talks narratives extracted and cleaned. ☆100 Updated 7 years ago
- Command-line corpus tools ☆9 Updated 8 years ago
- A command line interface to run scripts on Anki ☆21 Updated 4 years ago
- ☆34 Updated 11 years ago
- Random fun with statistical language models. ☆64 Updated 5 years ago
- Simple Flask webservice to search through your PDF collection using Whoosh ☆11 Updated 11 years ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆31 Updated last year
- Offline bilingual dictionaries made using data from Wiktionary ☆56 Updated 10 years ago
- Python interface to IMDb plain-text data files ☆41 Updated 7 years ago
- A Python library for extracting semantic information from text, such as dates and numbers. ☆76 Updated 3 years ago
- A PDF collection reader with built-in full-text search engine ☆19 Updated 8 years ago
- Hierarchical phrase-based machine translation system ☆32 Updated 10 years ago
- Home is where the dotfiles are. ☆11 Updated 2 months ago
- Scrapes some Finnish word definitions from English Wiktionary. ☆8 Updated last year
- ⌨️🌼 Syntax highlighting for Pollen (a Racket language) ☆17 Updated 5 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 4 years ago
- 📇 Import pdf slides + text notes into Anki. ☆25 Updated 4 years ago
- Extract a plain text corpus from MediaWiki XML dumps, such as Wikipedia. ☆133 Updated 6 years ago
- ☆26 Updated 6 years ago
- PostgreSQL docset for Dash (http://kapeli.com/dash/) ☆39 Updated 13 years ago
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?) ☆36 Updated 10 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com… ☆130 Updated 4 months ago
- Wrapper for pdftohtml that tries to extract paragraph structure ☆50 Updated 6 years ago
- A PDFMiner wrapper to ease the text extraction from pdf files. ☆25 Updated 12 years ago
- Construct your personal API ☆18 Updated 2 years ago
- ThoughtTreasure commonsense knowledge base and architecture for natural language processing ☆79 Updated 9 years ago