shunk031 / TedScraper
Scraper for TED Talks in Python. Get talk title, transcript, talk topics and so on.
☆15Updated 7 years ago
Alternatives and similar repositories for TedScraper:
Users who are interested in TedScraper are comparing it to the repositories listed below.
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆63Updated 10 months ago
- Finds linguistic patterns effortlessly☆36Updated last year
- ☆14Updated 2 years ago
- Command-line corpus tools☆9Updated 7 years ago
- Parser for KAF NAF files written in Python☆16Updated 3 years ago
- Multilingual Language Modeling Toolkit☆11Updated 7 years ago
- ☆40Updated 7 years ago
- Tools to evaluate accuracies of various (research papers') metadata extraction libraries☆11Updated 9 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation.☆21Updated 8 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture☆38Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆51Updated 3 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g…☆112Updated 2 months ago
- Pipeline for distributed Natural Language Processing, made in Python☆65Updated 8 years ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format☆22Updated 6 years ago
- U.S. Code Complexity☆23Updated 11 years ago
- Regex like pattern tree matching but on sentence's tree instead of Strings☆42Updated 7 years ago
- The project lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques☆29Updated 4 years ago
- Hierarchical phrase-based machine translation system☆32Updated 10 years ago
- bigram / trigram analysis of wikipedia; mainly mutual info☆22Updated 13 years ago
- Data from https://aclweb.org/anthology/☆16Updated 4 years ago
- A PDF collection reader with built-in full-text search engine☆19Updated 7 years ago
- A pipeline for detecting novel information about entities from a stream of text, updating a knowledge base about the entities, and genera…☆32Updated 5 years ago
- The zhong [|] Chinese grammars☆14Updated 3 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases☆45Updated 5 years ago
- Temporal Expression Recognition and Normalisation in Python☆78Updated 9 years ago
- Crawling and analyzing data on Wikipedia☆16Updated last year
- Simple spaCy-based concept extraction API, involving a dictionary of relevant concepts.☆10Updated 5 years ago
- OKR: A Consolidated Open Knowledge Representation for Multiple Texts☆41Updated 7 years ago
- WordNet to neo4j 2.2☆12Updated 9 years ago
- Extract Data from Wikipedia Lists☆31Updated 7 years ago