shunk031 / TedScraper
Scraper for TED Talks in Python. Get talk title, transcript, talk topics and so on.
☆15 · Updated 7 years ago
Related projects:
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆60 · Updated 4 months ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆37 · Updated 6 months ago
- Stylometric framework in Python ☆13 · Updated 9 years ago
- clone of https://code.google.com/p/splitta/ so it can be a git submodule ☆34 · Updated 11 years ago
- Command-line corpus tools ☆9 · Updated 7 years ago
- ☆21 · Updated this week
- Simple natural language parsing and semantic grounding ☆10 · Updated 3 years ago
- Finds linguistic patterns effortlessly ☆31 · Updated last year
- Python API for KB data-services ☆18 · Updated 4 years ago
- ☆21 · Updated 7 years ago
- stav text annotation visualiser ☆34 · Updated 12 years ago
- ☆40 · Updated 6 years ago
- Wikidata embedding ☆50 · Updated last month
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆29 · Updated 11 months ago
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated 7 months ago
- Scrapes some Finnish word definitions from English Wiktionary. ☆7 · Updated last year
- Easily identify and label sentence intervals using various taggers. ☆16 · Updated 7 years ago
- Code accompanying our paper "One Knowledge Graph to Rule them All? Analyzing the Differences between DBpedia, YAGO, Wikidata & co." ☆26 · Updated 7 years ago
- The Non-Official Characterization (NOC) List is a knowledge-base containing semantic triples about famous people, living and dead, fictio… ☆23 · Updated 5 years ago
- A PDFMiner wrapper to ease the text extraction from pdf files. ☆25 · Updated 11 years ago
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?) ☆36 · Updated 10 years ago
- A PDF collection reader with built-in full-text search engine ☆19 · Updated 7 years ago
- PDF Extraction Toolkit ☆41 · Updated 3 years ago
- Recipes for training OpenNMT systems ☆14 · Updated 7 years ago
- This repository contains the Framester resource, the main outcome of the framester project. ☆34 · Updated 4 years ago
- Multilingual Language Modeling Toolkit ☆11 · Updated 7 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆110 · Updated 2 months ago
- U.S. Code Complexity ☆22 · Updated 11 years ago
- Parse a text corpus and generate sentences in the same style using context-free grammar combined with a Markov chain. ☆34 · Updated 5 years ago
- Browser-based annotation tool for Framenet ☆16 · Updated 9 years ago