The-Gupta / TED-Scraper
Complete Web Scraping of TED.com for Metadata, Transcript, Audio, Video, Images using Parallel Programming
☆11Updated 5 years ago
Alternatives and similar repositories for TED-Scraper
Users who are interested in TED-Scraper are comparing it to the repositories listed below
- The RadioTalk dataset of talk radio transcripts☆61Updated 4 years ago
- The first, open access evaluation dataset for methods to identify bias by word choice and labeling☆26Updated 3 months ago
- Code for extracting parallel corpora from pmindia☆17Updated 6 years ago
- Language identification and normalisation in code switching data tailored with a three-step decoding process☆24Updated 6 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs…☆32Updated 4 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of …☆62Updated 5 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆34Updated 2 years ago
- Generating English Rock lyrics using BERT☆19Updated 6 years ago
- COVID-19 Question Dataset from the paper "What Are People Asking About COVID-19? A Question Classification Dataset"☆24Updated 5 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 3 years ago
- official repo for AAAI ALOHA chatbot☆29Updated 2 years ago
- Using BERT for doing the task of Conditional Natural Language Generation by fine-tuning pre-trained BERT on custom dataset.☆41Updated 5 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…☆15Updated 2 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Updated 2 years ago
- Testing and training detection models for emoji-based hate speech.☆24Updated 3 years ago
- On Generating Extended Summaries of Long Documents☆78Updated 5 years ago
- A Cross-Domain Transferable Neural Coherence Model https://arxiv.org/abs/1905.11912☆24Updated 5 years ago
- Pun-GAN: Generative Adversarial Network for Pun Generation (EMNLP 2019)☆42Updated 6 years ago
- Code and datasets for the paper "Humor Detection: A Transformer Gets the Last Laugh"☆82Updated 2 years ago
- Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages☆11Updated 3 years ago
- Running Mozilla's implementation of Baidu DeepSpeech on Google Colaboratory☆16Updated 6 years ago
- A corpus of comments tagged for multiple attributes of unhealthiness.☆36Updated 4 years ago
- An attempt to use a Generative Adversarial Network (GAN) for natural language generation.☆16Updated 7 years ago
- A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+☆37Updated 4 years ago
- ☆76Updated 4 years ago
- Can fear be used for polarisation and spreading negativity? Our paper accepted in The Web conference 2021 tries to explore this question …☆26Updated 2 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆23Updated 4 years ago
- ColBERT humor dataset for the task of humor detection, containing 200,000 jokes/news☆75Updated last year
- Automatic Speech Recognition Dataset Generation☆37Updated 7 years ago
- OCTRA is a web-application for the orthographic transcription of audio files.☆39Updated last week