The-Gupta / TED-Scraper
Complete Web Scraping of TED.com for Metadata, Transcript, Audio, Video, Images using Parallel Programming
☆11 Updated 5 years ago
Alternatives and similar repositories for TED-Scraper
Users who are interested in TED-Scraper are comparing it to the libraries listed below
- Post-processing OCR errors with seq2seq models ☆28 Updated 5 years ago
- Code for extracting parallel corpora from pmindia ☆16 Updated 5 years ago
- Codebase for HYPHEN, accepted at ACL 2022 (main) ☆11 Updated 3 years ago
- (ACL 2022) The source code for the paper "Towards Abstractive Grounded Summarization of Podcast Transcripts" ☆17 Updated 2 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated 3 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆62 Updated 5 years ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/… ☆60 Updated 4 years ago
- Text and Punctuation correction with Deep Learning ☆128 Updated 5 years ago
- Language identification and normalisation in code-switching data tailored with a three-step decoding process ☆24 Updated 5 years ago
- Collaborative on-line editor for aligned parallel texts. ☆13 Updated 2 weeks ago
- ☆44 Updated 4 years ago
- This code provides a word-level language identification tool for identifying the language of individual words in Code-Mixed text. e.g. The tex… ☆56 Updated 5 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆15 Updated 2 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 Updated 3 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 Updated last year
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Gamma Agreement in Python ☆45 Updated last year
- A simple neural truecaser written in PyTorch and AllenNLP. ☆33 Updated last year
- Generating English Rock lyrics using BERT ☆19 Updated 6 years ago
- Interface for using TTS and vocoder models in the form of a text editor ☆19 Updated 2 weeks ago
- A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+ ☆37 Updated 4 years ago
- Paraphrase Generation model using pair-wise discriminator loss ☆46 Updated 4 years ago
- On Generating Extended Summaries of Long Documents ☆78 Updated 4 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 Updated 2 years ago
- ☆10 Updated 7 years ago
- Several algorithms for converting dependency structures into constituency structures. ☆10 Updated 3 years ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 5 years ago
- Official repo for AAAI ALOHA chatbot ☆29 Updated last year
- Can fear be used for polarisation and spreading negativity? Our paper, accepted at The Web Conference 2021, tries to explore this question … ☆26 Updated 2 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆32 Updated 4 years ago