The-Gupta / TED-Scraper
Complete Web Scraping of TED.com for Metadata, Transcript, Audio, Video, Images using Parallel Programming
☆11 Updated 5 years ago
Alternatives and similar repositories for TED-Scraper
Users who are interested in TED-Scraper are comparing it to the repositories listed below
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated 2 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of … ☆61 Updated 5 years ago
- Code for extracting parallel corpora from pmindia ☆16 Updated 5 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆32 Updated 4 years ago
- OCTRA is a web-application for the orthographic transcription of audio files. ☆39 Updated this week
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆13 Updated 2 years ago
- Experiments with Hugging Face 🔬 🤗 ☆44 Updated last year
- AYLIEN's officially supported Python client library for accessing News API ☆18 Updated 3 years ago
- The first, open access evaluation dataset for methods to identify bias by word choice and labeling ☆25 Updated 2 years ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/… ☆60 Updated 4 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆24 Updated 4 years ago
- Language identification and normalisation in code switching data tailored with a three-step decoding process ☆24 Updated 5 years ago
- The RadioTalk dataset of talk radio transcripts ☆60 Updated 4 years ago
- Post-processing OCR errors with seq2seq models ☆28 Updated 5 years ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆45 Updated 5 years ago
- This code provides word level language identification tool for identifying language for individual words in Code-Mixed text. e.g. The tex… ☆55 Updated 5 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆97 Updated 2 years ago
- ☆10 Updated 7 years ago
- An ongoing series of notebooks aimed at helping fellow NLP enthusiasts think about applying new tools and techniques to practical tasks. ☆18 Updated 4 years ago
- ☆13 Updated last year
- Text and Punctuation correction with Deep Learning ☆128 Updated 5 years ago
- 📄 Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆21 Updated 2 years ago
- A curated list of Natural Language Generation papers, tutorials, and blogs. ☆12 Updated 6 years ago
- ☆76 Updated 3 years ago
- Text to Speech for Indic languages ☆51 Updated 3 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated last year
- Generating English Rock lyrics using BERT ☆19 Updated 6 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 3 years ago
- Can fear be used for polarisation and spreading negativity? Our paper accepted in The Web conference 2021 tries to explore this question … ☆26 Updated 2 years ago
- A crash course for training speech recognition models using DeepSpeech. ☆25 Updated 4 years ago