wa3dbk/ScribeSalad

A collection of YouTube video transcripts: podcasts (Joe Rogan Experience, Tim Ferriss, Jocko Podcast, etc.) and lectures (YaleCourses, MIT lectures, etc.). A big transcript salad spanning history, geography, science, politics, filmmaking, and more.

☆87

Alternatives and similar repositories for ScribeSalad

Users that are interested in ScribeSalad are comparing it to the libraries listed below.

mzboito / IWSLT2022_Tamasheq_data
Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW…
☆18 · Nov 30, 2022 · Updated 3 years ago
lukasheinrich / generative_models_examples
playground for generative models
☆11 · Jun 23, 2024 · Updated 2 years ago
BUTSpeechFIT / ASR-hybrid-decoding
☆17 · Nov 25, 2019 · Updated 6 years ago
merlresearch / reverberation-as-supervision
Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation
☆15 · Aug 1, 2024 · Updated last year
achendrick / jrescribe-transcripts
Full transcripts for the Joe Rogan Experience podcast utilized in a VuePress site.
☆43 · May 27, 2019 · Updated 7 years ago
boun-tabi / BounTi-Turkish-Sentiment-Analysis
Twitter Dataset and Finetuned Transformer Model for Turkish Sentiment Analysis
☆14 · Jul 29, 2022 · Updated 4 years ago
igormq / ctcdecode-pytorch
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20 · Dec 16, 2020 · Updated 5 years ago
fgnt / paderbox
Paderbox: A collection of utilities for audio / speech processing
☆43 · Jul 21, 2025 · Updated last year
osmanuygar / turkish-text-classification-api
☆10 · Jan 19, 2023 · Updated 3 years ago
arysin / nlp_uk_api
☆11 · Oct 19, 2024 · Updated last year
etzinis / fedenhance
Code for the paper: Separate but together: Unsupervised Federated Learning for Speech Enhancement from non-iid data
☆41 · Nov 1, 2021 · Updated 4 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
turnerdan / joethecorpusrogan
A corpus of speech from the Joe Rogan Experience podcast, consisting of 8.43 million words. It includes aligned TextGrids with phonetic a…
☆21 · Jan 26, 2020 · Updated 6 years ago
fakufaku / create_wsj1_2345_db
Collection of scripts to create a dataset of noisy multi-channel reverberant mixtures based on wsj1 and CHiME3 datasets.
☆15 · Dec 6, 2021 · Updated 4 years ago
MiniXC / phones
A collection of utilities for handling IPA phones.
☆27 · Sep 24, 2023 · Updated 2 years ago
joshuakalla / data_science_campaigns
Data Science and Political Campaigns Course at Yale
☆15 · Nov 6, 2025 · Updated 8 months ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 · Feb 13, 2021 · Updated 5 years ago
audiolabs / PESQ
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band) - including P.862 Corrigendum 2 (03/…
☆23 · May 27, 2025 · Updated last year
PyThaiNLP / thai-g2p-wiktionary-corpus
Thai Grapheme to Phoneme (G2P) Wiktionary Corpus
☆13 · Jul 25, 2022 · Updated 4 years ago
TheChymera / LabbookDB
Lab Book Database Framework with Input, Output, and Reporting Functions
☆14 · Jul 18, 2022 · Updated 4 years ago
psoulos / role-decomposition
☆11 · Feb 11, 2020 · Updated 6 years ago
motazsaad / ara-pronunciation-tool
A Python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
coryshain / dnnseg
☆11 · Mar 20, 2021 · Updated 5 years ago
google / df-conformer
Audio samples accompanying publications related to DF-Conformer, a speech enhancement model.
☆36 · Jun 23, 2026 · Updated last month
aispeech-lab / TinyWASE
PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…
☆11 · Jun 28, 2021 · Updated 5 years ago
acatovic / textrank
Simple and clean Python implementation of TextRank as per the seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot…
☆11 · Jan 26, 2021 · Updated 5 years ago
jrnold / intro-methods-notes
Notes for political science introductory methods sequence
☆18 · May 23, 2018 · Updated 8 years ago
dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 · Jul 16, 2023 · Updated 3 years ago
Kaljurand / net-speech-api
Java API for the online speech recognition services provided by phon.ioc.ee
☆18 · Jun 4, 2021 · Updated 5 years ago
giellalt / lang-crk
Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language
☆16 · Updated this week
daanzu / wenet_stt_python
☆33 · Nov 27, 2021 · Updated 4 years ago
MTG / Podcastmix
PodcastMix: A dataset for separating music and speech in podcasts.
☆44 · Aug 20, 2024 · Updated last year
jcheng5 / user2015
Sample materials for useR2015
☆11 · Jul 2, 2015 · Updated 11 years ago
soumimaiti / speechlmscore_tool
☆34 · Nov 24, 2024 · Updated last year
xinjli / phonepiece
phone inventory library
☆17 · May 15, 2023 · Updated 3 years ago
google-research / last
A JAX library for building lattice-based speech transducer models
☆48 · Jul 2, 2026 · Updated 3 weeks ago
rsprouse / xray_microbeam_database
Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994)
☆14 · Oct 8, 2020 · Updated 5 years ago
ccoreilly / deepspeech-catala
DeepSpeech ASR Model for the Catalan Language
☆17 · Feb 15, 2021 · Updated 5 years ago
b-sigpro / spectral-feature-compression
The source code for Input-Adaptive Spectral Feature Compression by Sequence Modeling for Source Separation published in IEEE TASLPRO.
☆18 · Jun 3, 2026 · Updated last month