An open-access corpus of conversational bilingual speech in Cantonese and English
☆40Apr 28, 2022Updated 4 years ago
Alternatives and similar repositories for SpiCE-Corpus
Users that are interested in SpiCE-Corpus are comparing it to the libraries listed below.
- A database of number names for 186 languages, locales, and scripts☆67Mar 3, 2023Updated 3 years ago
- Packager for the Corpus of Mid-20th Century Hong Kong Cantonese☆16Apr 12, 2016Updated 10 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆23Mar 21, 2021Updated 5 years ago
- Namuwiki dataset segmented into sentences. Download it from Releases or via tfds-korean.☆19Jun 16, 2021Updated 5 years ago
- PyTorch speech2text inference script for the NVidia openseq2seq wav2letter model variant☆10Aug 12, 2019Updated 6 years ago
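- Early computers used 7-bit ASCII encoding. To handle Chinese characters, programmers designed GB2312 for Simplified Chinese and Big5 for Traditional Chinese. GB2312 (1980) contains 7,445 characters in total: 6,763 Chinese characters and 682 other symbols. The internal codes of the Chinese character area range from B0-F7 in the high byte and A1-FE in the low byte, occupying code…☆10Sep 10, 2017Updated 8 years ago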
- Minimal Tensorflow Docker image with SyntaxNet/DRAGNN based on Alpine linux☆32Oct 7, 2020Updated 5 years ago
- The Shmoop Corpus☆17Oct 27, 2020Updated 5 years ago
- The code for the ISMIR 2019 paper “Supervised symbolic music style translation using synthetic data”.☆28Nov 21, 2022Updated 3 years ago
- ☆88Mar 11, 2020Updated 6 years ago
- Resources for "Simple Speech Representation Learning from Perceptual Data".☆11Sep 18, 2023Updated 2 years ago
- KoParadigm: Korean Inflectional Paradigm Generator☆59Nov 23, 2022Updated 3 years ago
- Korean Abstract Meaning Representation (AMR) Corpus☆10Feb 27, 2022Updated 4 years ago
- Official implementation of SIGIR 2022 Paper "Task-Oriented Dialogue System as Natural Language Generation".☆14Apr 6, 2022Updated 4 years ago
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)☆36Jul 22, 2021Updated 4 years ago
- A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP☆94Oct 17, 2021Updated 4 years ago
- [Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation☆11May 27, 2022Updated 4 years ago
- Reference implementation of the paper "Word Embeddings for Entity-annotated Texts"☆18Apr 12, 2019Updated 7 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45May 25, 2021Updated 5 years ago
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone …☆43Aug 3, 2022Updated 3 years ago
- Google's TPGST reimplementation.☆34Dec 11, 2019Updated 6 years ago
- Wikipedia EXhaustive Entity Annotator (LREC 2020)☆16Apr 22, 2024Updated 2 years ago
- Tool for converting the Sejong parsed corpus into dependency structures☆10Sep 7, 2018Updated 7 years ago
- Prosody-semantics Interface in Seoul Korean☆12Oct 9, 2020Updated 5 years ago
- Official implementation of EMNLP 2021 Paper "Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables"☆12May 15, 2023Updated 3 years ago
- SOTA punctuation restoration (e.g. for automatic speech recognition) deep learning model based on a pre-trained BERT model☆182May 17, 2019Updated 7 years ago
- ☆11Aug 12, 2020Updated 5 years ago
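- Chinese character feature extraction tool. It extracts pronunciation (initial, final, tone), glyph (components, radicals), four-corner code, and other features from characters, and can feed them to a model as tensors☆137May 25, 2020Updated 6 years ago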
- Hanzi Converter for Traditional and Simplified Chinese☆190Mar 28, 2020Updated 6 years ago
- g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese☆245Jul 10, 2019Updated 6 years ago
- ☆13Nov 30, 2022Updated 3 years ago
- Official PyTorch implementation of the paper "Robust Training for Speaker Verification against Noisy Labels" in INTERSPEECH 2023.☆12Oct 23, 2023Updated 2 years ago
- ☆13Jul 26, 2021Updated 4 years ago
- ☆17Jun 30, 2020Updated 5 years ago
- Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.☆263Oct 11, 2019Updated 6 years ago
- A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html☆29Mar 17, 2026Updated 2 months ago
- Sequence-to-sequence TTS based on Kyubyong's dc_tts☆61Feb 2, 2023Updated 3 years ago
- Build a dialog dataset from online books in many languages☆75Oct 25, 2022Updated 3 years ago
- Python classes for the Buckeye Corpus☆26Mar 30, 2018Updated 8 years ago