Toluwase / Word-Level-Language-Identification-for-Resource-Scarce-
English, Hausa, Igbo and Yoruba corpora and results (presented in Excel files) of word-level language identification research using character trigrams of the featured languages
☆15 Updated 6 years ago
Alternatives and similar repositories for Word-Level-Language-Identification-for-Resource-Scarce-:
Users who are interested in Word-Level-Language-Identification-for-Resource-Scarce- are comparing it to the repositories listed below
- Yorùbá language training text for NLP, ASR and TTS tasks ☆73 Updated last year
- Ìrànlọ́wọ́ is a utility library for analysis & (pre)processing of Yorùbá text → https://pypi.org/project/iranlowo ☆17 Updated 2 years ago
- Automatic Diacritic Restoration of Yorùbá language Text ☆24 Updated 6 months ago
- Unsupervised Neural Machine Translation from West African Pidgin (Creole) to English without a single parallel sentence ☆75 Updated 4 years ago
- All our community docs! Start here! Let's put Africa on the NLP Map ☆55 Updated 9 months ago
- Machine Translation for Africa ☆279 Updated 2 years ago
- Python library for converting numbers to words for all Indian Languages. ☆35 Updated 3 weeks ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆99 Updated 9 months ago
- Automatically constructing a corpus for automatic speech recognition from YouTube videos ☆153 Updated 4 years ago
- Server framework for Kaldi ASR Toolkit ☆98 Updated last year
- A Python package that implements a tokenizer and stemmer for the Hindi language ☆91 Updated 4 years ago
- Complementary code for our paper Automatic punctuation restoration with BERT models ☆48 Updated last year
- This repo contains 3 hours of audio speech recordings in the Yoruba language, collected for research purposes. ☆14 Updated 4 years ago
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase… ☆192 Updated 4 years ago
- Program to benchmark various speech recognition APIs ☆80 Updated 5 years ago
- A Python based API to access Indian language WordNets. ☆37 Updated 2 years ago
- A curated list of research papers and resources on code-switching ☆302 Updated last month
- Arabic Dialect Identification on AOC data. ☆23 Updated 5 years ago
- A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. ☆14 Updated 5 years ago
- Punctuation Restoration using Transformer Models for High- and Low-Resource Languages ☆205 Updated 6 months ago
- A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech. ☆241 Updated 2 years ago
- Xlit-Crowd: Hindi-English Transliteration Corpus ☆37 Updated 9 years ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2 ☆82 Updated 10 months ago
- This is where I put all my work in Natural Language Processing ☆96 Updated 3 years ago
- Punctuation restoration and spell correction experiments. ☆250 Updated 3 years ago
- SOTA punctuation restoration (e.g. for automatic speech recognition) deep learning model based on a pre-trained BERT model ☆180 Updated 5 years ago
- Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2 ☆113 Updated 5 years ago
- The repository contains all the code necessary for my project: Automatic Speech Recognition System in Hindi Language (Project descript… ☆28 Updated 5 years ago