neulab/newlang-tech

A guide to building language technology in new languages.

☆59

Alternatives and similar repositories for newlang-tech

Users who are interested in newlang-tech are comparing it to the libraries listed below.

neulab / AfricanVoices
Hosts text-to-speech corpus and speech synthesizers for African languages.
☆19 · May 31, 2023 · Updated 3 years ago
dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 · Jul 16, 2023 · Updated 3 years ago
wavlab-speech / shinjiwlab.github.io
☆18 · Updated this week
coqui-ai / data-checker
🫠 check your data, before you wreck your model
☆16 · Aug 11, 2022 · Updated 3 years ago
mattf1n / basis-aware-threshold
Code for the paper "Closing the Curious Case of Neural Text Degeneration"
☆12 · Apr 9, 2025 · Updated last year
acl-org / aclrollingreview
ACL Rolling Review website
☆12 · Updated this week
pilarOG / unit_selection_tts
Toy example on how to build a unit selection TTS in Spanish
☆11 · May 10, 2019 · Updated 7 years ago
ReadAlongs / Studio
Audiobook alignment for Indigenous languages
☆45 · Updated this week
rhasspy / tts-prompts
Phonetically balanced text to speech sentences
☆10 · Aug 16, 2021 · Updated 4 years ago
ehsanasgari / 1000Langs
Creating super-parallel corpora of 1500+ unique languages for NLP research
☆33 · Dec 8, 2022 · Updated 3 years ago
chdh / klatt-syn-app
GUI application for the Klatt formant synthesizer package
☆13 · Jun 26, 2026 · Updated last month
eddieantonio / unicode-default-word-boundary
Split words with Unicode's default word boundary specification
☆13 · Sep 12, 2024 · Updated last year
antonisa / embeddings
Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages
☆15 · Apr 11, 2020 · Updated 6 years ago
swarnaHub / SummarizationPrograms
[ICLR 2023] PyTorch code of Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees
☆23 · Jun 19, 2023 · Updated 3 years ago
violet-zct / swarm-distillation-zero-shot
☆23 · Oct 15, 2022 · Updated 3 years ago
mhulden / pyfoma
Python Finite-State Toolkit
☆68 · Updated this week
lecs-lab / polygloss
A massively multilingual corpus and pretrained model for IGT
☆15 · Jun 4, 2026 · Updated last month
shrutirij / soft-gazetteers
Code and data for the paper "Soft Gazetteers for Low-resource Named Entity Recognition"
☆20 · Nov 3, 2020 · Updated 5 years ago
CSTR-Edinburgh / qualtreats
Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations.
☆36 · Jun 25, 2024 · Updated 2 years ago
SJTMusicTeam / SVS_system
A system for singing voice synthesis
☆79 · Jan 11, 2023 · Updated 3 years ago
parryc / interlinear
Interlinear glossing with JS & CSS
☆20 · Aug 23, 2015 · Updated 10 years ago
CoEDL / elpis
🙊 software for creating speech recognition models.
☆161 · Jun 2, 2024 · Updated 2 years ago
EveryVoiceTTS / EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
☆43 · Updated this week
AmericasNLP / americasnlp2021
☆46 · Jul 5, 2022 · Updated 4 years ago
harsh19 / Reasoning-Chains-MultihopQA
Code and Data for our EMNLP 2020 paper titled 'Learning to Explain: Datasets and Models for Identifying Valid Reasoning Chains in Multiho…
☆28 · Feb 9, 2022 · Updated 4 years ago
livingmagic / nmt-with-bert-tf2
NMT model with BERT in tensorflow 2.0
☆20 · Jul 24, 2019 · Updated 7 years ago
SJTMusicTeam / MusicGeneration
☆10 · May 15, 2021 · Updated 5 years ago
HazyResearch / anchor-stability
A study of the downstream instability of word embeddings
☆12 · Aug 23, 2022 · Updated 3 years ago
xinjli / transphone
phoneme tokenizer and grapheme-to-phoneme model for 8k languages
☆174 · Jun 9, 2023 · Updated 3 years ago
masakhane-io / masakhane-mt
Machine Translation for Africa
☆322 · Jun 14, 2022 · Updated 4 years ago
ChicagoHAI / decsum
Implementation for Decision-focused Summarization (EMNLP2021)
☆12 · Mar 14, 2022 · Updated 4 years ago
chanind / amr-logic-converter
Convert Abstract Meaning Representation (AMR) into first-order logic
☆17 · Aug 7, 2024 · Updated last year
xinjli / alqalign
multilingual speech aligner
☆78 · Nov 19, 2023 · Updated 2 years ago
rnd2110 / MorphAGram
A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars
☆17 · Jun 14, 2024 · Updated 2 years ago
dmort27 / epitran
A tool for transcribing orthographic text as IPA (International Phonetic Alphabet)
☆827 · Jun 18, 2026 · Updated last month
Helsinki-NLP / OpusFilter
OpusFilter - Parallel corpus processing toolkit
☆115 · Jul 1, 2026 · Updated 3 weeks ago
CoderPat / learning-scaffold
This is the official implementation for the paper "Learning to Scaffold: Optimizing Model Explanations for Teaching"
☆20 · May 19, 2022 · Updated 4 years ago
NRC-ILT / g2p
Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!
☆203 · Updated this week
isi-nlp / carmel
finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests
☆15 · Jan 24, 2017 · Updated 9 years ago