fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)
☆43Dec 6, 2022Updated 3 years ago
Alternatives and similar repositories for fastlangid
Users that are interested in fastlangid are comparing it to the libraries listed below.
- ☆24Apr 19, 2026Updated last month
- ☆14Sep 10, 2021Updated 4 years ago
- Experiments with Hugging Face 🔬 🤗☆47Apr 18, 2026Updated last month
- A simple and humble image captioning application, based on a neural network built with Keras☆10Sep 23, 2022Updated 3 years ago
- Transferability of cross-lingual and cross-age speech emotion recognition☆21Jun 30, 2023Updated 2 years ago
- ☆14Dec 3, 2019Updated 6 years ago
- Targeted language identifier, based on FastText and Hunspell.☆38Sep 4, 2025Updated 9 months ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB)☆20Nov 23, 2024Updated last year
- Building and Using A Seed Corpus for the Human Language Project☆11Feb 9, 2018Updated 8 years ago
- ☆22Sep 26, 2022Updated 3 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼☆22Jan 29, 2026Updated 4 months ago
- Feature Decay Algorithms☆11Mar 5, 2014Updated 12 years ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs☆36Jun 5, 2026Updated last week
- Repo for the Wasabi datasets☆120Apr 10, 2025Updated last year
- ☆10Feb 2, 2021Updated 5 years ago
- Professor forcing future code☆10Sep 22, 2018Updated 7 years ago
- A simple neural truecaser written in pytorch and allennlp.☆35Jun 17, 2024Updated last year
- Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project☆16Oct 28, 2022Updated 3 years ago
- The accompanying code and data for the Springer 2017 publication "What's missing in geographical parsing?" in Language Resources and Eval…☆18Oct 17, 2019Updated 6 years ago
- ☆28May 15, 2024Updated 2 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice(zh-HK)☆16May 8, 2022Updated 4 years ago
- Multilingual and Multiculture Benchmark and LLM☆40May 18, 2026Updated 3 weeks ago
- Serve a 1x1 GIF pixel from an AWS lambda-powered endpoint☆13Sep 7, 2017Updated 8 years ago
- Ice is a rapid information extraction customizer☆15Apr 26, 2021Updated 5 years ago
- A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi…☆15Oct 13, 2022Updated 3 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper☆14Aug 9, 2021Updated 4 years ago
- Zero-shot Cross-lingual Task-Oriented Dialogue Systems (EMNLP 2019)☆24Nov 9, 2019Updated 6 years ago
- Search Engine Guided Non-Parametric Neural Machine Translation☆14Oct 23, 2017Updated 8 years ago
- ☆33Nov 7, 2019Updated 6 years ago
- Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022☆13Apr 13, 2022Updated 4 years ago
- On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021)☆13Nov 21, 2021Updated 4 years ago
- Implementation of "Modeling Past and Future for Neural Machine Translation"☆15Mar 16, 2018Updated 8 years ago
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he…☆28Aug 8, 2025Updated 10 months ago
- This is a repository dedicated for pre-trained acoustic models of Hong Kong Cantonese and Cantonese forced alignment.☆27Nov 14, 2024Updated last year
- BERT Tokenizer with vocabulary tailored for Cantonese☆23Oct 27, 2022Updated 3 years ago
- name2nat: a Python package for nationality prediction from a name☆118Oct 14, 2020Updated 5 years ago
- ☆14Nov 16, 2022Updated 3 years ago
- ☆20Jun 17, 2024Updated last year
- Official Implementation for the ICLR2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"☆14Mar 1, 2023Updated 3 years ago