fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans) and Traditional Chinese (zh-hant)
☆43 · Dec 6, 2022 · Updated 3 years ago
Alternatives and similar repositories for fastlangid
Users that are interested in fastlangid are comparing it to the libraries listed below.
- ☆23 · Feb 3, 2026 · Updated 2 months ago
- Experiments with Hugging Face 🔬 🤗 · ☆46 · Mar 18, 2026 · Updated 3 weeks ago
- A simple and humble image captioning application, based on a neural network built with Keras · ☆10 · Sep 23, 2022 · Updated 3 years ago
- ☆14 · Dec 3, 2019 · Updated 6 years ago
- Supplemental material for the paper "Towards Automatically Correcting Tapped Beat Annotations for Music Recordings" · ☆20 · May 6, 2021 · Updated 4 years ago
- Targeted language identifier, based on FastText and Hunspell. · ☆38 · Sep 4, 2025 · Updated 7 months ago
- Building and Using A Seed Corpus for the Human Language Project · ☆11 · Feb 9, 2018 · Updated 8 years ago
- ☆22 · Sep 26, 2022 · Updated 3 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼 · ☆22 · Jan 29, 2026 · Updated 2 months ago
- Memcached module for Nest framework (Node.js) · ☆19 · Apr 3, 2026 · Updated last week
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs · ☆36 · Mar 23, 2026 · Updated 3 weeks ago
- Repo for the Wasabi datasets · ☆117 · Apr 10, 2025 · Updated last year
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model" · ☆14 · Jul 8, 2025 · Updated 9 months ago
- ☆10 · Feb 2, 2021 · Updated 5 years ago
- Official repository for the paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main) · ☆23 · Jul 9, 2024 · Updated last year
- Python scripts and datasets of the "Extremely Low-Resource Neural Machine Translation: A Case Study of Cantonese" project · ☆16 · Oct 28, 2022 · Updated 3 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. · ☆33 · Jun 17, 2024 · Updated last year
- Extract files from Kirikiri Z engine. · ☆23 · Oct 4, 2025 · Updated 6 months ago
- Cross Sentence Neural Machine Translation · ☆11 · Mar 26, 2018 · Updated 8 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK) · ☆16 · May 8, 2022 · Updated 3 years ago
- NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented an… · ☆28 · Sep 27, 2024 · Updated last year
- The Cantonese Wordnet · ☆14 · Dec 4, 2023 · Updated 2 years ago
- ☆11 · Jun 23, 2022 · Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German · ☆13 · May 2, 2021 · Updated 4 years ago
- Implementations of Amazon SageMaker-compatible custom containers for training. · ☆25 · Jan 3, 2021 · Updated 5 years ago
- Serve a 1x1 GIF pixel from an AWS lambda-powered endpoint · ☆13 · Sep 7, 2017 · Updated 8 years ago
- ☆12 · Nov 5, 2024 · Updated last year
- Ice is a rapid information extraction customizer · ☆15 · Apr 26, 2021 · Updated 4 years ago
- A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi… · ☆15 · Oct 13, 2022 · Updated 3 years ago
- The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts. · ☆51 · Jul 12, 2019 · Updated 6 years ago
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer · ☆40 · Jul 14, 2020 · Updated 5 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper · ☆14 · Aug 9, 2021 · Updated 4 years ago
- Zero-shot Cross-lingual Task-Oriented Dialogue Systems (EMNLP 2019) · ☆24 · Nov 9, 2019 · Updated 6 years ago
- Search Engine Guided Non-Parametric Neural Machine Translation · ☆14 · Oct 23, 2017 · Updated 8 years ago
- ☆33 · Nov 7, 2019 · Updated 6 years ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… · ☆17 · May 27, 2024 · Updated last year
- On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) · ☆13 · Nov 21, 2021 · Updated 4 years ago
- Code for Findings of ACL 2023 paper "Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency … · ☆10 · Jul 18, 2023 · Updated 2 years ago
- Implementation of "Modeling Past and Future for Neural Machine Translation" · ☆15 · Mar 16, 2018 · Updated 8 years ago