fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)
☆43 · Dec 6, 2022 · Updated 3 years ago
Alternatives and similar repositories for fastlangid
Users that are interested in fastlangid are comparing it to the libraries listed below.
- ☆14 · Sep 10, 2021 · Updated 4 years ago
- Experiments with Hugging Face 🔬 🤗 · ☆46 · Apr 18, 2026 · Updated last month
- A simple and humble image captioning application, based on a neural network built with Keras · ☆10 · Sep 23, 2022 · Updated 3 years ago
- End to end Machine Learning with Amazon SageMaker · ☆43 · Feb 16, 2024 · Updated 2 years ago
- Transferability of cross-lingual and cross-age speech emotion recognition · ☆21 · Jun 30, 2023 · Updated 2 years ago
- ☆14 · Dec 3, 2019 · Updated 6 years ago
- Supplemental material for the paper "Towards Automatically Correcting Tapped Beat Annotations for Music Recordings" · ☆20 · May 6, 2021 · Updated 5 years ago
- Targeted language identifier, based on FastText and Hunspell. · ☆38 · Sep 4, 2025 · Updated 8 months ago
- An audio and transcribed corpus of contemporary Hong Kong Cantonese · ☆40 · Dec 30, 2020 · Updated 5 years ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB) · ☆19 · Nov 23, 2024 · Updated last year
- Building and Using A Seed Corpus for the Human Language Project · ☆11 · Feb 9, 2018 · Updated 8 years ago
- A lightweight Python library for running TTS models with a unified API. · ☆20 · Feb 18, 2025 · Updated last year
- ☆22 · Sep 26, 2022 · Updated 3 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼 · ☆22 · Jan 29, 2026 · Updated 3 months ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs · ☆36 · Apr 29, 2026 · Updated 3 weeks ago
- Repo for the Wasabi datasets · ☆119 · Apr 10, 2025 · Updated last year
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model" · ☆14 · Jul 8, 2025 · Updated 10 months ago
- ☆10 · Feb 2, 2021 · Updated 5 years ago
- Official repository for paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main) · ☆23 · Jul 9, 2024 · Updated last year
- A simple neural truecaser written in pytorch and allennlp. · ☆34 · Jun 17, 2024 · Updated last year
- The source code for 'Noisy-Labeled NER with Confidence Estimation' accepted by NAACL 2021 · ☆36 · May 8, 2021 · Updated 5 years ago
- Cross Sentence Neural Machine Translation · ☆11 · Mar 26, 2018 · Updated 8 years ago
- Fine-tuning Wav2Vec2.0 on Common Voice (zh-HK) · ☆16 · May 8, 2022 · Updated 4 years ago
- The Cantonese Wordnet · ☆14 · Dec 4, 2023 · Updated 2 years ago
- Multilingual and Multiculture Benchmark and LLM · ☆36 · Updated this week
- ☆11 · Jun 23, 2022 · Updated 3 years ago
- Ice is a rapid information extraction customizer · ☆15 · Apr 26, 2021 · Updated 5 years ago
- ☆12 · Nov 5, 2024 · Updated last year
- Build, train & debug, and deploy & monitor with Amazon SageMaker · ☆119 · Aug 9, 2022 · Updated 3 years ago
- A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi… · ☆15 · Oct 13, 2022 · Updated 3 years ago
- The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts. · ☆51 · Jul 12, 2019 · Updated 6 years ago
- Lexically Constrained Neural Machine Translation with Levenshtein Transformer · ☆40 · Jul 14, 2020 · Updated 5 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper · ☆14 · Aug 9, 2021 · Updated 4 years ago
- Zero-shot Cross-lingual Task-Oriented Dialogue Systems (EMNLP 2019) · ☆24 · Nov 9, 2019 · Updated 6 years ago
- Search Engine Guided Non-Parametric Neural Machine Translation · ☆14 · Oct 23, 2017 · Updated 8 years ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… · ☆17 · May 27, 2024 · Updated last year
- Model training tutorials for the Stanza Python NLP Library · ☆41 · Jul 12, 2022 · Updated 3 years ago
- Source code of our paper "Focus on the Target's Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022 · ☆13 · Apr 13, 2022 · Updated 4 years ago
- On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) · ☆13 · Nov 21, 2021 · Updated 4 years ago