ffreemt / fast-langid
Detect language of a given text, fast
☆9Updated 8 months ago
Alternatives and similar repositories for fast-langid:
Users that are interested in fast-langid are comparing it to the libraries listed below
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target.☆50Updated 3 months ago
- The pkuseg toolkit for multi-domain Chinese word segmentation☆57Updated 6 months ago
- Machine-translate docx/txt files via DeepL and pyppeteer☆15Updated 2 years ago
- A Chinese punctuation model that adds punctuation marks to text.☆140Updated 3 months ago
- A convenient Chinese word segmentation tool☆46Updated 2 months ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime.☆32Updated last year
- Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.☆10Updated 8 months ago
- A model that predicts the punctuation of English, Italian, French and German texts.☆80Updated 2 years ago
- Chinese inverse text normalization (Chinese ITN): converts Chinese numerals in text to Arabic numerals.☆11Updated last year
- Targeted language identifier, based on FastText and Hunspell.☆34Updated last month
- phonetic similarity algorithms☆12Updated 6 years ago
- Faster, modernized fork of the language identification tool langid.py☆55Updated 4 months ago
- Overrides the built-in pinyin data in pypinyin with the pinyin data files from pinyin-data and phrase-pinyin-data☆56Updated 2 months ago
- Large-scale exact string matching tool☆15Updated 3 weeks ago
- Fast whitespace correction with Transformers☆15Updated 11 months ago
- Minimal example of using a traced huggingface transformers model with libtorch☆35Updated 4 years ago
- Scrape deepl using playwright☆9Updated last year
- A small seq2seq punctuator tool based on DistilBERT☆50Updated 3 months ago
- Bilingual sentence similarity classifier using Tensorflow☆21Updated 5 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆51Updated 2 months ago
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.☆39Updated last year
- pinyintokenizer, a pinyin tokenizer that splits continuous pinyin into a list of single-character pinyin syllables.☆29Updated last month
- A collection of basic python modules for spoken natural language processing☆56Updated 5 years ago
- Source code for the Apple reproduction☆32Updated 3 years ago
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, run with onnxruntime☆87Updated 6 months ago
- TTS Client for Coqui TTS server☆13Updated 2 years ago
- 80x faster and 95% accurate language identification with Fasttext☆151Updated last year
- Port of Funasr's Paraformer model in C/C++☆30Updated 9 months ago