ffreemt / fast-langid
Detect language of a given text, fast
☆10 Updated 10 months ago
Alternatives and similar repositories for fast-langid
Users who are interested in fast-langid are comparing it to the libraries listed below.
- ✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux ☆56 Updated 2 months ago
- machine translate docx/txt via deepl and pyppeteer ☆15 Updated 2 years ago
- ☆13 Updated 2 years ago
- ggml implementation of BERT Embedding ☆25 Updated last year
- Faster, modernized fork of the language identification tool langid.py ☆55 Updated 5 months ago
- phonetic similarity algorithms ☆13 Updated 6 years ago
- Large-scale exact string matching tool ☆17 Updated 2 months ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆80 Updated 2 years ago
- A performant high-throughput CPU-based API for Meta's No Language Left Behind (NLLB) using CTranslate2, hosted on Hugging Face Spaces. ☆111 Updated this week
- The pkuseg toolkit for multi-domain Chinese word segmentation ☆58 Updated 8 months ago
- Scrape deepl using playwright ☆9 Updated last year
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc. ☆18 Updated last year
- 80x faster and 95% accurate language identification with Fasttext ☆153 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆56 Updated last year
- Port of Funasr's Paraformer model in C/C++ ☆31 Updated 10 months ago
- Download full or partial git-lfs repos without temporarily using 2x disk space ☆30 Updated last year
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆50 Updated 3 weeks ago
- fastertransformer for codegeex model ☆63 Updated last year
- An even smaller speech recognizer / force aligner ☆32 Updated 5 months ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime. ☆32 Updated 2 years ago
- Semantic Search demo featuring UForm, USearch, UCall, and StreamLit, to visual and retrieve from image datasets, similar to "CLIP Retriev… ☆45 Updated last year
- A converter and basic tester for rwkv onnx ☆42 Updated last year
- Turn any OCR models into online inference API endpoint 🚀 🌖 ☆55 Updated last month
- A fast RWKV Tokenizer written in Rust ☆45 Updated last month
- ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation ☆43 Updated last year
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 Updated last year
- Simply, faster, sentence-transformers ☆142 Updated 8 months ago
- ☆16 Updated 11 months ago
- TTS Client for Coqui TTS server ☆13 Updated 2 years ago
- 🐍 Python bindings for the Hora Approximate Nearest Neighbor Search Algorithm library ☆72 Updated 3 years ago