ffreemt / fast-langid
Detect language of a given text, fast
☆10 · Updated last year
Alternatives and similar repositories for fast-langid
Users who are interested in fast-langid are comparing it to the libraries listed below:
- ✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux ☆67 · Updated 4 months ago
- 80x faster and 95% accurate language identification with Fasttext ☆164 · Updated 2 years ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆285 · Updated 4 months ago
- ggml implementation of BERT Embedding ☆26 · Updated 2 years ago
- Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible f… ☆226 · Updated last year
- A performant high-throughput CPU-based API for Meta's No Language Left Behind (NLLB) using CTranslate2, hosted on Hugging Face Spaces. ☆138 · Updated last week
- Faster, modernized fork of the language identification tool langid.py ☆60 · Updated last year
- Local cross-platform machine translation GUI, based on CTranslate2 ☆99 · Updated 2 years ago
- Machine translate docx/txt via deepl and pyppeteer ☆15 · Updated 3 years ago
- GGML implementation of BERT model with Python bindings and quantization. ☆58 · Updated last year
- A sentence segmentation library with wide language support optimized for speed and utility. ☆86 · Updated 3 weeks ago
- The pkuseg toolkit for multi-domain Chinese word segmentation ☆69 · Updated 6 months ago
- ONNX-compatible Fast SeamlessM4T: Massively Multilingual & Multimodal Machine Translation ☆43 · Updated 2 years ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆83 · Updated 2 years ago
- A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project ☆54 · Updated 2 years ago
- Deploy an API that pulls data from duckduckgo search engine. ☆16 · Updated 9 months ago
- Simply, faster, sentence-transformers ☆144 · Updated last year
- Auto fix invalid JSON: automatically repair and complete truncated or invalid JSON ☆60 · Updated 2 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆105 · Updated last year
- Open Source Text Embedding Models with OpenAI Compatible API ☆167 · Updated last year
- Port of Funasr's Paraformer model in C/C++ ☆39 · Updated last year
- Training open neural machine translation models ☆398 · Updated 3 weeks ago
- This is an example of search videos using jina ☆24 · Updated 3 years ago
- Check for multiple patterns in a single string at the same time: a fast Aho-Corasick algorithm for Python ☆218 · Updated this week
- 🔧 Repair JSON! Solution for JSON Anomalies from LLMs. ☆350 · Updated last week
- A streamlined, user-friendly JSON streaming preprocessor, crafted in Python. ☆115 · Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆255 · Updated 3 years ago
- Meta's "No Language Left Behind" models served as web app and REST API ☆254 · Updated 8 months ago
- Faster access to Tesseract-OCR from Python ☆13 · Updated 4 years ago
- A small seq2seq punctuator tool based on DistilBERT ☆53 · Updated last year