Python3 bindings for the Compact Language Detector v3 (CLD3)
☆155Jun 26, 2023Updated 2 years ago
Alternatives and similar repositories for pycld3
Users that are interested in pycld3 are comparing it to the libraries listed below
- ☆873May 24, 2023Updated 2 years ago
- ☆178Mar 28, 2025Updated 11 months ago
- Port of Google's language-detection library to Python.☆1,872Mar 3, 2025Updated last year
- Targeted language identifier, based on FastText and Hunspell.☆38Sep 4, 2025Updated 6 months ago
- Python Ecosystem for Power Hints and Tips, Issue Tracking☆29Dec 3, 2025Updated 3 months ago
- Language detection using Spacy and Fasttext☆57Dec 17, 2023Updated 2 years ago
- Blazing fast language detection using fastText model☆24Dec 18, 2022Updated 3 years ago
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text☆1,641Nov 21, 2025Updated 3 months ago
- 80x faster and 95% accurate language identification with Fasttext☆165Jan 23, 2024Updated 2 years ago
- Residual Quantization Autoencoder, used for interpreting LLMs☆14Jan 1, 2025Updated last year
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆96Feb 5, 2026Updated last month
- ☆11Dec 25, 2020Updated 5 years ago
- Converts Twitter threads to Markdown files with proper reply indentation.☆11Dec 8, 2022Updated 3 years ago
- ☆10Jun 9, 2022Updated 3 years ago
- Yet Another Z39.50-powered Chatbot☆12Oct 9, 2023Updated 2 years ago
- Code for COLING 2020 paper "Improving Document-level Sentiment Analysis with User and Product Context"☆11Apr 13, 2022Updated 3 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages.☆36Feb 5, 2026Updated last month
- Convert CSV files into Markdown formatted tables☆17Apr 20, 2021Updated 4 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality …☆106Feb 26, 2024Updated 2 years ago
- ☆20Nov 3, 2021Updated 4 years ago
- A Python library for calculating a large variety of metrics from text☆360Jan 30, 2026Updated last month
- 🖋 Resource and Tool for Writing System Identification (Unicode 17.0) -- LREC 2024☆21Feb 17, 2026Updated 2 weeks ago
- A Python utility for indexing file lines. Best demo honourable mention at ECIR 2024.☆23Nov 9, 2025Updated 3 months ago
- An open-source Notion-style WYSIWYG editor with AI-powered autocompletions.☆24Jul 13, 2023Updated 2 years ago
- LASER multilingual sentence embeddings as a pip package☆224Aug 11, 2023Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆186Jun 6, 2025Updated 9 months ago
- Multilingual text (NLP) processing toolkit☆2,366Nov 10, 2023Updated 2 years ago
- Validate Luhn checksum, generate Luhn numbers☆17Jan 22, 2026Updated last month
- Data Collection System For NLP/Speech Recognition☆25Apr 20, 2021Updated 4 years ago
- Text span utilities for Rust and Python☆22Jan 3, 2023Updated 3 years ago
- Package for performing Reddit-based text analysis☆20Jan 23, 2019Updated 7 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example☆22Nov 28, 2021Updated 4 years ago
- Songs Lyrics Fetcher using Python on Frontend via Brython 🔥☆24Oct 5, 2021Updated 4 years ago
- NLP module for Golang☆13Jul 19, 2017Updated 8 years ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy☆1,402Nov 7, 2025Updated 3 months ago
- A fully customisable language detection pipeline for spaCy☆93May 2, 2019Updated 6 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Jun 12, 2020Updated 5 years ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model"☆23Jun 28, 2024Updated last year
- Language-Agnostic SEntence Representations☆3,659May 2, 2024Updated last year