zafercavdar / fasttext-langdetect
80x faster and 95% accurate language identification with Fasttext
☆161 Updated last year
Alternatives and similar repositories for fasttext-langdetect
Users who are interested in fasttext-langdetect are comparing it to the libraries listed below.
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆251 Updated last month
- Simply, faster, sentence-transformers ☆143 Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆163 Updated 4 months ago
- Efficient few-shot learning with cross-encoders. ☆59 Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆212 Updated last month
- Python API for https://vespa.ai, the open big data serving engine ☆144 Updated this week
- A Python Search Engine for Humans 🥸 ☆237 Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆255 Updated 2 years ago
- ☆174 Updated 7 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆154 Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆177 Updated 4 months ago
- Generalist and Lightweight Model for Text Classification ☆163 Updated 4 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆110 Updated last year
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023) ☆132 Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆86 Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆80 Updated 2 years ago
- Guideline following Large Language Model for Information Extraction ☆404 Updated last year
- ☆369 Updated last year
- The pipeline for the OSCAR corpus ☆173 Updated last year
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆45 Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 Updated last year
- Training open neural machine translation models ☆380 Updated 7 months ago
- Label data using HuggingFace's transformers and automatically get a prediction service ☆193 Updated 2 years ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ☆196 Updated last year
- Targeted language identifier, based on FastText and Hunspell. ☆37 Updated last month
- BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages ☆228 Updated last year
- ☆113 Updated 10 months ago
- Pre-train Static Word Embeddings ☆87 Updated last month
- Few-shot Named Entity Recognition ☆123 Updated 3 years ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training ☆73 Updated last week