zafercavdar / fasttext-langdetect
80x faster and 95% accurate language identification with Fasttext
☆163 Updated last year
Alternatives and similar repositories for fasttext-langdetect
Users interested in fasttext-langdetect are comparing it to the libraries listed below.
- Simply, faster, sentence-transformers ☆143 Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆175 Updated 3 weeks ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆214 Updated 2 months ago
- A Python Search Engine for Humans 🥸 ☆241 Updated last year
- Python API for https://vespa.ai, the open big data serving engine ☆151 Updated this week
- Efficient few-shot learning with cross-encoders. ☆60 Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 Updated 2 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆80 Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆255 Updated 3 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆111 Updated last year
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆77 Updated 2 months ago
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023) ☆133 Updated last year
- Generalist and Lightweight Model for Text Classification ☆166 Updated last week
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 Updated last year
- ☆371 Updated 2 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 Updated 2 years ago
- multimodal document analysis ☆166 Updated 3 weeks ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆199 Updated last year
- Zero and Few shot named entity & relationships recognition ☆394 Updated 2 months ago
- The pipeline for the OSCAR corpus ☆174 Updated last month
- Targeted language identifier, based on FastText and Hunspell. ☆38 Updated 3 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆63 Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆70 Updated 2 years ago
- ☆176 Updated 8 months ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆68 Updated 4 months ago
- Few-shot Named Entity Recognition ☆122 Updated 3 years ago
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆338 Updated 2 years ago
- Completion After Prompt Probability. Make your LLM make a choice ☆82 Updated last year
- 🔢 Work with static vector models ☆35 Updated 7 months ago
- Guideline following Large Language Model for Information Extraction ☆416 Updated last year