fedelopez77 / langdetect
Language detection software
☆64Updated 8 years ago
Alternatives and similar repositories for langdetect
Users who are interested in langdetect are comparing it to the libraries listed below
- 80x faster and 95% accurate language identification with Fasttext☆162Updated last year
- Efficient few-shot learning with cross-encoders.☆60Updated last year
- ☆58Updated last year
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal …☆32Updated 4 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆173Updated 2 weeks ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆63Updated last year
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆189Updated 4 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆111Updated last year
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification☆41Updated 2 years ago
- multimodal document analysis☆166Updated 3 weeks ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated last year
- A massively multilingual modern encoder language model☆113Updated last month
- A Multilingual Replicable Instruction-Following Model☆95Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor☆30Updated 10 months ago
- ☆81Updated last month
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆80Updated 2 years ago
- Pretraining Efficiently on S2ORC!☆174Updated last year
- Flacuna was developed by fine-tuning Vicuna on Flan-mini, a comprehensive instruction collection encompassing various tasks. Vicuna is al…☆111Updated 2 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen…☆64Updated 2 years ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆101Updated last year
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆77Updated 4 months ago
- official code for EMNLP21 paper☆36Updated 3 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆22Updated 5 months ago
- BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages☆229Updated 2 years ago
- Model implementation for the contextual embeddings project☆37Updated 6 months ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.☆125Updated last year
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Updated 2 years ago
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023)☆133Updated last year
- ☆53Updated last year