String-to-String Algorithms for Natural Language Processing
☆564 · Jan 25, 2026 · Updated last month
Alternatives and similar repositories for string2string
Users interested in string2string are comparing it to the libraries listed below.
- Variable-order CRFs with structure learning ☆17 · Aug 1, 2024 · Updated last year
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 · Dec 5, 2022 · Updated 3 years ago
- Code for the paper "The Surprising Computational Power of Nondeterministic Stack RNNs" (DuSell and Chiang, 2023) ☆19 · Mar 21, 2024 · Updated last year
- Efficient few-shot learning with Sentence Transformers ☆2,688 · Dec 11, 2025 · Updated 2 months ago
- Salesforce open-source LLMs with 8k sequence length. ☆725 · Jan 31, 2025 · Updated last year
- ☆141 · Mar 5, 2024 · Updated last year
- A Python library for calculating a large variety of metrics from text ☆360 · Jan 30, 2026 · Updated last month
- data cleaning and curation for unstructured text ☆329 · Aug 6, 2024 · Updated last year
- Interpretability for sequence generation models 🐛 🔍 ☆460 · Feb 2, 2026 · Updated last month
- Implementation of Cascaded Head-colliding Attention (ACL'2021) ☆11 · Sep 16, 2021 · Updated 4 years ago
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti… ☆21 · Jul 11, 2022 · Updated 3 years ago
- ☆25 · Jan 22, 2024 · Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Feb 26, 2024 · Updated 2 years ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,884 · Updated this week
- Active Learning for Text Classification in Python ☆639 · Feb 1, 2026 · Updated last month
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,915 · Updated this week
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆206 · Aug 31, 2024 · Updated last year
- Minimal keyword extraction with BERT ☆4,116 · Feb 3, 2026 · Updated last month
- Top2Vec learns jointly embedded topic, document and word vectors. ☆3,108 · Nov 14, 2024 · Updated last year
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. ☆7,426 · Feb 20, 2026 · Updated last week
- Repo for the Belebele dataset, a massively multilingual reading comprehension dataset. ☆340 · Dec 18, 2024 · Updated last year
- ☆22 · Oct 26, 2020 · Updated 5 years ago
- DSPy: The framework for programming—not prompting—language models ☆32,519 · Updated this week
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,064 · Mar 7, 2024 · Updated last year
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- State-of-the-Art Text Embeddings ☆18,323 · Updated this week
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,017 · Aug 21, 2024 · Updated last year
- Robust recipes to align language models with human and AI preferences ☆5,510 · Sep 8, 2025 · Updated 5 months ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ☆2,095 · Oct 16, 2025 · Updated 4 months ago
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ☆2,801 · Updated this week
- Fuzzy string matching, grouping, and evaluation. ☆791 · Jul 10, 2025 · Updated 7 months ago
- source code of NAACL2021 "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and ACL2021 main conferenc… ☆52 · Mar 28, 2025 · Updated 11 months ago
- A BERT-based application for reusable text classification at scale ☆38 · Jul 23, 2023 · Updated 2 years ago
- Constituency parser for English and Chinese, built on the RNNG and In-Order parsers with BERT ☆38 · Apr 1, 2020 · Updated 5 years ago
- PyTorch implementation and pre-trained models for ASP - Autoregressive Structured Prediction with Language Models, EMNLP 22. https://arxi… ☆107 · Jan 22, 2024 · Updated 2 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆371 · Dec 8, 2022 · Updated 3 years ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,859 · May 17, 2025 · Updated 9 months ago
- allennlp-light is a port of AllenNLP's core modules and nn portions into a standalone package with minimum dependencies ☆56 · Oct 12, 2022 · Updated 3 years ago