String-to-String Algorithms for Natural Language Processing
☆566 · Jan 25, 2026 · Updated 2 months ago
Alternatives and similar repositories for string2string
Users that are interested in string2string are comparing it to the libraries listed below.
- Variable-order CRFs with structure learning ☆17 · Aug 1, 2024 · Updated last year
- ☆141 · Mar 5, 2024 · Updated 2 years ago
- Code for the paper "The Surprising Computational Power of Nondeterministic Stack RNNs" (DuSell and Chiang, 2023) ☆19 · Mar 21, 2024 · Updated 2 years ago
- Data cleaning and curation for unstructured text ☆329 · Aug 6, 2024 · Updated last year
- A Python library for calculating a large variety of metrics from text ☆361 · Updated this week
- Efficient few-shot learning with Sentence Transformers ☆2,699 · Dec 11, 2025 · Updated 3 months ago
- Salesforce open-source LLMs with 8k sequence length. ☆726 · Jan 31, 2025 · Updated last year
- ☆25 · Jan 22, 2024 · Updated 2 years ago
- A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti… ☆21 · Jul 11, 2022 · Updated 3 years ago
- Toolkit for domain-specific information retrieval experimentation ☆19 · Feb 24, 2026 · Updated last month
- Interpretability for sequence generation models 🐛 🔍 ☆462 · Mar 6, 2026 · Updated 2 weeks ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 · Dec 5, 2022 · Updated 3 years ago
- ☆22 · Oct 26, 2020 · Updated 5 years ago
- A BERT-based application for reusable text classification at scale ☆38 · Jul 23, 2023 · Updated 2 years ago
- Convenient Text-to-Text Training for Transformers ☆19 · Dec 10, 2021 · Updated 4 years ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models ☆11 · Jan 19, 2024 · Updated 2 years ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,905 · Updated this week
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆372 · Dec 8, 2022 · Updated 3 years ago
- ☆67 · Mar 4, 2024 · Updated 2 years ago
- Code for Learning idiolectal style variation in online register ☆10 · May 18, 2023 · Updated 2 years ago
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. ☆7,467 · Feb 20, 2026 · Updated last month
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,965 · Mar 16, 2026 · Updated last week
- Repo for the Belebele dataset, a massively multilingual reading comprehension dataset. ☆339 · Dec 18, 2024 · Updated last year
- Active Learning for Text Classification in Python ☆638 · Mar 8, 2026 · Updated 2 weeks ago
- DSPy: The framework for programming—not prompting—language models ☆33,038 · Updated this week
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,065 · Mar 7, 2024 · Updated 2 years ago
- Minimal keyword extraction with BERT ☆4,131 · Feb 3, 2026 · Updated last month
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆392 · Nov 7, 2023 · Updated 2 years ago
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- Top2Vec learns jointly embedded topic, document and word vectors. ☆3,108 · Nov 14, 2024 · Updated last year
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,018 · Aug 21, 2024 · Updated last year
- State-of-the-Art Text Embeddings ☆18,427 · Mar 12, 2026 · Updated last week
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,124 · Apr 20, 2022 · Updated 3 years ago
- ☆12 · Jan 29, 2021 · Updated 5 years ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ☆2,120 · Oct 16, 2025 · Updated 5 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆209 · Aug 31, 2024 · Updated last year
- MARNNs Can Learn Generalized Dyck Languages ☆12 · Nov 11, 2019 · Updated 6 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆105 · Feb 26, 2024 · Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆21 · Feb 7, 2023 · Updated 3 years ago