🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)
☆20 · Mar 15, 2025 · Updated 11 months ago
Alternatives and similar repositories for python-daachorse
Users who are interested in python-daachorse are comparing it to the libraries listed below.
- Includes a file with zstd compression in Rust ☆13 · Feb 17, 2023 · Updated 3 years ago
- 🌳 A compressed rank/select dictionary exploiting approximate linearity and repetitiveness. ☆15 · Jun 28, 2022 · Updated 3 years ago
- Edit and create Kubernetes job from cronjob template using your EDITOR ☆18 · Apr 8, 2025 · Updated 10 months ago
- Awesome List of Sources of Japanese Censored Words ☆19 · Sep 11, 2022 · Updated 3 years ago
- Rust binding of primitiv ☆20 · Jun 3, 2018 · Updated 7 years ago
- Yada is yet another double-array trie library aiming for fast search and compact data representation. ☆45 · Feb 25, 2024 · Updated 2 years ago
- ☆19 · Jan 17, 2023 · Updated 3 years ago
- Repository of sample code for the book 『機械学習による検索ランキング改善ガイド』 (Guide to Improving Search Ranking with Machine Learning) ☆22 · Aug 3, 2023 · Updated 2 years ago
- Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023) ☆20 · Jun 17, 2025 · Updated 8 months ago
- 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust. ☆243 · Jan 26, 2026 · Updated last month
- ☆11 · Mar 13, 2025 · Updated 11 months ago
- 🦞 Rust library of natural language dictionaries using character-wise double-array tries. ☆36 · Jan 13, 2025 · Updated last year
- Kannon is a wrapper for the gokart library that allows gokart tasks to be easily executed in a distributed and parallel manner on multipl… ☆26 · Jan 17, 2025 · Updated last year
- Testing tool to verify the search qualities of the Elasticsearch indices ☆29 · Jan 8, 2023 · Updated 3 years ago
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer ☆251 · Feb 7, 2026 · Updated 3 weeks ago
- Implementing SimCSE using KR-BERT ☆31 · Jul 23, 2021 · Updated 4 years ago
- eskeeper synchronizes index and alias with configuration files while ensuring idempotency. ☆37 · Aug 23, 2022 · Updated 3 years ago
- Generate boilerplates for layered architecture by your templates. ☆13 · Dec 27, 2019 · Updated 6 years ago
- A framework for evaluating Machine Translation models. ☆12 · May 26, 2025 · Updated 9 months ago
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment ☆38 · Sep 13, 2023 · Updated 2 years ago
- Information and data related to the ProtestNews shared task at CASE @ ACL-IJCNLP 2021 workshop ☆43 · Oct 14, 2022 · Updated 3 years ago
- Show help for Git commands with ExpandHelp ☆11 · Dec 5, 2018 · Updated 7 years ago
- LBFGS optimization algorithm ported from liblbfgs ☆12 · Nov 25, 2022 · Updated 3 years ago
- Modeling Harmonic Complexity using two models of Conditional Variational Autoencoders - MSc. Thesis ☆10 · May 16, 2023 · Updated 2 years ago
- ☆10 · Jun 22, 2020 · Updated 5 years ago
- Python tool (and library) to sign/verify files with RSA, Ed25519, or EC/secp256k1 keys ☆14 · Apr 16, 2021 · Updated 4 years ago
- A simple library for running complex DAG of async tasks ☆14 · Mar 26, 2025 · Updated 11 months ago
- Rust library of fast and compact string dictionary using Front-Coding ☆12 · Mar 27, 2022 · Updated 3 years ago
- Code for the paper "Closing the Curious Case of Neural Text Degeneration" ☆11 · Apr 9, 2025 · Updated 10 months ago
- BLEU Score in Rust ☆12 · Updated this week
- Infrastructure to run programs written in high-level languages on top of the Database Stream Processor (DBSP) runtime. ☆16 · Jun 17, 2022 · Updated 3 years ago
- Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024) ☆10 · Oct 16, 2024 · Updated last year
- Tokyo Metropolitan University Paraphrase Corpus (TMUP) ☆11 · Jun 12, 2017 · Updated 8 years ago
- structured attention encoder ☆13 · Jun 6, 2018 · Updated 7 years ago
- ☆10 · Jun 5, 2025 · Updated 8 months ago
- suffix array construction and searching algorithms for in-memory binary data. ☆12 · Sep 10, 2022 · Updated 3 years ago
- A lightweight Snowflake emulator built with Go and DuckDB for local development and testing ☆24 · Jan 19, 2026 · Updated last month
- ☆12 · Jan 10, 2017 · Updated 9 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Mar 18, 2023 · Updated 2 years ago