A Corpus Data Retrieval Index using Lucene for Look-Ups
☆20Jun 10, 2026Updated this week
Alternatives and similar repositories for Krill
Users that are interested in Krill are comparing it to the libraries listed below.
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German☆12Feb 27, 2023Updated 3 years ago
- Searching in-memory corpus with Corpus Query Language (CQL)☆19Dec 2, 2024Updated last year
- A tiny graph database engine written in C☆10May 9, 2014Updated 12 years ago
- Multi Tier Annotation Search☆12May 13, 2024Updated 2 years ago
- Python wrapper for the CWB to extract concordances and score frequency lists☆22May 11, 2026Updated last month
- texrex web page cleaning & ClaraX random walk crawler☆11Dec 13, 2021Updated 4 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component☆13Aug 10, 2023Updated 2 years ago
- OStatus Specification☆20Apr 11, 2015Updated 11 years ago
- Linguistic search for large annotated text corpora, based on Apache Lucene☆128Updated this week
- A package in C++ for character or word ngram analysis. It uses Ternary Search Tree instead of hashing table for faster ngram frequency co…☆20May 11, 2015Updated 11 years ago
- Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpu…☆19Jan 7, 2026Updated 5 months ago
- Editor for aligned parallel texts (personal desktop application).☆20Jan 15, 2026Updated 4 months ago
- ☆11Feb 13, 2026Updated 4 months ago
- A Python database interface for eXist-db☆15May 2, 2026Updated last month
- WordNet-LMF formats☆27Feb 4, 2026Updated 4 months ago
- Morphosyntactic tagger for Norwegian bokmål and nynorsk☆29Jun 20, 2023Updated 2 years ago
- An OCR engine that works by finding pre-known letters in a word's image☆12Jul 29, 2019Updated 6 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg…☆130Feb 5, 2026Updated 4 months ago
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st…☆25Jan 3, 2025Updated last year
- HuCit KB: a knowledge base of classical texts and citable text units.☆11Nov 17, 2021Updated 4 years ago
- This is a new backend implementation of the ANNIS linguistic search and visualization system.☆19Jun 6, 2026Updated last week
- Repo of the Turing's Humanities & Data Science Discussion Group☆13Jul 21, 2022Updated 3 years ago
- Corpus linguistics has never been so easy...☆25Feb 18, 2026Updated 3 months ago
- C++ implementation of a part-of-speech (POS) tagger using the lookahead tagging algorithm.☆12Jul 2, 2019Updated 6 years ago
- code and data used to build a training dataset for dragnet models☆10Nov 29, 2020Updated 5 years ago
- Morphological analyzer and lemmatizer for Latin.☆29May 22, 2026Updated 3 weeks ago
- ☆21Feb 7, 2016Updated 10 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆29Apr 17, 2024Updated 2 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine☆80Updated this week
- Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Ho…☆22Sep 2, 2022Updated 3 years ago
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages☆17Apr 14, 2026Updated last month
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni…☆12Dec 15, 2023Updated 2 years ago
- DroiD64 is a graphical file manager for D64, D67, D71, D80, D81, D82, D88, T64 and LNX files.☆12Feb 3, 2024Updated 2 years ago
- Benson turns a list of URLs into mp3s of the contents of each web page - take control over your reading backlog!☆16Oct 30, 2024Updated last year
- ☆13Oct 30, 2025Updated 7 months ago
- Training files for Greek cursive script (in early print)☆15May 26, 2021Updated 5 years ago
- A reddit bot that finds original publish dates on linked articles.☆10Nov 30, 2024Updated last year
- R-package for text mining with the Corpus Workbench (CWB) as backend☆48Mar 26, 2025Updated last year
- Create LLVM IR for Perl5☆41Jan 16, 2018Updated 8 years ago