A Corpus Data Retrieval Index using Lucene for Look-Ups
☆20 May 13, 2026 Updated last week
Alternatives and similar repositories for Krill
Users that are interested in Krill are comparing it to the libraries listed below.
- Translation of query languages to serialized KoralQuery protocol ☆15 May 14, 2026 Updated last week
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 Feb 27, 2023 Updated 3 years ago
- A tiny graph database engine written in C ☆10 May 9, 2014 Updated 12 years ago
- Multi Tier Annotation Search ☆12 May 13, 2024 Updated 2 years ago
- JSON Web Token in the Mojolicious style ☆16 Oct 15, 2024 Updated last year
- texrex web page cleaning & ClaraX random walk crawler ☆11 Dec 13, 2021 Updated 4 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 Aug 10, 2023 Updated 2 years ago
- ☆12 Jun 8, 2021 Updated 4 years ago
- Comparing warc files ☆17 Feb 21, 2019 Updated 7 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 May 2, 2021 Updated 5 years ago
- A package in C++ for character or word ngram analysis. It uses Ternary Search Tree instead of hashing table for faster ngram frequency co… ☆20 May 11, 2015 Updated 11 years ago
- KB data lab ☆10 Dec 8, 2020 Updated 5 years ago
- RDF river plugin for harvesting metadata from Jena TDB, SPARQL endpoints or plain RDF files into Elasticsearch ☆10 May 20, 2022 Updated 4 years ago
- Editor for aligned parallel texts (personal desktop application). ☆20 Jan 15, 2026 Updated 4 months ago
- ☆11 Feb 13, 2026 Updated 3 months ago
- A Python database interface for eXist-db ☆15 May 2, 2026 Updated 3 weeks ago
- WordNet-LMF formats ☆27 Feb 4, 2026 Updated 3 months ago
- Danish Semantic analysis ☆18 Sep 24, 2020 Updated 5 years ago
- Neural models for detecting and masking personal information from texts ☆16 Nov 25, 2022 Updated 3 years ago
- Morphosyntactic tagger for Norwegian bokmål and nynorsk ☆29 Jun 20, 2023 Updated 2 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆130 Feb 5, 2026 Updated 3 months ago
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st… ☆25 Jan 3, 2025 Updated last year
- HuCit KB: a knowledge base of classical texts and citable text units. ☆11 Nov 17, 2021 Updated 4 years ago
- Core libraries by the PRImA Research Lab ☆16 Jul 30, 2024 Updated last year
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆19 Apr 18, 2026 Updated last month
- Norwegian Speech Transformer Models ☆19 Mar 26, 2026 Updated last month
- A visualisation of the Princeton WordNet database ☆15 Mar 21, 2015 Updated 11 years ago
- Collection de romans français du dix-huitième siècle (1751-1800) / Collection of Eighteenth-Century French Novels (1751-1800) ☆23 Apr 23, 2024 Updated 2 years ago
- A trend viewer written in Python/JavaScript ☆21 Nov 15, 2024 Updated last year
- C++ implementation of a part-of-speech (POS) tagger using the lookahead tagging algorithm. ☆12 Jul 2, 2019 Updated 6 years ago
- shoco is a compressor for small text strings. [Not maintained]. ☆11 Sep 4, 2019 Updated 6 years ago
- code and data used to build a training dataset for dragnet models ☆10 Nov 29, 2020 Updated 5 years ago
- Kaldi code for doing DNN with tensorflow ☆13 Feb 8, 2016 Updated 10 years ago
- Hexatomic is an extensible software for deep multi-layer annotation of linguistic corpora ☆17 Nov 27, 2024 Updated last year
- Morphological analyzer and lemmatizer for Latin. ☆29 Apr 14, 2026 Updated last month
- ☆21 Feb 7, 2016 Updated 10 years ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ☆13 Oct 18, 2025 Updated 7 months ago
- Modules used for separating articles in (historical) newspapers and similar documents. This repository is part of the European Union's Ho… ☆22 Sep 2, 2022 Updated 3 years ago
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages ☆17 Apr 14, 2026 Updated last month