newca12 / dictionary-builder
Real-world example demonstrating advanced techniques for unmarshalling very large XML documents with a very low memory footprint.
☆60 Updated 3 months ago
Alternatives and similar repositories for dictionary-builder
Users who are interested in dictionary-builder compare it to the libraries listed below.
- ☆47 Updated 2 years ago
- Pure Rust port of CRFsuite: a fast implementation of Conditional Random Fields (CRFs) ☆29 Updated last month
- Java Wiktionary Library ☆57 Updated 2 years ago
- Downloads and imports Wikipedia page histories to a git repository ☆35 Updated 6 months ago
- Context-sensitive word embeddings with subwords. In Rust. ☆87 Updated last year
- Archived Python/Rust hybrid codebase - see divvun/kbdgen for v3 ☆26 Updated 3 years ago
- Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot ☆13 Updated 4 years ago
- A Rust library for reading and writing WARC files ☆54 Updated 7 months ago
- Multilingual implementation of RAKE algorithm for Rust ☆34 Updated 4 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆64 Updated last year
- A general purpose processing framework for corpora of scientific documents ☆64 Updated this week
- This project aims to develop a parser for mediawiki markdown on the basis of Parsing Expression Grammars. ☆14 Updated 3 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆102 Updated last month
- Authoring tool for interactive content ☆21 Updated last week
- wabac.js - Web Archive Browsing Augmentation Client ☆108 Updated last week
- common language and mathematics processing algorithms, in Rust ☆26 Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆52 Updated 3 years ago
- Ace by DAISY, an Accessibility Checker for EPUB ☆86 Updated 3 months ago
- Lexical data at Unicode ☆68 Updated 9 months ago
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ☆26 Updated 10 months ago
- Text hyphenation for Rust ☆54 Updated last year
- Links on the web break all the time, robustify them! ☆54 Updated 4 years ago
- finalfusion embeddings in Rust ☆102 Updated last year
- Search engine benchmark (Tantivy, Lucene, PISA, ...) ☆89 Updated this week
- This is a new backend implementation of the ANNIS linguistic search and visualization system. ☆17 Updated last month
- Implementation of the Punkt sentence tokenizing algorithm in Rust. ☆37 Updated 5 years ago
- RDF parsers library ☆86 Updated 5 months ago
- A tool for creating pivot tables from the command line. ☆14 Updated 2 years ago
- a lib to edit Wikibase from NodeJS ☆67 Updated 3 weeks ago
- Python package for harvesting records from OAI-PMH provider(s). ☆63 Updated 2 years ago