tarekziade / mwcat
MediaWiki Categories Model
★13 · Updated last year
Alternatives and similar repositories for mwcat:
Users who are interested in mwcat are comparing it to the libraries listed below.
- The NLP Bias Identification Toolkit · ★36 · Updated last year
- Generate random passphrases · ★28 · Updated 3 weeks ago
- MoodCat 😼 classifies the mood of English sentences. · ★14 · Updated 2 years ago
- Datasette enrichment for analyzing row data using OpenAI's GPT models · ★19 · Updated 10 months ago
- Small Python package to measure OCR quality and other related metrics. · ★21 · Updated last year
- A polite and user-friendly downloader for Common Crawl data · ★36 · Updated last week
- LLM plugin for clustering embeddings · ★72 · Updated last year
- spaCy entry points for Curated Transformers · ★27 · Updated 6 months ago
- Datasette pre-configured with useful plugins. Experimental alpha. · ★28 · Updated 9 months ago
- Extract networks of entities from journalistic reporting · ★48 · Updated last year
- Data cleaning and validation functions for names, languages, identifiers, etc. · ★19 · Updated last week
- 🌸 Train floret vectors · ★18 · Updated last year
- Datasette plugin for searching all searchable tables at once · ★24 · Updated 6 months ago
- LLM plugin for embeddings using sentence-transformers · ★53 · Updated this week
- Potnia is an open-source Python library designed to convert Romanized transliterations of ancient texts into Unicode representations of t… · ★16 · Updated 2 weeks ago
- tsellm: LLMs in SQLite and DuckDB · ★22 · Updated 7 months ago
- spaCy extension for Visual Studio Code · ★29 · Updated 3 weeks ago
- Versatile Metrics Collection for Python · ★19 · Updated last year
- A Python library for creating adversarial splits · ★13 · Updated 2 years ago
- A whirlwind tour of Common Crawl's data using Python · ★17 · Updated 3 months ago
- A tool to snapshot SQLite databases you don't own · ★20 · Updated 5 months ago
- A set of crappy Python scripts to handle RSS in a Unix way. · ★46 · Updated 8 months ago
- Turn your git commit history into a scientific log · ★45 · Updated last month
- Generate reports for spaCy models. · ★29 · Updated 2 years ago
- A Python library for defining rule-based overrides on messy data · ★13 · Updated 4 months ago
- A simple Python script to collate multiple PDFs into a single PDF. · ★26 · Updated 5 months ago
- A repository of instructions in French to fine-tune LLMs · ★17 · Updated last year
- It's a cooler way to store simple linear models. · ★28 · Updated 8 months ago
- Datasette plugin providing instructions for exporting data to Jupyter or Observable · ★12 · Updated last year
- DocumentCloud's back end source code - Please report bugs, issues and feature requests to info@documentcloud.org · ★37 · Updated this week