A repository for learning basic data processing techniques (Wikipedia processing, feature selection) and applying them to basic Web query classification.
☆27 · Mar 1, 2022 · Updated 4 years ago
Alternatives and similar repositories for build-your-search-engine
Users who are interested in build-your-search-engine are comparing it to the libraries listed below.
- Themis is a validation and processing library that helps you always make sure your data is correct. ☆14 · Nov 18, 2022 · Updated 3 years ago
- A SapientML plugin of SapientMLGenerator ☆11 · Dec 23, 2025 · Updated 2 months ago
- Pipes for MarkLogic DataHub is visual programming tool for MarkLogic Data Hub. It integrates with MarkLogic's Datahub and produces custom… ☆14 · Jan 12, 2021 · Updated 5 years ago
- A Firefox extension to easily toggle between browser fonts and webpage fonts ☆33 · Jan 21, 2026 · Updated last month
- A "framework" to work locally on your Custom Coded Action and execute it in the same context as HubSpot. ☆10 · Feb 22, 2024 · Updated 2 years ago
- Create and modify Word documents with Python ☆12 · Feb 16, 2026 · Updated 3 weeks ago
- XML/XSLT processing in the browser, supported by a Typescript library ☆10 · Feb 18, 2025 · Updated last year
- ☆12 · Dec 9, 2022 · Updated 3 years ago
- An Email Spam Classifier project, helps you detect your spam email from correct email. Try it out here! ☆12 · Jun 16, 2023 · Updated 2 years ago
- Writing SQL can be easier - pine makes it happen! ☆12 · Updated this week
- Camunda Form Playground to simulate forms with input and output data. ☆10 · Updated this week
- Your source for creating beautiful, consistent experiences across NICE ☆17 · Feb 25, 2026 · Updated last week
- Svelte components for use with DatoCMS. Translated to Svelte from react-datocms ☆10 · Jun 5, 2022 · Updated 3 years ago
- An XQuery 3.0 library for defining algebraic data types, and performing structural pattern matching on them. ☆17 · Jun 30, 2012 · Updated 13 years ago
- Ruby API for Freebase.com ☆37 · Jun 18, 2008 · Updated 17 years ago
- Django exception logger with AI suggestions on how to fix them ☆10 · Feb 10, 2026 · Updated last month
- Code snippets to build Elementor Plugin widgets ☆13 · Oct 13, 2022 · Updated 3 years ago
- This repository will contain python code that automates the georeferencing of any image that has a latitude and longitude associated with… ☆11 · May 1, 2021 · Updated 4 years ago
- Extract links from Wikipedia pages to create a cross-document coreference dataset (multilingual support) ☆11 · Apr 13, 2023 · Updated 2 years ago
- A simple solution for organizing your FastAPI endpoints ☆14 · Jan 31, 2023 · Updated 3 years ago
- A simple DICOM Study Viewer based on the cornerstone platform ☆13 · Sep 9, 2014 · Updated 11 years ago
- This repo is the final product of the above tutorial. It is a two-page website using Svelte and Sveltekit with content managed in Prismic… ☆11 · Feb 23, 2022 · Updated 4 years ago
- Identifying and distinguishing spam SMS and Email using the multinomial Naïve Bayes model. ☆14 · Jun 1, 2025 · Updated 9 months ago
- GitHub action that validates the syntax of selected RDF files in the repository ☆12 · Feb 12, 2024 · Updated 2 years ago
- Vector functions and indexing for SQLite ☆10 · Mar 26, 2023 · Updated 2 years ago
- Zig Vector Database! ☆14 · Jan 30, 2026 · Updated last month
- Medical records you can copy and paste ☆12 · Mar 3, 2023 · Updated 3 years ago
- The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym" ☆23 · Oct 14, 2025 · Updated 4 months ago
- collaborative web tool to enrich content ☆12 · Nov 13, 2011 · Updated 14 years ago
- Suite of generic Linked Data/SPARQL as well as LinkedDataHub-specific MCP tools ☆38 · Feb 23, 2026 · Updated 2 weeks ago
- Doge Decimal Classification ☆17 · Jan 2, 2026 · Updated 2 months ago
- JSON-LD parser that implements the RDFJS Sink interface using jsonld.js ☆13 · Mar 2, 2026 · Updated last week
- [See what you mean!] fmSyntaxColorizer adds syntax highlighting to your FileMaker scripts and calculations. Based on the fantastic MBS-Pl… ☆15 · Mar 3, 2023 · Updated 3 years ago
- ☆13 · Jan 7, 2023 · Updated 3 years ago
- Word embeddings trained on medical subreddits. ☆10 · Jan 4, 2021 · Updated 5 years ago
- A small-but-powerful typesafe state machine, designed to handle large state graphs ☆13 · Dec 6, 2022 · Updated 3 years ago
- Nushell configuration files ☆11 · Jan 8, 2026 · Updated 2 months ago
- Parse styles in an XHTML document and expand as XML attributes (CSSa) ☆10 · Jun 3, 2025 · Updated 9 months ago
- Earley based parsing tools for XSLT ☆10 · Oct 8, 2020 · Updated 5 years ago