NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to a given search query.
☆37 · Jun 22, 2022 · Updated 3 years ago
Alternatives and similar repositories for airflow-pdf2embeddings
Users that are interested in airflow-pdf2embeddings are comparing it to the libraries listed below.
- Interactive notebooks containing demonstration code of the splink library ☆41 · Mar 4, 2026 · Updated last month
- A web application that provides an LLM-powered chat experience based on GOV.UK content. ☆13 · Updated this week
- Strip output from iPython notebooks ☆22 · Sep 6, 2015 · Updated 10 years ago
- Python version of dbtools ☆12 · Jul 30, 2025 · Updated 8 months ago
- A thin wrapper around the AJV JSON Validator for Python ☆12 · May 5, 2024 · Updated last year
- Digital Humanities course site ☆21 · Nov 22, 2021 · Updated 4 years ago
- A project designed to make ADB (Android Debug Bridge) and its Fastboot element easier to use with a graphical interface. ☆30 · Mar 13, 2026 · Updated last month
- A digital edition of the 24 Probstücke of the Oberclasse by Johann Mattheson. ☆11 · Mar 25, 2026 · Updated 2 weeks ago
- R package for common Department for Education analysis tasks ☆14 · Mar 31, 2026 · Updated 2 weeks ago
- Jupyter notebook that contains the workflow for cleaning scraped HTML sites for NLP in Python ☆10 · Sep 3, 2020 · Updated 5 years ago
- Extract structured data from free text using large language models ☆18 · Apr 2, 2026 · Updated last week
- Run the Microsoft Word "Compare" tool from a CLI ☆11 · Sep 6, 2018 · Updated 7 years ago
- Urdu Summary Corpus and Software Tools Version 1.0 ☆13 · Oct 16, 2022 · Updated 3 years ago
- ☆10 · Jan 24, 2023 · Updated 3 years ago
- DASD's coding principles for analytical projects ☆16 · Oct 9, 2023 · Updated 2 years ago
- App that converts any YouTube video to text. Created for Learn Build Teach Hackathon 2022 ☆13 · Feb 6, 2026 · Updated 2 months ago
- ☆11 · Sep 27, 2022 · Updated 3 years ago
- Framework for Oxygen XML Author for Digital Scholarly Editions ☆14 · May 23, 2025 · Updated 10 months ago
- ALignment Transformation EnviRonment ☆11 · May 8, 2018 · Updated 7 years ago
- Real-time structure motif searching in protein 3D structures using an inverted index strategy ☆14 · Jul 19, 2025 · Updated 8 months ago
- Clone a voice in 5 seconds to generate arbitrary speech in real-time ☆10 · Aug 1, 2019 · Updated 6 years ago
- Web application that powers weber-gesamtausgabe.de ☆24 · Updated this week
- eXistdb App for ediarum.BASE.edit and ediarum.REGISTER.edit ☆14 · Mar 1, 2024 · Updated 2 years ago
- Tools to analyze the differences and similarities between CRISPR arrays ☆11 · Dec 17, 2024 · Updated last year
- ☆26 · Apr 15, 2021 · Updated 4 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆24 · Jul 18, 2019 · Updated 6 years ago
- ☆16 · Apr 11, 2023 · Updated 3 years ago
- MoJ coffee and coding sessions that can be made publicly available ☆26 · May 24, 2021 · Updated 4 years ago
- Software package to design bridge RNAs as described by Durrant & Perry et al. ☆24 · Sep 25, 2025 · Updated 6 months ago
- The motive of the project is to predict the Customer LifeTime Value of a Four Wheeler Insurance Company and it is implemented by satisfyi… ☆16 · Jun 22, 2022 · Updated 3 years ago
- Word Factor Vectors ☆32 · Dec 13, 2019 · Updated 6 years ago
- A collection of notebooks for Natural Language Processing ☆25 · Jan 13, 2025 · Updated last year
- The GitHub repository containing all the material related to the Computational Thinking and Programming course of the Digital Humanities … ☆30 · May 23, 2020 · Updated 5 years ago
- ☆15 · Jun 21, 2022 · Updated 3 years ago
- Parent repository for the MOJ Analytics Platform ☆14 · Nov 16, 2021 · Updated 4 years ago
- Resources to help you get started with Data Science ☆19 · Oct 1, 2018 · Updated 7 years ago
- Manager for remote ~/.ssh/authorized_keys ☆13 · Mar 20, 2013 · Updated 13 years ago
- An opinion dynamic tracing model on dataset I ☆17 · Dec 17, 2019 · Updated 6 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,069 · Updated this week