ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Written in Python with the NLP packages NLTK and spaCy.
☆15Updated 7 years ago
Alternatives and similar repositories for document-search-engine
Users that are interested in document-search-engine are comparing it to the libraries listed below
- Document Search Engine project with TF-IDF and Google Universal Sentence Encoder model☆55Updated 2 years ago
- Document Search Engine Tool☆76Updated 3 years ago
- An app that extracts your twitter threads into a downloadable CSV file.☆11Updated 2 years ago
- Simple pdf to text with python using PDFtk and PyPDF2☆21Updated 2 years ago
- Semantic Search Engine using BERT embeddings☆33Updated 5 years ago
- Sample datasets of over 400 Instagram coding influencers☆13Updated 11 months ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆78Updated last year
- Example for Logging LLM Evaluator Prompt Responses☆18Updated 2 years ago
- Summarize text content into a Tweet-sized statement using OpenAI's GPT-3 based Davinci model☆23Updated 2 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts.☆40Updated 6 years ago
- ☆64Updated 2 years ago
- Search with BERT vectors in Solr, Elasticsearch, OpenSearch and GSI APU☆166Updated last year
- ☆62Updated 2 years ago
- ☆13Updated 3 years ago
- Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.☆152Updated 9 months ago
- Building a bot to handle general tasks for insurance.☆27Updated 2 years ago
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, …☆87Updated last year
- Expose a Top2Vec model with a REST API.☆92Updated 3 years ago
- A dataset for pretraining language models targeted for legal tasks.☆141Updated 3 years ago
- This repository serves as a collection of scrapers procuring and structuring various legal datasets☆18Updated 2 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆26Updated 3 years ago
- Fully working applications that demonstrate how to use Haystack to implement various use cases☆135Updated 2 months ago
- Model for predicting categories of entities from their mentions☆31Updated 4 years ago
- Tutorial and template for a semantic search app powered by the Atlas Embedding Database, Langchain, OpenAI and FastAPI☆113Updated 2 years ago
- Clustering news and extracting trending news stories☆12Updated 4 years ago
- Testing speed and cost of classification via LLM or via vector embeddings☆21Updated 2 years ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.☆26Updated 2 years ago
- Open-source, knowledge-grounded conversational assistant☆14Updated 7 months ago
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆16Updated 2 years ago
- An analysis of abilities, skills and tech skills data from the O*NET database as well as classification of around 500 random LinkedIn job…☆19Updated 5 years ago