ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Based on Python using NLP packages NLTK and spaCy.
☆16Updated 7 years ago
Alternatives and similar repositories for document-search-engine
Users who are interested in document-search-engine are comparing it to the libraries listed below:
- Document Search Engine project with TF-IDF and Google universal sentence encoder model☆54Updated 2 years ago
- An app that extracts your twitter threads into a downloadable CSV file.☆10Updated 2 years ago
- Sample datasets of over 400 Instagram coding influencers☆12Updated 5 months ago
- Document Search Engine Tool☆73Updated 2 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆25Updated 2 years ago
- clustering news, extract trending news stories☆12Updated 4 years ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆75Updated last year
- Semantic Search Engine using BERT embeddings☆33Updated 4 years ago
- A flask UI template for reviewing logs and reflections from wrappers_delight.☆9Updated last year
- Testing speed and cost of classification via LLM or via vector embeddings☆20Updated 2 years ago
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3☆13Updated 2 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML☆63Updated 6 months ago
- This repository serves as a collection of scrapers procuring and structuring various legal datasets☆17Updated 2 years ago
- Summarize text content into a Tweet-sized statement using OpenAI's GPT-3 based Davinci model☆23Updated last year
- Mining Legal Arguments in Court Decisions - Data and software☆68Updated 2 years ago
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, …☆83Updated 8 months ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.☆26Updated last year
- Expose a Top2Vec model with a REST API.☆91Updated 2 years ago
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆16Updated last year
- Writing Primer for Data Scientists☆18Updated 5 years ago
- Example for Logging LLM Evaluator Prompt Responses☆18Updated last year
- This is an application that automates the process of text analysis with a user-friendly GUI. 📱 It has been implemented using Python and …☆37Updated 3 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Updated 2 years ago
- Fully working applications that demonstrate how to use Haystack to implement various use cases☆125Updated last week
- StAtutory Reasoning Assessment☆14Updated 2 years ago
- Model for predicting categories of entities by its mentions☆29Updated 4 years ago
- FRAKE: Fusional Real-time Automatic Keyword Extraction☆21Updated 2 years ago
- ☆60Updated last year
- LegalCrawler: A tool for automated scraping of English legal corpora☆54Updated 2 years ago
- HDBSCAN Tuning for BERTopic Models☆48Updated 2 years ago