ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Written in Python using the NLP packages NLTK and spaCy.
☆15Updated 7 years ago
Alternatives and similar repositories for document-search-engine
Users who are interested in document-search-engine are comparing it to the libraries listed below:
- Document Search Engine Tool☆77Updated 3 years ago
- Document Search Engine project with TF-IDF and Google's Universal Sentence Encoder model☆55Updated 2 years ago
- An app that extracts your twitter threads into a downloadable CSV file.☆11Updated 2 years ago
- Sample datasets of over 400 Instagram coding influencers☆13Updated 11 months ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆78Updated last year
- Semantic Search Engine using BERT embeddings☆33Updated 5 years ago
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆16Updated 2 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML☆65Updated last year
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆26Updated 3 years ago
- A dataset for pretraining language models targeted for legal tasks.☆141Updated 3 years ago
- The official gpt4free repository | various collection of powerful language models☆10Updated last year
- This repository serves as a collection of scrapers procuring and structuring various legal datasets☆18Updated 2 years ago
- Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all.☆13Updated 3 years ago
- Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) for Legal☆100Updated 2 years ago
- ☆62Updated 2 years ago
- This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…☆16Updated 3 years ago
- CaseText Court Case analysis with fine-tuned BERT Transformer☆14Updated 5 years ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se…☆156Updated 6 months ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆53Updated 10 months ago
- GPT-3 Chatbot with long-term memory and external sources. Original work & inspiration by @daveshap☆17Updated 3 years ago
- Uses Python, NLTK, and NLP techniques to build a search engine that ranks documents for a search keyword, based on TF-IDF weights and cosin…☆17Updated 8 years ago
- LegalCrawler: A tool for automated scraping of English legal corpora☆59Updated 3 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts.☆40Updated 6 years ago
- Tutorial and template for a semantic search app powered by the Atlas Embedding Database, Langchain, OpenAI and FastAPI☆114Updated 2 years ago
- COVID-19 Open Research Dataset (CORD-19) Analysis☆57Updated 3 years ago
- Lobe is the world's first AI paralegal.☆51Updated 3 years ago
- This is a proof-of-concept of using an LLM to find and extract meaningful data without parsing the html too much.☆30Updated 2 years ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.☆26Updated 2 years ago
- TextReducer - A Tool for Summarization and Information Extraction☆85Updated last year
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, …☆87Updated last year