ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Written in Python using the NLP packages NLTK and spaCy.
☆16Updated 7 years ago
Alternatives and similar repositories for document-search-engine
Users that are interested in document-search-engine are comparing it to the libraries listed below
- Document Search Engine project with TF-IDF and Google Universal Sentence Encoder model☆54Updated 2 years ago
- Document Search Engine Tool☆74Updated 2 years ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆77Updated last year
- Write beautifully short contracts. https://reference.legal/ is a referenceable clause library to standardize contracts once and for all.☆13Updated 3 years ago
- An app that extracts your twitter threads into a downloadable CSV file.☆11Updated 2 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆25Updated 2 years ago
- Sample datasets of over 400 Instagram coding influencers☆13Updated 6 months ago
- Semantic Search Engine using BERT embeddings☆33Updated 4 years ago
- This repository serves as a collection of scrapers procuring and structuring various legal datasets☆18Updated 2 years ago
- Simple pdf to text with python using PDFtk and PyPDF2☆21Updated last year
- Mining Legal Arguments in Court Decisions - Data and software☆69Updated 2 years ago
- Expose a Top2Vec model with a REST API.☆92Updated 2 years ago
- Solve Geometric & Graph Problems with Large Language Models☆33Updated 2 years ago
- LegalCrawler: A tool for automated scraping of English legal corpora☆56Updated 3 years ago
- A dataset for pretraining language models targeted for legal tasks.☆139Updated 3 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML☆63Updated 7 months ago
- ☆13Updated 3 years ago
- Simple rule-based named entity recognition☆42Updated 3 years ago
- clustering news, extract trending news stories☆12Updated 4 years ago
- Developing a Knowledge Graph-based Question and Answering program to extract information from a huge dataset☆95Updated 2 years ago
- Template Extraction from unstructured Wikipedia text using NLP techniques.☆41Updated 5 years ago
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆16Updated last year
- CaseText Court Case analysis with fine-tuned BERT Transformer☆15Updated 5 years ago
- ☆64Updated last year
- Vectorizing knowledge bases for entity linking☆15Updated 4 years ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.☆26Updated last year
- Kelvin Legal Data OS - Public Examples☆19Updated last year
- Model for predicting categories of entities by their mentions☆29Updated 4 years ago
- Text similarity using BERT sentence embeddings☆20Updated 5 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo…☆47Updated 2 years ago