ScJa / document-search-engine
A fast document ranking engine using BM25 and TF-IDF. Built in Python with the NLP packages NLTK and spaCy.
☆15Updated 6 years ago
Alternatives and similar repositories for document-search-engine:
Users who are interested in document-search-engine are comparing it to the libraries listed below:
- Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model☆53Updated last year
- Semantic Search Engine using BERT embeddings☆33Updated 4 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆26Updated 2 years ago
- Information Retrieval system built by BERT and elasticsearch☆14Updated 5 years ago
- An evaluation of word-embeddings for classification☆32Updated 6 years ago
- Template Extraction from unstructured Wikipedia text using NLP techniques.☆41Updated 4 years ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models…☆37Updated 2 years ago
- StAtutory Reasoning Assessment☆13Updated 2 years ago
- sequence tagging with spaCy and crfsuite☆19Updated last year
- Graph databases, Knowledge Graphs, SPARQL☆76Updated 3 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of …☆61Updated 4 years ago
- Mining Legal Arguments in Court Decisions - Data and software☆66Updated last year
- This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…☆14Updated 3 years ago
- Using questions to summarize large amounts of textual data.☆25Updated 4 years ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆70Updated 8 months ago
- LegalCrawler: A tool for automated scraping of English legal corpora☆53Updated 2 years ago
- A T5 based sequence generation model for WikiSQL task. Achieving 90.3% on test data set using sequence generation.☆17Updated 4 years ago
- ☆28Updated 4 years ago
- Explainable Zero-Shot Topic Extraction☆62Updated 6 months ago
- ☆92Updated 2 years ago
- Ranking documents using semantic similarity in Python☆35Updated 4 years ago
- Language detection using spaCy and fastText☆55Updated last year
- Sample datasets of over 400 Instagram coding influencers☆11Updated 2 months ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using …☆17Updated 3 years ago
- An implementation of bidirectional LSTM-CRF for Named Entity Relationship on custom corpus with custom word embeddings☆13Updated 5 years ago
- Question classification for the CogComp QC dataset (http://cogcomp.org/Data/QA/QC/).☆29Updated 4 years ago
- Build a deep learning model for predicting the named entities from text.☆56Updated 6 years ago
- Text similarity using BERT sentence embeddings☆20Updated 4 years ago
- Conversational dataset from the Chit-Chat Challenge☆25Updated last year
- No Teacher BART distillation experiment for NLI tasks☆27Updated 4 years ago