ScJa / document-search-engine
A really fast document ranking engine using BM25 and TF-IDF. Based on Python using NLP packages NLTK and spaCy.
☆16Updated 7 years ago
Alternatives and similar repositories for document-search-engine
Users that are interested in document-search-engine are comparing it to the libraries listed below
- Document Search Engine project with TF-IDF and Google Universal Sentence Encoder model☆54Updated 2 years ago
- Information Retrieval system built with BERT and Elasticsearch☆14Updated 5 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated…☆25Updated 2 years ago
- Sample datasets of over 400 Instagram coding influencers☆12Updated 4 months ago
- Semantic Search Engine using BERT embeddings☆33Updated 4 years ago
- An app that extracts your Twitter threads into a downloadable CSV file.☆10Updated 2 years ago
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using …☆17Updated 4 years ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆73Updated last year
- Developing a Knowledge Graph-based Question and Answering program to extract information from a huge dataset☆96Updated 2 years ago
- semantically distinct key phrase extraction using hilbert hashes.☆50Updated 3 years ago
- new skills taxonomy using TextKernel data☆33Updated 2 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo…☆47Updated last year
- clustering news, extract trending news stories☆12Updated 4 years ago
- Graph databases, Knowledge Graphs, SPARQL☆80Updated 3 years ago
- ☆28Updated 4 years ago
- Question classification for the CogComp QC Dataset - [ http://cogcomp.org/Data/QA/QC/ ].☆29Updated 4 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆33Updated 2 years ago
- Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Sear…☆86Updated 4 years ago
- ☆70Updated 4 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML☆63Updated 6 months ago
- Tutorial and template for a semantic search app powered by the Atlas Embedding Database, Langchain, OpenAI and FastAPI☆115Updated last year
- Expose a Top2Vec model with a REST API.☆90Updated 2 years ago
- A personal knowledge base that I can dump information into and that helps me learn☆24Updated last month
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆51Updated 4 months ago
- This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R…☆14Updated 3 years ago
- Testing speed and cost of classification via LLM or via vector embeddings☆20Updated last year
- Extracting narrative timelines (i.e. order and timing of events) from text☆20Updated 6 years ago
- AI models for automatic job application pipeline (user CV, job description analysis (customized NER/spaCy) and artificial cover letter ge…☆36Updated last year
- Document Search Engine Tool☆73Updated 2 years ago
- Named entity recognition for the legal domain☆42Updated 4 years ago