hannesrabo / simple-search-engine
Indexing project that indexes a portion of the web using Spark, Hadoop, and Cassandra.
☆21, updated 5 years ago
Alternatives and similar repositories for simple-search-engine
Users who are interested in simple-search-engine are comparing it to the repositories listed below.
- Web crawling & scraping framework for Node.js on top of headless Chrome browser ☆19, updated last year
- Index Common Crawl archives in tabular format ☆122, updated last month
- Common crawl extractor ☆76, updated last year
- CLI to verify whether an email address is deliverable. Uses SMTP to validate email addresses without sending an email. ☆23, updated 3 months ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆130, updated 5 months ago
- Hugging Face's Zapier Integration 🤗⚡️ ☆47, updated 2 years ago
- Parallel wasm Barnes-Hut t-SNE implementation written in Rust. ☆21, updated last year
- The code that runs my blog: https://blog.gpt4.org/ ☆9, updated 3 years ago
- 🌿 The search engine in your codebase ☆20, updated 3 years ago
- Geniusrise: Framework for building geniuses ☆60, updated last year
- ImageBind: One Embedding Space to Bind Them All ☆25, updated 2 years ago
- A demo that shows how to build a semantic search experience with Typesense's vector search feature and Instantsearch.js ☆27, updated last year
- Coldbrew is Python compiled into JavaScript using Emscripten. ☆31, updated 2 years ago
- Semantic Code Search Using Vectorized Abstract Syntax Trees ☆17, updated last year
- Automagically generates summaries from HTML or text. ☆66, updated 2 years ago
- Generate product descriptions, blogs, ads, and more using GPT architecture with a single request to the TextCortex API, a.k.a. Hemingwai ☆40, updated 2 years ago
- An Infr app that helps you replay & talk to everything you've ever seen. ☆16, updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆45, updated last year
- Utilities for loading and running text embeddings with ONNX ☆44, updated 10 months ago
- Benson turns a list of URLs into mp3s of the contents of each web page - take control over your reading backlog! ☆14, updated 7 months ago
- A Python utility for downloading Common Crawl data ☆240, updated 2 years ago
- Training & implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆49, updated 2 years ago
- Stream of my favorite papers and links ☆41, updated 3 months ago
- An editor component that compiles, executes, and returns the outputs of any user-written code via Pyodide/Emscripten/WebAssembly. ☆39, updated last year
- An Infr app that automates data collection from your PC, macOS or Linux client. ☆11, updated last year
- AI Agent capable of automating various tasks using MCP ☆37, updated 2 months ago
- ChatGPT Plugin to Semantically Search Google Maps ☆45, updated 2 years ago