A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform.
☆13Jul 27, 2025Updated 7 months ago
Alternatives and similar repositories for self-hosted
Users that are interested in self-hosted are comparing it to the libraries listed below
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…☆15Feb 23, 2026Updated last week
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …☆20Oct 13, 2025Updated 4 months ago
- A Gradio web UI for Large Language Models. Supports LoRA/QLoRA fine-tuning, RAG (retrieval-augmented generation), and chat.☆36Nov 26, 2023Updated 2 years ago
- NDIToolbox is an open source extensible signal and image processing application under development by TRI/Austin designed to assist with t…☆10Aug 19, 2018Updated 7 years ago
- Emotion based music recommender system☆11Mar 26, 2025Updated 11 months ago
- A Next.js chat app to use Llama 2 locally using node-llama-cpp☆12Oct 27, 2024Updated last year
- Rivet plugin to access E2B goodies☆10Feb 6, 2025Updated last year
- In-browser semantic search demo using EmbeddingGemma and Transformers.js. No server required.☆30Sep 7, 2025Updated 5 months ago
- A Jekyll template for easy creation of course websites. Check out the template here:☆11Aug 1, 2024Updated last year
- This plugin provides tools to extract text from a document using the Azure AI Document Intelligence service.☆12Jan 17, 2025Updated last year
- Templates for musical textual inversion for riffusion☆11Apr 14, 2023Updated 2 years ago
- Write your next novel faster and easier☆15Dec 7, 2025Updated 2 months ago
- Huggingface Backup - Jupyter, Colab and Python Script☆10Jan 20, 2026Updated last month
- Gradio chat interface for FastMLX☆12Sep 22, 2024Updated last year
- Caddy module: dns.providers.gandi☆17Jul 15, 2025Updated 7 months ago
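- A dynamic hypergraph knowledge retrieval system built on a dynamic graph database. Features: a five-way retrieval core (vector semantics, BM25 keywords, dynamic graph reasoning, contextual association, multi-hop entity reasoning), real-time evolution of all attributes, an intelligent self-maintenance mechanism based on agent semantic overlap rate, and a lightweight hypergraph architecture.☆20Oct 10, 2025Updated 4 months ago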
- A detailed implementation of long-term memory handling in agentic AI☆36Oct 9, 2025Updated 4 months ago
- A knowledge graph based forward chain inferencing engine in typescript/node.☆11Jan 23, 2021Updated 5 years ago
- Streamlines the creation of datasets for training a Large Language Model with instruction-input-output triplets. The default configuration …☆13Apr 17, 2023Updated 2 years ago
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy"☆17Feb 25, 2025Updated last year
- This project aims to utilize Generative AI for the next marketing strategy in the case of e-commerce customer segmentation.☆12Mar 19, 2024Updated last year
- ☆11Aug 26, 2024Updated last year
- Deepractice Role System☆31Updated this week
- Code repository for TIDMAD: Time series Dataset for Discovering Dark Matter with AI Denoising.☆15Oct 23, 2025Updated 4 months ago
- [ICLR 2025 SSI-FM] Self-Taught Self-Correction for Small Language Models☆11Sep 19, 2025Updated 5 months ago
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- Guide to Installing Ragflow on Google Cloud Compute Engine☆13Sep 12, 2024Updated last year
- ☆10Sep 26, 2025Updated 5 months ago
- ☆12Feb 16, 2026Updated 2 weeks ago
- Easy OpenCV Python Object Tracking Application using selectROI☆16Jun 9, 2020Updated 5 years ago
- ☆12Jul 29, 2025Updated 7 months ago
- Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more!☆18Dec 22, 2025Updated 2 months ago
- This repo contains code demonstrating how to use LlamaEdge RAG to build a RAG app☆13May 15, 2024Updated last year
- Automate the batch upload and parsing of documents into Dify's knowledge base, reducing manual intervention and wait time.☆15Aug 29, 2024Updated last year
- Empowering Tomorrow Together: Your Community-Powered AI Platform☆14Aug 19, 2024Updated last year
- page and script for generating chladni figures☆11Jul 19, 2017Updated 8 years ago
- My Gen AI research☆11Jun 3, 2024Updated last year
- The Web Metadata Extraction Toolkit is designed to streamline the process of extracting, cleaning, and analyzing metadata from websites. …☆18Jul 8, 2024Updated last year
- ComfyUI custom node for FunAudioLLM, including CosyVoice2, SenseVoice, and InspireMusic, with BreezyVoice support☆24Jul 6, 2025Updated 7 months ago