Blacksuan19 / scrapy-ai
Fully automated AI based web scraping.
☆33 Updated 10 months ago
Alternatives and similar repositories for scrapy-ai
Users that are interested in scrapy-ai are comparing it to the libraries listed below
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications ☆106 Updated last year
- Spider ported to Python ☆100 Updated 11 months ago
- This is a proof-of-concept of using an LLM to find and extract meaningful data without parsing the html too much. ☆30 Updated 2 years ago
- scraping and querying documents for LLMs ☆24 Updated 2 months ago
- Python SDK for Inngest: Durable functions and workflows in Python, hosted anywhere ☆167 Updated last week
- Python SDK for Browserbase ☆70 Updated 2 weeks ago
- ☆22 Updated last year
- Spider templates for automatic crawlers. ☆33 Updated 3 weeks ago
- A GPT powered CLI tool that answers questions about your data ☆97 Updated 2 years ago
- Docx tracked change redlines for the Python ecosystem. ☆95 Updated last year
- Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable p… ☆87 Updated last year
- A FastAPI extension for integrating common AI agent frameworks. ☆46 Updated 11 months ago
- ☆77 Updated 3 weeks ago
- Visual Studio Code extension to convert HTML to FastHTML FT ☆22 Updated 10 months ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi ☆79 Updated 4 years ago
- https://verdad.app ☆83 Updated last week
- A pattern to let you try several vector databases and change as little code as possible ☆38 Updated 2 years ago
- A Python-based parallel file chunking system designed for processing large codebases into LLM-friendly chunks. ☆46 Updated 4 months ago
- Handout for a talk I gave about LLM and CLI tools ☆62 Updated last year
- ☆54 Updated 8 months ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆155 Updated 2 weeks ago
- Semantic Search + Keyword Search + Hybrid Search + Filtering + Faceting on 300K HN Comments ☆55 Updated last year
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ☆29 Updated 2 years ago
- ☆20 Updated 9 months ago
- Repo to experiment with Graph RAG strategies using Kùzu ☆63 Updated 3 months ago
- RAG for any docs hosted on readthedocs ☆41 Updated last year
- List of entity resolution software and resources. ☆103 Updated 10 months ago
- simplifies the process of creating and managing LLM workflows. ☆114 Updated last year
- download and view the contents of a GitHub repository or a ZIP file as a single text file ☆43 Updated last year
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆40 Updated 2 years ago