commoncrawl / whirlwind-python
A whirlwind tour of Common Crawl's data using Python
☆17 Updated 5 months ago
Alternatives and similar repositories for whirlwind-python
Users who are interested in whirlwind-python are comparing it to the libraries listed below.
- Quality News - Towards a fairer ranking formula for Hacker News ☆82 Updated last month
- Turn your git commit history into a scientific log ☆46 Updated 3 months ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆16 Updated this week
- Open source scholarly literature search ☆16 Updated 8 months ago
- arXiv fragment loader plugin for https://llm.datasette.io/ ☆14 Updated 3 weeks ago
- Questions from the Ham Radio General pool ☆14 Updated last year
- Data conversion utility ☆39 Updated 2 years ago
- A Higher-Level, Composable SQL ☆44 Updated last week
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 8 months ago
- Create embeddings for LLM using the Nomic API ☆23 Updated 6 months ago
- A cli client for csvbase ☆48 Updated 10 months ago
- Optimum graph creation and distribution for underground networks. ☆34 Updated 11 months ago
- Source files for the Open, Transparent, and Reproducible Data Science Handbook ☆49 Updated last year
- Concatenated documentation for use with LLMs ☆36 Updated last week
- Datasette plugin for searching all searchable tables at once ☆24 Updated 9 months ago
- Python packaging scenarios ☆119 Updated this week
- Parallelism and preemptive concurrency for sporadic workloads ☆46 Updated 6 months ago
- xargs for semgrep ☆28 Updated last year
- Use triggers to track when rows in a SQLite table were updated or deleted ☆45 Updated 3 weeks ago
- Scripts to make specific datasets cleaner and more convenient ☆41 Updated 2 years ago
- GitHub statistics ☆12 Updated 2 years ago
- A diagram of my personal infrastructure ☆49 Updated 4 years ago
- A simple Python script to collate multiple PDFs into a single PDF. ☆26 Updated 8 months ago
- "llm python" is a command to run a Python interpreter in the LLM virtual environment ☆33 Updated last year
- Testing various image matching algorithms' performance on the Pinecone vector DB ☆43 Updated last year
- Streamable multi-format serialization with schema ☆22 Updated 5 months ago
- Read & write JavaScript values from Python with the V8 serialization format. ☆16 Updated 5 months ago
- Slipstream provides a data-flow model to simplify development of stateful streaming applications. ☆36 Updated last month
- The Endatabas Book ☆16 Updated 9 months ago
- A probabilistic approximate DNF counter ☆37 Updated this week