commoncrawl / whirlwind-python
A whirlwind tour of Common Crawl's data using Python
☆17 Updated 3 months ago
Alternatives and similar repositories for whirlwind-python:
Users who are interested in whirlwind-python are comparing it to the repositories listed below.
- Quality News - Towards a fairer ranking formula for Hacker News ☆81 Updated 3 weeks ago
- create local malicious git repo ☆50 Updated this week
- A diagram of my personal infrastructure ☆48 Updated 4 years ago
- Walk the AST for every callable in your code. ☆19 Updated 3 months ago
- Git scrapers for scraping the fediverse ☆16 Updated this week
- Podlite specification documents ( v1.0 released 🎉 ) ☆23 Updated last month
- A probabilistic approximate DNF counter ☆36 Updated 11 months ago
- Concatenated documentation for use with LLMs ☆17 Updated last month
- Create matplotlib visualizations from the command-line ☆50 Updated 2 years ago
- Open source scholarly literature search ☆16 Updated 6 months ago
- Create embeddings for LLM using the Nomic API ☆23 Updated 4 months ago
- Git worktree navigator ☆28 Updated last year
- Parallelism and preemptive concurrency for sporadic workloads ☆46 Updated 4 months ago
- Turn your git commit history into a scientific log ☆45 Updated last month
- A cli client for csvbase ☆48 Updated 8 months ago
- A Higher-Level, Composable SQL ☆43 Updated this week
- CLI tool for exploring arXiv (inspired by karpathy's brilliant ArXiv Sanity Preserver) ☆39 Updated last month
- Tools for running OCR against files stored in S3 ☆119 Updated 2 years ago
- The Endatabas Book ☆16 Updated 7 months ago
- "llm python" is a command to run a Python interpreter in the LLM virtual environment ☆31 Updated last year
- Beating the `bisect` module's implementation using C-extensions. ☆30 Updated last year
- Web interface for searching your code using ripgrep, built as a Datasette plugin ☆74 Updated last year
- ☆13 Updated last year
- Handy decorator for elegant design-by-contract in 3.10+ ☆102 Updated 2 years ago
- A [personal]<-[notebook]->[network]. Complete with custom numerics for constrained Gaussian gravitation physics. ☆22 Updated 3 years ago
- 🗄️ A simple CLI for converting WARC to Parquet. ☆109 Updated last month
- Tools for running enrichments against data stored in Datasette ☆23 Updated 2 months ago
- Testing various image matching algorithms' performance on the Pinecone vector DB ☆43 Updated last year
- Scale to zero Seafowl hosting with Cloud Run ☆38 Updated last year
- Optimum graph creation and distribution for underground networks. ☆33 Updated 9 months ago