A whirlwind tour of Common Crawl's data using Python
☆35 · Feb 17, 2026 · Updated last week
Alternatives and similar repositories for whirlwind-python
Users who are interested in whirlwind-python are comparing it to the repositories listed below.
- Add your configs for tmux ☆18 · Apr 3, 2022 · Updated 3 years ago
- A tool for detecting viruses and NSFW material in WARC files ☆17 · Dec 16, 2025 · Updated 2 months ago
- Create a playlist on Spotify by writing it as a Terraform configuration ☆25 · Mar 8, 2023 · Updated 2 years ago
- Software Engineering Back End Microservices Project ☆15 · Nov 20, 2024 · Updated last year
- ☆17 · Feb 20, 2026 · Updated last week
- Platform services OCP project registry ☆18 · Updated this week
- MATLAB/Octave generator of Hamming ECC coding. Output format is Verilog HDL. ☆12 · Dec 27, 2022 · Updated 3 years ago
- Like Prometheus, but for logs. ☆10 · Updated this week
- A comprehensive ELT pipeline for analyzing passenger satisfaction data. Features a modern data architecture with Apache Airflow for extra… ☆12 · Oct 5, 2025 · Updated 4 months ago
- script to recursively check, standardize, import and export embedded and external synced and unsynced lyrics of audio files ☆18 · Mar 17, 2025 · Updated 11 months ago
- ☆12 · Aug 29, 2020 · Updated 5 years ago
- CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop ☆37 · Dec 17, 2024 · Updated last year
- ☆12 · Updated this week
- ☆12 · Jan 6, 2023 · Updated 3 years ago
- UserScript Extension To Automatically Emulate Human Typing In Google Docs ☆18 · Jan 11, 2025 · Updated last year
- An open source, catch-all replacement to websites like TappedOut, MTG Goldfish, DeckStats, DeckBox, TCGPlayer and any other website that … ☆10 · Oct 15, 2024 · Updated last year
- Associated blog post - https://tristanrhodes.com/blog/Adventures-in-Algorithmic-Trading-on-the-Runescape-Grand-Exchange ☆10 · Oct 14, 2024 · Updated last year
- Datasette plugin for working with Apple's binary plist format ☆13 · Feb 17, 2023 · Updated 3 years ago
- An automatic question generation system using rule based NLP processing techniques. ☆10 · Feb 9, 2020 · Updated 6 years ago
- Summarize and ask questions about items in the Internet Archive ☆18 · Apr 1, 2023 · Updated 2 years ago
- The Official Lando Pantheon plugin. ☆14 · Feb 22, 2026 · Updated last week
- ☆11 · Oct 21, 2024 · Updated last year
- Quickly insert image in neovim ☆10 · May 11, 2024 · Updated last year
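- Chinese typo correction tool. Corrects phonetically similar and visually similar wrong characters (or variant characters); can be used to correct errors from Chinese pinyin and stroke input methods. Developed in Python. ☆10 · Mar 5, 2018 · Updated 7 years ago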
- CveBinarySheet: A Comprehensive Pre-built Binaries Database Focused on IoT Vulnerability Scenarios ☆15 · Jan 17, 2025 · Updated last year
- This is a demonstration of k-means algorithm on customer data of a mall (also known as customer segmentation). ☆14 · Feb 19, 2021 · Updated 5 years ago
- ☆17 · Updated this week
- The goal of this project was to develop a chat-bot based data collection tool. It asks users questions through a validated alignment surv… ☆13 · Feb 20, 2026 · Updated last week
- ☆11 · Jan 20, 2017 · Updated 9 years ago
- A testing framework and a set of test suites used for testing GCE Images. ☆15 · Updated this week
- c++/Qt library for displaying and interacting with cellular automata ☆11 · Aug 15, 2021 · Updated 4 years ago
- Upload SQLite database files to Datasette ☆14 · Nov 10, 2025 · Updated 3 months ago
- Rewrapping FieryIceStickie's Deobfuscation Tools ☆11 · Feb 2, 2026 · Updated last month
- A library of prompts, intended for LibreChat. Note: Archived ☆11 · Aug 14, 2023 · Updated 2 years ago
- media.ccc.de media library application for webOS. Think of it as a Netflix for hackers ☆16 · Jan 18, 2026 · Updated last month
- ☆16 · Nov 26, 2024 · Updated last year
- this is less copy plugin ☆11 · Mar 29, 2023 · Updated 2 years ago
- vim-bootstrap plugin to upgrade ☆14 · Nov 7, 2021 · Updated 4 years ago
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries". ☆12 · Jan 9, 2025 · Updated last year