Butch78 / 1BillionRowChallenge
I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java, so naturally I tried implementing a solution in Python & Rust, mainly using Polars.
☆14 Updated last year
Alternatives and similar repositories for 1BillionRowChallenge
Users who are interested in 1BillionRowChallenge are comparing it to the repositories listed below.
- Analyzing hacker news in real-time with Bytewax and Proton ☆39 Updated last year
- Time series forecasting with DuckDB and Evidence ☆39 Updated 7 months ago
- A Python-based parallel file chunking system designed for processing large codebases into LLM-friendly chunks. ☆39 Updated last week
- ☆37 Updated last week
- ☆27 Updated 8 months ago
- Slipstream provides a data-flow model to simplify development of stateful streaming applications. ☆36 Updated last month
- Streamable multi-format serialization with schema ☆22 Updated 5 months ago
- Sequor is a SQL-centric platform for building API integrations without lock-in and black boxes. Fuses API execution with SQL logic to pro… ☆65 Updated this week
- Run transcriptions using the OpenAI Whisper API ☆24 Updated 7 months ago
- A dev container with ollama and ollama examples with the Python OpenAI SDK ☆52 Updated 9 months ago
- pglineage is a tool to create data flow diagrams for PostgreSQL by analyzing SQL ☆16 Updated last year
- Plugin for LLM adding support for Google's PaLM 2 model ☆14 Updated last year
- Example usages of the Scaffoldly toolchain. ☆16 Updated 5 months ago
- Lightweight, open source, locally-hosted Modern Data Stack ☆15 Updated 2 months ago
- ☆36 Updated this week
- Sculpt: Structuring unstructured data with LLMs ☆32 Updated last month
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 8 months ago
- Versatile Metrics Collection for Python ☆19 Updated last year
- Create embeddings for LLM using the Nomic API ☆23 Updated 6 months ago
- A Python library for real-time PostgreSQL event-driven cache invalidation. ☆22 Updated last month
- A simple Python script to collate multiple PDFs into a single PDF. ☆26 Updated 7 months ago
- Feature selection for tabular datasets using advanced filter and wrapper methods ☆17 Updated 2 months ago
- A lightweight code assistant with tool-using capabilities built on HuggingFace's smolagents. ☆23 Updated this week
- convert natural language into technical diagrams ☆14 Updated 5 months ago
- Display version and compression information about a parquet file ☆23 Updated this week
- A CLI that gives you more granular control over bulk deletion of your Github gists. ☆13 Updated 4 months ago
- API Framework heavily relying on the power of DuckDB and DuckDB extensions. Ready to build performant and cost-efficient APIs on top of B… ☆29 Updated this week
- Cloud Benchmarker automates performance testing of cloud instances, offering insightful charts and tracking over time. ☆35 Updated last year
- DuckDB Extension for cryptographic hash functions and HMAC ☆19 Updated last month
- Concatenated documentation for use with LLMs ☆36 Updated last week