Butch78 / 1BillionRowChallenge
I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row Challenge for Java, so naturally I tried implementing a solution in Python and Rust, mainly using Polars.
☆14 · Updated last year
Alternatives and similar repositories for 1BillionRowChallenge
Users who are interested in 1BillionRowChallenge are comparing it to the repositories listed below.
- Time series forecasting with DuckDB and Evidence ☆41 · Updated 8 months ago
- Demo that extends the FastUI example & adds database persistence ☆15 · Updated last year
- ☆46 · Updated 2 weeks ago
- Getting started with DuckDB, by Packt Publishing ☆58 · Updated 11 months ago
- ☆37 · Updated 2 weeks ago
- Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter not… ☆39 · Updated last year
- A Python-based parallel file chunking system designed for processing large codebases into LLM-friendly chunks. ☆41 · Updated this week
- Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python. ☆86 · Updated 7 months ago
- Analyzing hacker news in real-time with Bytewax and Proton ☆39 · Updated last year
- Compression suite for data frames and tabular data files, csv, excel etc. Using LZHW algorithm. ☆30 · Updated 11 months ago
- Concatenated documentation for use with LLMs ☆40 · Updated last week
- A dev container with ollama and ollama examples with the Python OpenAI SDK ☆53 · Updated 11 months ago
- Streamable multi-format serialization with schema ☆22 · Updated 7 months ago
- Cloud Benchmarker automates performance testing of cloud instances, offering insightful charts and tracking over time. ☆34 · Updated last year
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications ☆98 · Updated 9 months ago
- ☆27 · Updated 10 months ago
- Deploying a FastAPI application to Cloudflare Workers with uv. ☆71 · Updated 3 weeks ago
- pglineage is a tool to create data flow diagrams for PostgreSQL by analyzing SQL ☆17 · Updated last year
- LLM plugin for models hosted by Anyscale Endpoints ☆33 · Updated last year
- A Python library for real-time PostgreSQL event-driven cache invalidation. ☆22 · Updated 3 months ago
- Handout for a talk I gave about LLM and CLI tools ☆63 · Updated last year
- DuckDB Community Extension to prompt LLMs from SQL ☆49 · Updated 6 months ago
- Slipstream provides a data-flow model to simplify development of stateful streaming applications. ☆38 · Updated 2 months ago
- CLI for running files through AWS Textract ☆54 · Updated last year
- Chatroom app where messages are sent to GPT, Claude, Mistral, Together, Grok, Groq, Google, vLLM, Ollama & streamed to the frontend. ☆40 · Updated 3 weeks ago
- Build complete API integrations with YAML and SQL. Rapid development without vendor lock-in and per-row costs. ☆83 · Updated last month
- A text-to-SQL prototype on the northwind sqlite dataset ☆13 · Updated 9 months ago
- Duckdb extension for parsing the metadata and contents of the embedded data mode in PowerBI pbix files ☆23 · Updated last month
- Find Python Packages on PyPI with the help of vector embeddings ☆47 · Updated last month
- A high-performance data streaming system using DuckDB and Apache Arrow Flight. ☆83 · Updated 4 months ago