Butch78 / 1BillionRowChallenge
I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java, so naturally I tried implementing a solution in Python & Rust, mainly using polars.
☆14 · Updated last year
Alternatives and similar repositories for 1BillionRowChallenge
Users who are interested in 1BillionRowChallenge are comparing it to the repositories listed below.
- I will be adding code for different kinds of open-source data extraction tools using Python ☆10 · Updated 7 months ago
- Create embeddings for LLM using the Nomic API ☆23 · Updated 7 months ago
- A Python-based parallel file chunking system designed for processing large codebases into LLM-friendly chunks. ☆41 · Updated last month
- Have UV deal with all your Jupyter deps. ☆26 · Updated 9 months ago
- recipes for BASH, Docker and more ☆13 · Updated 4 months ago
- ☆27 · Updated 9 months ago
- Time series forecasting with DuckDB and Evidence ☆39 · Updated 7 months ago
- Analyzing hacker news in real-time with Bytewax and Proton ☆39 · Updated last year
- Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter not… ☆39 · Updated last year
- Quick overview of duckdb, pandas and polars through a simple data pipeline. ☆14 · Updated 2 years ago
- A CLI that gives you more granular control over bulk deletion of your GitHub gists. ☆13 · Updated 5 months ago
- Scripts and ideas to manage tons and tons of images and movies ☆17 · Updated 3 months ago
- Semantic caching layer for your LLM applications. Reuse responses and reduce token usage. ☆82 · Updated this week
- Concatenated documentation for use with LLMs ☆38 · Updated last week
- ☆37 · Updated last week
- Streamable multi-format serialization with schema ☆22 · Updated 6 months ago
- Run transcriptions using the OpenAI Whisper API ☆24 · Updated 8 months ago
- A monorepo of many Rill example projects ☆39 · Updated this week
- Secure, locally-run Retrieval-Augmented Generation system for document-based question-answering, utilizing Llama 3, Mistral, and Gemini m… ☆24 · Updated 8 months ago
- Cloud Benchmarker automates performance testing of cloud instances, offering insightful charts and tracking over time. ☆33 · Updated last year
- "llm python" is a command to run a Python interpreter in the LLM virtual environment ☆34 · Updated last year
- Sculpt: Structuring unstructured data with LLMs ☆33 · Updated 2 months ago
- arXiv fragment loader plugin for https://llm.datasette.io/ ☆15 · Updated last month
- Demo that extends the FastUI example & adds database persistence ☆15 · Updated last year
- Deploying a FastAPI application to Cloudflare Workers with uv. ☆62 · Updated this week
- scraping and querying documents for LLMs ☆22 · Updated 3 weeks ago
- We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural language prompts. Using a 200 million record … ☆38 · Updated 2 weeks ago
- Smart reproducible analytical pipeline inspection ☆17 · Updated 2 months ago
- Dashb.io - Minimalist's Dashboard and Widgets. ☆14 · Updated last year
- Lightweight, open source, locally-hosted Modern Data Stack ☆15 · Updated 2 months ago