ashvardanian / cuda-python-starter-kit
Parallel-computing starter project for building GPU and CPU kernels in CUDA and C++ and calling them from Python via PyBind11, without a single line of CMake
☆18 Updated 5 months ago
Alternatives and similar repositories for cuda-python-starter-kit:
Users who are interested in cuda-python-starter-kit are comparing it to the repositories listed below
- Learning how to write "Less Slow" code in Python, from numerical micro-kernels to coroutines, ranges, and polymorphic state machines ☆29 Updated 3 weeks ago
- I have no idea what I'm doing, but llm.c in Rust ☆12 Updated 7 months ago
- A list of awesome resources and blogs on topics related to Unum ☆34 Updated 4 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆53 Updated last year
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆45 Updated last week
- Convert MUSE from TensorFlow to PyTorch and ONNX ☆11 Updated 8 months ago
- Better bindings for Python ☆17 Updated 2 years ago
- Tiny Semantic Versioning (SemVer) library with LLMs and GitHub CI, that doesn't depend on 300K lines of JavaScript code and fits in a sin… ☆21 Updated last month
- Example ML projects that use the Determined library. ☆26 Updated 5 months ago
- Make triton easier ☆44 Updated 8 months ago
- PostText is a QA system for querying your text data. When appropriate structured views are in place, PostText is good at answering querie… ☆31 Updated last year
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆43 Updated last week
- LLM training in simple, raw C/CUDA ☆91 Updated 9 months ago
- Learning Unum's efficient data-processing tools one cool project at a time ☆11 Updated last year
- Training hybrid models for dummies. ☆20 Updated last month
- ☆22 Updated this week
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆120 Updated this week
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers ☆130 Updated 2 months ago
- Vector Database with support for late interaction and token level embeddings. ☆52 Updated 4 months ago
- Efficiently computing & storing token n-grams from large corpora ☆18 Updated 4 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆49 Updated 5 months ago
- ☆21 Updated 3 months ago
- Exploration of Vector database Index for fast approximate nearest neighbour search. ☆19 Updated 6 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆15 Updated 10 months ago
- A file utility for accessing both local and remote files through a unified interface. ☆37 Updated last month
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 4 months ago
- Cortex-compatible model server for Python and TensorFlow ☆17 Updated 2 years ago