JINO-ROHIT/fastcv

fastcv is a CUDA rewrite of the OpenCV filters with Python bindings

☆73

Alternatives and similar repositories for fastcv

Users who are interested in fastcv are comparing it to the libraries listed below.

JINO-ROHIT / kernels
writing really fast kernels
☆19 · Jul 15, 2026 · Updated 2 weeks ago
JINO-ROHIT / Tune-RAG-Parameters-With-LlamaIndex
☆18 · Jun 26, 2024 · Updated 2 years ago
JINO-ROHIT / ml-math-in-depth
☆15 · Jul 25, 2025 · Updated last year
JINO-ROHIT / nano-paged-attention
a minimal paged attention implementation
☆20 · Jan 30, 2026 · Updated 5 months ago
Better-Call-Paul / blackwell_gemm
☆19 · Apr 26, 2026 · Updated 3 months ago
JINO-ROHIT / tachyon
an LLM inference engine to run on consumer hardware
☆47 · Apr 15, 2026 · Updated 3 months ago
Electron-Labs / cw-eth2-lc
Eth2 light client implementation as a cosmwasm smart contract
☆11 · Nov 24, 2023 · Updated 2 years ago
YJMSTR / flash-linear-attention
FLA but cuTile
☆27 · Apr 17, 2026 · Updated 3 months ago
camenduru / MoE-LLaVA-jupyter
☆17 · Feb 1, 2024 · Updated 2 years ago
xlite-dev / qwen-image-fast
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
☆17 · Oct 24, 2025 · Updated 9 months ago
KuangjuX / cuda-evolve-oss
Autonomous GPU kernel optimization system driven by AI agents.
☆31 · Mar 29, 2026 · Updated 4 months ago
TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 · Nov 19, 2024 · Updated last year
JINO-ROHIT / inferGPT
a simple C++ inference engine for GPT-based architectures
☆40 · Dec 10, 2025 · Updated 7 months ago
zhaochenyang20 / sglang-diffusion-routing
A demonstrative example of running SGLang Diffusion with DP router
☆17 · Mar 15, 2026 · Updated 4 months ago
YuvrajSingh-mist / smolcluster
An educational distributed training and inference library for neural nets using local computing
☆74 · Jun 10, 2026 · Updated last month
SaschaWillems / VulkanTemplate2D
Testing a light Vulkan 1.3 abstraction for 2D games
☆11 · May 16, 2026 · Updated 2 months ago
Infatoshi / docs.md
☆92 · Dec 16, 2025 · Updated 7 months ago
tile-ai / AttentionEngine
☆52 · May 19, 2025 · Updated last year
microsoft / TileIR
☆31 · Feb 28, 2025 · Updated last year
JINO-ROHIT / advanced_ml
☆132 · Dec 9, 2025 · Updated 7 months ago
abdimoallim / cuda-utils
Collection of utilities for CUDA programming
☆18 · Aug 4, 2025 · Updated 11 months ago
chene77 / RobartsICP
ICP implementations including Robust-ICP and Anisotropic-Scaled ICP
☆12 · Feb 17, 2015 · Updated 11 years ago
RightNow-AI / rightnow-cli
Claude Code for CUDA. Free AI assistant that actually understands GPU architecture
☆112 · Oct 10, 2025 · Updated 9 months ago
suoarski / InitialisingEarth
Tectonic plates simulation of Earth as part of my research project
☆11 · Dec 14, 2022 · Updated 3 years ago
opendatahub-io / vllm-tgis-adapter
vLLM adapter for a TGIS-compatible gRPC server.
☆56 · Updated this week
dsl-learn / cuda-magic
fake CUTLASS to get performance
☆26 · Apr 28, 2026 · Updated 3 months ago
melonedo / algebraic-layouts
☆23 · Aug 20, 2025 · Updated 11 months ago
Dao-AILab / gemm-cublas
☆22 · May 5, 2025 · Updated last year
postmalloc / skeletonide
Skeletonide is a parallel implementation of Zhang-Suen morphological thinning algorithm written in Halide-lang. Use it for fast skeletoni…
☆14 · Oct 21, 2020 · Updated 5 years ago
WanliZhong / IntAttention
Official codebase for the MLSys 2026 paper "IntAttention: A Fully Integer Attention Pipeline for Efficient Edge Inference". It enables hi…
☆19 · May 29, 2026 · Updated 2 months ago
Maharshi-Pandya / gpu-stuff
Repository for GPU related kernels for learning/testing purposes
☆19 · May 27, 2026 · Updated 2 months ago
mandliya / PMPP_notes
Notes and code for Programming Massively Parallel Processors
☆13 · Mar 29, 2025 · Updated last year
HydraQYH / hp_rms_norm
High-performance RMSNorm implementation using SM core storage (registers and shared memory)
☆30 · Jan 22, 2026 · Updated 6 months ago
aikitoria / nanotrace
Low overhead tracing library and trace visualizer for pipelined CUDA kernels
☆136 · Jul 17, 2026 · Updated last week
fal-ai / flashpack
High-throughput tensor loading for PyTorch
☆260 · Updated this week
Zurinlakdawala91 / Career-Recommendation-System-using-ML
☆10 · Feb 6, 2025 · Updated last year
0xSero / deepseek-v4-flash-sm120
☆32 · Apr 26, 2026 · Updated 3 months ago
LinB203 / FSDP-Training
Minimal PyTorch implementation of TP, SP, FSDP and sharded-EMA
☆32 · Nov 27, 2025 · Updated 8 months ago
aniketmaurya / Agents
Build Agentic workflows with function calling using open LLMs
☆27 · Jul 6, 2026 · Updated 3 weeks ago