fishmingyu / GeoT
GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU
☆17 Updated 3 weeks ago
Related projects:
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆13 Updated 3 weeks ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆43 Updated 3 weeks ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆58 Updated 6 months ago
- Using FlexAttention to compute attention with different masking patterns ☆28 Updated last week
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆34 Updated 2 months ago
- DL Dataloader Benchmarks ☆18 Updated last week
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated 11 months ago
- ML/DL Math and Method notes ☆56 Updated 9 months ago
- Utilities for Training Very Large Models ☆56 Updated last week
- Simple and fast low-bit matmul kernels in CUDA ☆48 Updated this week
- Automatically take good care of your preemptible TPUs ☆28 Updated last year
- machine learning model performance metrics & charts with confidence intervals, optimized with numba to be fast ☆16 Updated 2 years ago
- Lightning support for Intel Habana accelerators. ☆25 Updated 2 weeks ago
- Make triton easier ☆39 Updated 3 months ago
- PyTorch centric eager mode debugger ☆43 Updated 2 months ago
- Distributed ML Optimizer ☆31 Updated 3 years ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆21 Updated last week
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆56 Updated 10 months ago
- DAM Data Acquisition for ML Benchmark, as part of the DataPerf benchmark suite, https://dataperf.org/ ☆22 Updated last year
- Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023) ☆43 Updated last year
- Code for paper: "Privately generating tabular data using language models". ☆14 Updated last year
- companion code for "Learning to substitute Ingredients in Recipes" ☆23 Updated last year
- Implementation of Hyena Hierarchy in JAX ☆10 Updated last year
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆57 Updated 11 months ago
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆52 Updated last year
- Benchmarking PyTorch 2.0 different models ☆20 Updated last year
- benchmarking some transformer deployments ☆26 Updated last year