imoneoi / bf16_fused_adam
BFloat16 Fused Adam Operator for PyTorch
☆16Nov 16, 2024Updated last year
Alternatives and similar repositories for bf16_fused_adam
Users who are interested in bf16_fused_adam are comparing it to the libraries listed below.
- One RL Platform is all you need -- Event-driven fully distributed reinforcement learning framework☆21Oct 25, 2023Updated 2 years ago
- ☆11Dec 22, 2024Updated last year
- Code for paper Evolving Connectivity for Spiking Neural Networks☆23Oct 23, 2023Updated 2 years ago
- ☆21Apr 1, 2024Updated last year
- ☆21Sep 3, 2024Updated last year
- Mixed precision training from scratch with Tensors and CUDA☆28May 14, 2024Updated last year
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 6 months ago
- Transformers components but in Triton☆34May 9, 2025Updated 9 months ago
- Repository for Go shared libraries (for now).☆11Dec 1, 2025Updated 2 months ago
- Using FlexAttention to compute attention with different masking patterns☆47Sep 22, 2024Updated last year
- Enhancing Domain Adaptation through Prompt Gradient Alignment (NeurIPS 2024)☆14Jun 16, 2024Updated last year
- An attempt at an SVD inpainting pipeline☆50Dec 24, 2023Updated 2 years ago
- Streamline on-policy/off-policy distillation workflows in a few lines of code☆95Feb 5, 2026Updated last week
- ☆13Jun 18, 2024Updated last year
- ☆26Oct 16, 2025Updated 3 months ago
- Ling-Coder-Lite is a MoE LLM provided and open-sourced by CodeFuse and InclusionAI.☆14Apr 22, 2025Updated 9 months ago
- ☆12Apr 26, 2024Updated last year
- Visual Question Answering System☆11Nov 13, 2019Updated 6 years ago
- ALAS: Autonomous Learning Agent System☆14Aug 14, 2025Updated 6 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 6 months ago
- Cuda extensions for PyTorch☆12Dec 2, 2025Updated 2 months ago
- ☆19Oct 4, 2024Updated last year
- JSON encoder and decoder for python written in C/C++☆10Jan 22, 2024Updated 2 years ago
- Python bindings for NVIDIA CUDA APIs.☆13Mar 2, 2024Updated last year
- Python implementation of MATLAB's msalign function☆11Jan 19, 2026Updated 3 weeks ago
- Self-contained Python lib with zero dependencies that gives you unified device properties for GPU, CPU, and NPU. No more calling separat…☆14Dec 12, 2025Updated 2 months ago
- FFT for PyCuda and PyOpenCL. The package is deprecated and its functionality is merged into Reikna.☆37Feb 17, 2014Updated 11 years ago
- Fork of HyenaDNA, a long-range genomic foundation model built with Hyena☆10Aug 14, 2023Updated 2 years ago
- Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining☆13Oct 22, 2021Updated 4 years ago
- OLD REPOSITORY, new one at repo.rumpkernel.org/rumprun☆44Apr 13, 2015Updated 10 years ago
- Inference code for LLaMA models☆41Mar 13, 2023Updated 2 years ago
- Python class to explore the ImageNet database☆16Jan 12, 2012Updated 14 years ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training☆11Sep 26, 2023Updated 2 years ago
- Minimalist Redis Client for Erlang☆19Jul 15, 2013Updated 12 years ago
- Trying Tigerbeetle transactional database.☆11Jul 14, 2024Updated last year
- ☆11Dec 9, 2025Updated 2 months ago
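- This project provides XLNet pre-trained models for Chinese, aiming to enrich Chinese natural language processing resources and offer a diverse selection of Chinese pre-trained models. We welcome experts and scholars to download and use them, and to jointly promote and develop Chinese-language resources.☆11May 30, 2023Updated 2 years ago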
- ☆16Feb 6, 2024Updated 2 years ago
- ☆11Nov 9, 2022Updated 3 years ago