Triton implementation of GPT/LLAMA
☆21Aug 28, 2024Updated last year
Alternatives and similar repositories for gpt-triton
Users that are interested in gpt-triton are comparing it to the libraries listed below.
- Iterate fast on your RAG pipelines☆24Jun 21, 2025Updated 9 months ago
- 2018/2019/campus recruitment/spring recruitment/autumn recruitment/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes☆10Oct 31, 2018Updated 7 years ago
- ☆14Jun 24, 2024Updated last year
- ☆11Feb 22, 2025Updated last year
- Because it's there.☆16Sep 22, 2024Updated last year
- Mamba support for transformer lens☆19Sep 17, 2024Updated last year
- KV Cache & LoRA for minGPT☆62Mar 4, 2026Updated last month
- llama2 inference engine in Rust☆13Apr 12, 2024Updated 2 years ago
- This code was used to collect, process, and validate the REFLACX (Reports and Eye-Tracking Data for Localization of Abnormalities in Ches…☆19Apr 6, 2022Updated 4 years ago
- Full End-to-End examples showing how to use First-gen Gaudi and Gaudi2 in common use cases☆13Dec 2, 2024Updated last year
- ☆17Jan 1, 2025Updated last year
- Public course website for Spring 2024 materials.☆16Aug 6, 2024Updated last year
- ☆13Jan 30, 2023Updated 3 years ago
- ☆14Sep 29, 2025Updated 6 months ago
- An SD upscale script made to work with an inpainting model. Supports tiling.☆11Mar 13, 2023Updated 3 years ago
- Minimal implementation of TokenFormer for inference and learning☆13Nov 6, 2024Updated last year
- RBF Drivers for Blender☆10Oct 14, 2022Updated 3 years ago
- ☆51Apr 2, 2026Updated 2 weeks ago
- A Model Context Protocol server that provides documentation access capabilities. This server enables LLMs to search and retrieve content …☆19Apr 29, 2025Updated 11 months ago
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing☆30Nov 27, 2024Updated last year
- Source code for some notes for the mathematical tripos.☆23Dec 23, 2018Updated 7 years ago
- A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.☆37Aug 27, 2025Updated 7 months ago
- Repository for ACM India Summer School on Generative AI for Text☆13Jul 11, 2024Updated last year
- Cross-GPU KV Cache Marketplace☆22Nov 12, 2025Updated 5 months ago
- A reproduced PyTorch implementation of the Adversarially Reweighted Learning (ARL) model, originally presented in "Fairness without Demog…☆20Jan 30, 2021Updated 5 years ago
- This comprehensive guide provides a universal process for preparing your own speech datasets and training a custom Text-to-Speech (TTS) m…☆24May 3, 2025Updated 11 months ago
- Vector search using only Parquet and DataFusion☆58Feb 11, 2026Updated 2 months ago
- Writing and Citation Assistant Tool☆38Dec 21, 2025Updated 3 months ago
- Writing FLUX in Triton☆42Sep 22, 2024Updated last year
- A tiny BERT for low-resource monolingual models☆31Dec 24, 2025Updated 3 months ago
- ☆22Jan 10, 2025Updated last year
- FlexTensor is a tensor offloading and management library for PyTorch that enables running large models on limited GPU memory by intellige…☆87Updated this week
- Winning solution of the Kaggle "Google Brain - Ventilator Pressure Prediction" competition☆10Nov 12, 2021Updated 4 years ago
- 🔊Replicate Cog'ified MMAudio🎵☆18Jul 10, 2025Updated 9 months ago
- Phoshell: a Forth inspired, extremely lightweight, stack machine shell, implementable in _ALL_ known programming languages.☆10Nov 21, 2020Updated 5 years ago
- Approaching Clinical NER as an MRC problem☆11Apr 4, 2024Updated 2 years ago
- Resources for private and secure Machine Learning and Artificial Intelligence☆13Jun 13, 2022Updated 3 years ago
- Shaping capabilities with token-level pretraining data filtering☆94Jan 28, 2026Updated 2 months ago
- ☆21Jul 2, 2022Updated 3 years ago