Several types of attention modules written in PyTorch for learning purposes
☆53Jan 2, 2026Updated 2 months ago
Alternatives and similar repositories for attention
Users that are interested in attention are comparing it to the libraries listed below.
- (Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from …☆190May 9, 2024Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Jul 12, 2023Updated 2 years ago
- Developer project for getting basic API integrations working in under 5 minutes☆11Jan 30, 2026Updated 2 months ago
- Re-implementation of the Memory Networks (MemNN) paper from the Facebook AI Research lab.☆16May 6, 2020Updated 5 years ago
- ☆11Oct 11, 2023Updated 2 years ago
- This code repository is the source code of the paper "Deep Long-Range Spatiotemporal Dependency Synthetic Minority Oversampling Technique…☆13Nov 21, 2025Updated 4 months ago
- MLX binary vectors and associated algorithms.☆14Mar 13, 2025Updated last year
- Basic implementation of variational autoencoders in Torch☆10Apr 16, 2016Updated 9 years ago
- Spiking neural networks (SNNs) for speech classification☆12Mar 14, 2022Updated 4 years ago
- InternLM-7B fine-tuning, SFT/LoRA, instruction finetune☆13May 17, 2024Updated last year
- When real time Yoga Position classification meets GNN☆11Sep 17, 2023Updated 2 years ago
- ☆11Aug 4, 2022Updated 3 years ago
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using Nvidia libraries such as…☆19Sep 17, 2025Updated 6 months ago
- GC-Net: Global Attention Module and Cascade Fusion Network for Steel Surface Defect Detection☆10Dec 2, 2024Updated last year
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model…☆15Dec 11, 2023Updated 2 years ago
- An opinionated NLP research template☆10Aug 29, 2024Updated last year
- The source code (PyTorch version) of the paper "Multi-modality augmented Prototypical Network for Fault Diagnosis"☆11Aug 26, 2024Updated last year
- Transformer-based autoregressive variational autoencoder☆12Feb 10, 2020Updated 6 years ago
- Code for data reduction and analysis of Galaxy Zoo 2☆14May 20, 2016Updated 9 years ago
- ☆19Aug 11, 2025Updated 7 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆11Sep 4, 2025Updated 6 months ago
- Graph transport network (GTN), as proposed in "Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, an…☆15Apr 26, 2023Updated 2 years ago
- [PR2024] Official code for "MP-TFWA: Multi-schema prompting powered token-feature woven attention network for short text classification"☆13Jul 25, 2024Updated last year
- ☆22Nov 9, 2024Updated last year
- ☆17Jan 5, 2018Updated 8 years ago
- Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models"☆22Aug 14, 2025Updated 7 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆39Mar 11, 2024Updated 2 years ago
- Codes for our paper "Exploring Bit-Slice Sparsity in Deep Neural Networks for Efficient ReRAM-Based Deployment" [NeurIPS'19 EMC2 workshop]…☆10Oct 12, 2020Updated 5 years ago
- (ICML 2024) PyTorch implementation of "Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes"☆16Oct 15, 2024Updated last year
- ☆25May 6, 2025Updated 10 months ago
- Human Activity Recognition with LSTM model and MLFlow Tracking☆11Jun 6, 2022Updated 3 years ago
- Source Code for Online Collective Matrix Factorization Hashing. Reference: Di Wang, Quan Wang, Yaqiang An, Xinbo Gao, and Yumin Tian. 202…☆11Oct 20, 2020Updated 5 years ago
- Joint optic disc and optic cup segmentation based on boundary prior and adversarial learning☆15Jan 18, 2023Updated 3 years ago
- TFSCL☆19Jul 25, 2024Updated last year
- Code repository for the 2023 MICCAI Paper "TabAttention: Learning Attention Conditionally on Tabular Data"☆18Oct 30, 2023Updated 2 years ago
- ☆16Jun 14, 2024Updated last year
- Travel time prediction from GPS observations using an HMM☆11Jan 4, 2023Updated 3 years ago
- Development of a shared-bike back-office management system with the React stack and AntD☆14Aug 1, 2018Updated 7 years ago
- ☆13Jan 17, 2020Updated 6 years ago