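The Transformer is the model introduced in Google's 2017 paper Attention Is All You Need. After years of extensive industrial use and validation in published research, it holds an important place in deep learning. BERT is a language model derived from the Transformer. Using Chinese-to-English translation as an example, I explain the Transformer's full pipeline from input to output.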
☆294 · Apr 24, 2024 · Updated last year
Alternatives and similar repositories for transformer
Users that are interested in transformer are comparing it to the libraries listed below.
- 📝 Summary of recommendation, advertising and search models. [Recommendation, advertising, and search technology roundup ⭐] ☆25 · Feb 2, 2023 · Updated 3 years ago
- A Foundation Model for Battery Discharge Capacity Degradation Forecasting ☆15 · Jan 7, 2026 · Updated 3 months ago
- Transformer: PyTorch Implementation of "Attention Is All You Need" ☆4,515 · Jul 15, 2025 · Updated 9 months ago
- A streamable speech recognition model with transformer encoders and RNN-T loss ☆11 · Mar 1, 2021 · Updated 5 years ago
- Machine translation implemented with the Transformer model ☆12 · Dec 6, 2020 · Updated 5 years ago
- Diffusion Models Tutorials ☆15 · Apr 10, 2023 · Updated 3 years ago
- ☆22 · Mar 27, 2026 · Updated 3 weeks ago
- Implementation of SortingHat: Efficient Private Decision Tree Evaluation via Homomorphic Encryption and Transciphering. ☆21 · Aug 7, 2022 · Updated 3 years ago
- Papers about keyphrase generation and extraction ☆29 · Nov 16, 2025 · Updated 5 months ago
- ☆11 · Jan 24, 2024 · Updated 2 years ago
- AnyEnhance-based Baseline for the CCF-AATC 2025 Challenge Track 1 ☆50 · Dec 27, 2025 · Updated 3 months ago
- An annotated implementation of the Transformer paper. ☆7,193 · Apr 7, 2024 · Updated 2 years ago
- The code of paper Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation. Huarui He, Jie Wang, Zhanqiu Zhang, Fen… ☆41 · Oct 18, 2022 · Updated 3 years ago
- Semi-supervised learning with Generative Adversarial Networks (GANs) using Kolmogorov-Arnold Network Layers (KANLs) ☆19 · Aug 3, 2024 · Updated last year
- Code for the paper "LARNet: Lie Algebra Residual Network for Profile Face Recognition" (ICML 2021) ☆10 · Aug 19, 2021 · Updated 4 years ago
- Open domain Chinese dialogue corpus and datasets. ☆17 · Jan 8, 2022 · Updated 4 years ago
- A MIDI processing tool that can transfer MIDI to tokens and vice versa. Still evolving... ☆17 · Dec 10, 2023 · Updated 2 years ago
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation. ☆11 · Jul 27, 2024 · Updated last year
- ☆20 · Dec 24, 2024 · Updated last year
- A translation model with Transformer, implemented in PyTorch, intended for learning the Transformer. ☆53 · Mar 30, 2021 · Updated 5 years ago
- ☆14 · Sep 14, 2021 · Updated 4 years ago
- Review for Course 100396, Tongji Uni. ☆37 · Mar 13, 2022 · Updated 4 years ago
- Code for "Efficient Relation-aware Scoring Function Search for Knowledge Graph Embedding" ICDE 2021 ☆11 · Apr 26, 2021 · Updated 4 years ago
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆15 · Jun 7, 2023 · Updated 2 years ago
- A machine translation system based on the Transformer ☆21 · Jun 1, 2020 · Updated 5 years ago
- Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API ☆17 · Jun 21, 2025 · Updated 9 months ago
- A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling ☆15 · Dec 5, 2023 · Updated 2 years ago
- ☆13 · May 18, 2022 · Updated 3 years ago
- Adds some new features according to my needs. A VS Code extension that adds index numbers to your Markdown titles. ☆11 · Feb 6, 2018 · Updated 8 years ago
- Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline. ☆17 · May 9, 2025 · Updated 11 months ago
- Mainly the DBSCAN algorithm ☆11 · Jun 28, 2018 · Updated 7 years ago
- Simplest online-softmax notebook for explaining Flash Attention ☆16 · Jan 27, 2026 · Updated 2 months ago
- Official Implementation of NEDS ☆16 · Mar 20, 2026 · Updated 3 weeks ago
- Hierarchical Grid Refinement (HiGRID): DOA Estimation using Rigid Spherical Microphone Arrays ☆12 · Apr 11, 2019 · Updated 7 years ago
- This project implements R-BERT in Keras and evaluates it on a person-relation dataset. ☆10 · Apr 17, 2021 · Updated 5 years ago
- ☆11 · Jun 27, 2021 · Updated 4 years ago
- This is the codebase for our ICRA 2020 submission, GraphRQI: Classifying Driver Behaviors Using Graph Spectrums. ☆13 · Dec 8, 2019 · Updated 6 years ago
- Data for EACL 2023 paper "A Survey on Recent Advances in Keyphrase Extraction from Pre-trained Language Models". ☆45 · Dec 23, 2023 · Updated 2 years ago
- Tests multi-head attention on the STS dataset, using PyTorch and torchtext. The code is concise and well suited to newcomers learning how multi-head attention works. Unlike the Transformer, which involves many layers, this is just multi-head attention + one linear layer. ☆19 · Aug 20, 2025 · Updated 7 months ago