firechecking/CleanTransformer

an implementation of transformer, bert, gpt, and diffusion models for learning purposes

☆159

Alternatives and similar repositories for CleanTransformer

Users that are interested in CleanTransformer are comparing it to the libraries listed below.

heathcliff233 / fastMSA
☆13 · Nov 12, 2021 · Updated 4 years ago
yeliu918 / HETFORMER
This is the repository of Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization
☆15 · Nov 23, 2021 · Updated 4 years ago
wizard1203 / FuseFL
FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion (NeurIPS 2024 Spotlight)
☆15 · Mar 31, 2025 · Updated last year
sirius-image-inpainting / Free-Form-Image-Inpainting-With-Gated-Convolution
https://arxiv.org/pdf/1806.03589v2.pdf
☆11 · Mar 24, 2021 · Updated 5 years ago
ngocbh / trimkv
[TrimKV] Cache What Lasts: Token Retention for Memory-Bounded KV Cache in LLMs - [DBTrimKV] Make Each Token Count: Towards Improving Lo…
☆15 · May 13, 2026 · Updated 2 months ago
antgroup / Agent3Sigma-Canary
Agent3σ-Canary is an evaluation framework for AI Agent security in realistic runtime environments.
☆33 · Jun 24, 2026 · Updated last month
sanowl / Self-Correcting-LLM--Reinforcement-Learning-
This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…
☆37 · Jul 9, 2025 · Updated last year
NJUNLP / AdaR
☆15 · Dec 8, 2025 · Updated 7 months ago
ankitp94 / relationship-extraction
Implementation project for relation extraction in NLP using kernel-based methods.
☆33 · Aug 10, 2016 · Updated 9 years ago
dovolopor-research / cnlp
🔥 A natural language processing framework focused on Chinese: Chinese word segmentation; class balancing; dataset splitting...
☆12 · Nov 14, 2020 · Updated 5 years ago
ShulinCao / OpenKE-PyTorch
☆12 · Mar 1, 2018 · Updated 8 years ago
DLLXW / baby-llama2-chinese
A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.
☆2,922 · May 21, 2024 · Updated 2 years ago
Vincent131499 / CNN-sentiment-classification-tf
Sentiment classification of English text with a CNN, using tensorflow
☆10 · Aug 29, 2018 · Updated 7 years ago
LeeeeoLiu / U-NEED
☆19 · Feb 25, 2023 · Updated 3 years ago
yegcjs / mixinglaws
☆113 · Jul 15, 2025 · Updated last year
lyhue1991 / torchkeras
Pytorch❤️ Keras 😋😋
☆2,008 · Mar 18, 2026 · Updated 4 months ago
HarderThenHarder / transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SF…
☆2,420 · Sep 29, 2023 · Updated 2 years ago
zhangxy-2019 / critique-GRPO
[ICML 2026 Spotlight] Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
☆70 · Jun 3, 2026 · Updated last month
km1994 / LLMsNineStoryDemonTower
[LLMs Nine-Story Demon Tower] Sharing LLMs in natural language processing (ChatGLM, Chinese-LLaMA-Alpaca, Vicuna, LLaMA, GPT4ALL, etc.), information retrieval (langchain), language synthesis, language recognition, multimodal, and other fields (Stable Diffusion, MiniGP…
☆2,165 · Mar 30, 2024 · Updated 2 years ago
Bruce-Lee-LY / matrix_multiply
Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.
☆14 · Feb 8, 2023 · Updated 3 years ago
liangyuwang / Tiny-Megatron
Tiny-Megatron, a minimalistic re-implementation of the Megatron library
☆32 · Sep 1, 2025 · Updated 10 months ago
BitSecret / HyperGNet
Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network.
☆16 · Sep 23, 2025 · Updated 10 months ago
XinyuHua / dyploc-acl2021
Official repository for "DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Opinion Text Generation"
☆10 · May 20, 2022 · Updated 4 years ago
FranxYao / Retrieval-Head-with-Flash-Attention
Efficient retrieval head analysis with triton flash attention that supports topK probability
☆13 · Jun 15, 2024 · Updated 2 years ago
AbdullaDesmal / TBIM
This repository shows the implementation of the Trained Born Iterative Method (TBIM) applied to electromagnetic imaging.
☆12 · Nov 9, 2022 · Updated 3 years ago
jiayoujiayoujiayoua / Hung-yi-Lee-ML-Homework
☆10 · Oct 4, 2022 · Updated 3 years ago
liucongg / ChatGLM-Finetuning
Fine-tuning on specific downstream tasks based on the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models, covering Freeze, Lora, P-tuning, full-parameter fine-tuning, and more
☆2,774 · Dec 12, 2023 · Updated 2 years ago
deepsuperviser / CTFN
This is the code for Coupled-translation Fusion Network.
☆11 · Dec 2, 2021 · Updated 4 years ago
LorenzoGianassi / Land-Diffuser
The Land-Diffuser is a novel application of the Denoising Diffusion Probabilistic Model (DDPM) in the realm of 3D Talking Head generation…
☆13 · Dec 23, 2023 · Updated 2 years ago
song-wx / SIFT
[ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely
☆24 · Jun 26, 2024 · Updated 2 years ago
rashad101 / SGPT-SPARQL-query-generation
PyTorch code for the IEEE Access paper: SGPT: A Generative Approach for SPARQL Query Generation from Natural Language Questions
☆13 · Sep 15, 2024 · Updated last year
sramshetty / mixture-of-depths
An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆35 · Jun 7, 2024 · Updated 2 years ago
BUPT-GAMMA / PolarGate
☆15 · Oct 8, 2024 · Updated last year
zhliu0106 / learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10 · Dec 13, 2024 · Updated last year
nanxuanzhao / Good_transfer
Pretrained models for "What Makes Instance Discrimination Good for Transfer Learning?".
☆29 · Jul 13, 2022 · Updated 4 years ago
rivia7 / faster-bert-as-service
Using TensorRT and Triton Server to build BERT model as a service
☆13 · Jan 10, 2022 · Updated 4 years ago
debayan / sigir2022-sparqlbaselines
☆14 · Oct 28, 2022 · Updated 3 years ago
thu-coai / BARREL
[ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
☆18 · May 21, 2025 · Updated last year
xuanhongrui / KMCLR
☆15 · Mar 5, 2023 · Updated 3 years ago