YihongDong/FANformer

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/YihongDong/FANformer)

YihongDong / FANformer

☆39

Alternatives and similar repositories for FANformer

Users that are interested in FANformer are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

jiangxxxue / ROCODE
View on GitHub
This repository hosts the source code for the paper "ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Mo…
☆16Dec 16, 2025Updated 7 months ago
YihongDong / CDD-TED4LLMs
View on GitHub
☆16Nov 26, 2024Updated last year
YihongDong / FAN
View on GitHub
☆264Oct 26, 2025Updated 9 months ago
Doraemonzzz / xmixers
View on GitHub
Xmixers: A collection of SOTA efficient token/channel mixers
☆29Sep 4, 2025Updated 10 months ago
YihongDong / RL-PLUS
View on GitHub
☆27Aug 31, 2025Updated 10 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
siyuanseever / llama2Rnn.c
View on GitHub
☆13Apr 15, 2024Updated 2 years ago
jungokasai / T2R
View on GitHub
☆14Nov 20, 2022Updated 3 years ago
LUMIA-Group / PonderingLM
View on GitHub
Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space"
☆26Jul 21, 2025Updated last year
dearlexie1128 / Graph-MoE
View on GitHub
☆10Dec 17, 2024Updated last year
fmorenopino / SAAM
View on GitHub
Spectral Attention Autoregressive Model (SAAM)
☆17Oct 27, 2022Updated 3 years ago
lpigeon / ros-skill
View on GitHub
Agent Skill for ROS/ROS2 robot control via rosbridge WebSocket.
☆24Feb 27, 2026Updated 5 months ago
maximzubkov / fft-scan
View on GitHub
Efficient PScan implementation in PyTorch
☆17Jan 2, 2024Updated 2 years ago
emalach / LinearLM
View on GitHub
Code for the paper: https://arxiv.org/pdf/2309.06979.pdf
☆21Jul 29, 2024Updated 2 years ago
idiap / hypermixing
View on GitHub
PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture
☆26Jun 12, 2023Updated 3 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
huangruizhe / audio
View on GitHub
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10Sep 30, 2024Updated last year
fla-org / fla-zoo
View on GitHub
Flash-Linear-Attention models beyond language
☆21Aug 28, 2025Updated 11 months ago
NX-AI / mlstm_kernels
View on GitHub
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
☆91Jul 6, 2026Updated 3 weeks ago
KirigiriSuzumiya / crowd_vis
View on GitHub
使用django+pyecharts+PP-Human开发的动态数据大屏, 有人流数据的采集入库, 打架、摔倒等事件警报，口罩检测等实用功能。边缘端版本使用onnx推理提升效率，服务端版本支持视频流推拉
☆34May 3, 2023Updated 3 years ago
PurCL / ProSec
View on GitHub
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"
☆18Feb 26, 2026Updated 5 months ago
dangxingyu / rnn-icrag
View on GitHub
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27Apr 17, 2024Updated 2 years ago
iPersevere / CAPTCHA_Recognition
View on GitHub
这是本科毕业设计的课题，“基于深度网络的网站验证码识别研究与实现”。主要是利用卷积神经网络，基于TensorFlow平台，构建了三层卷积两层全联接模型，训练出的一个准确率为91.3%的识别模型。再基于Django构建登陆系统，使用selenium实现自动测试，完成验证码从识…
☆25Jun 18, 2018Updated 8 years ago
VisualJoyce / ChengyuBERT
View on GitHub
[COLING 2020] BERT-based Models for Chengyu
☆17Dec 29, 2021Updated 4 years ago
mechanistic-interpretability-grokking / progress-measures-paper
View on GitHub
☆94Oct 11, 2022Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
HFAiLab / pytorch_distributed
View on GitHub
The test of different distributed-training methods on High-Flyer AIHPC
☆27Oct 18, 2022Updated 3 years ago
RUCBM / ICLEval
View on GitHub
☆14Jun 24, 2024Updated 2 years ago
phueb / BabyBERTa
View on GitHub
Source code for CoNLL 2021 paper by Huebner et al. 2021
☆21Jul 13, 2023Updated 3 years ago
kazuki-irie / kv-memory-brain
View on GitHub
Official Code Repository for the paper "Key-value memory in the brain"
☆32Feb 25, 2025Updated last year
mcleish7 / retrofitting-recurrence
View on GitHub
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
☆68Nov 11, 2025Updated 8 months ago
IBM / larimar
View on GitHub
Code for ICML 2024 paper
☆34Sep 18, 2025Updated 10 months ago
RodkinIvan / associative-recurrent-memory-transformer
View on GitHub
[ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluation
☆67Mar 12, 2026Updated 4 months ago
google-deepmind / spectral_ssm
View on GitHub
☆35Apr 12, 2024Updated 2 years ago
domaineval / DomainEval
View on GitHub
DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …
☆13Dec 12, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
InuyashaYang / AIDIY
View on GitHub
JoinAI是一个开源仓库，专注于算法工程能力的培养，包括工程和数学原理的整理
☆11Apr 20, 2025Updated last year
jwardsmith / Red-Team-Scripts
View on GitHub
Collection of red team scripts, resources & configs.
☆15Feb 14, 2026Updated 5 months ago
marcelbinz / meta-learned-models
View on GitHub
☆13Mar 21, 2023Updated 3 years ago
chen-hao-chao / mdm-prime-v2
View on GitHub
MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Scaling of Diffusion Language Models
☆27May 23, 2026Updated 2 months ago
RobertCsordas / ndr
View on GitHub
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
☆34Jun 11, 2025Updated last year
0xD4rky / nanotok
View on GitHub
☆27Jun 7, 2026Updated last month
CarloP95 / Text2VideoGAN
View on GitHub
A pytorch implementation of a text to videos GAN
☆12Jul 26, 2019Updated 7 years ago