An open-source community implementation of the model from the "DIFFERENTIAL TRANSFORMER" paper by Microsoft.
☆38Feb 9, 2026Updated last month
Alternatives and similar repositories for DifferentialTransformer
Users who are interested in DifferentialTransformer are comparing it to the libraries listed below.
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model …☆86Oct 27, 2024Updated last year
- ☆13Oct 14, 2024Updated last year
- Train a production-grade GPT in less than 400 lines of code. Better than Karpathy's version and GIGAGPT☆16Feb 6, 2026Updated last month
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod…☆14Feb 9, 2026Updated last month
- Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex…☆25Oct 13, 2025Updated 4 months ago
- Implementation of the paper: "Aurora: A Foundation Model of the Atmosphere" in PyTorch☆22Feb 9, 2026Updated last month
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Mar 3, 2026Updated last week
- Repository dedicated to developing a robust and modular framework for Multi-Agent Reinforcement Learning (MARL) algorithms.☆13Mar 3, 2024Updated 2 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult…☆11Aug 29, 2023Updated 2 years ago
- This is a simple implementation of Saavedra-Barrera's paper SAAVEDRA-BARRERA R H. CPU Performance Evaluation and Execution Time Predictio…☆10Nov 23, 2021Updated 4 years ago
- Decoupled imitation for whole-body humanoid natural locomotion☆16Apr 1, 2025Updated 11 months ago
- ☆19Nov 20, 2025Updated 3 months ago
- A plugin from ECMWF/ai-models, with models sourced from PuYun Meteorological Model in Metacarbon (Hangzhou)☆19Sep 30, 2025Updated 5 months ago
- ☆13Jan 2, 2025Updated last year
- Automatic Modulation Classification implemented on different deep learning frameworks☆10Nov 17, 2020Updated 5 years ago
- Code and software used to design de novo protein nanomachines. Supplementary material for "Computational design of nanoscale rotational m…☆10Mar 19, 2022Updated 3 years ago
- Deep learning based automatic modulation classification for sub-carriers of OFDM signals.☆12Jan 3, 2024Updated 2 years ago
- IslamicTranslator is an automated solution designed to translate Hadiths into multiple languages using the power …☆11Jan 17, 2025Updated last year
- ☆11Oct 24, 2024Updated last year
- python RobustRMC projects☆10Apr 22, 2024Updated last year
- An open-source non-official community implementation of the model from the paper: Surgical Robot Transformer (SRT): Imitation Learning fo…☆11Feb 9, 2026Updated last month
- ☆11May 6, 2021Updated 4 years ago
- Isaac Sim 4.5 intellisense and AI context for VSCode & Cursor☆19Aug 13, 2025Updated 6 months ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- Implementation of the DSP-2025 paper "A synergistic CNN-transformer network with pooling attention fusion for hyperspectral image classif…☆23Feb 20, 2025Updated last year
- ☆11Aug 28, 2023Updated 2 years ago
- ☆18Oct 12, 2025Updated 4 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta☆13Nov 11, 2024Updated last year
- This repository serves as a central hub for discovering tools and services focused on automated prompt engineering. Whether you're lookin…☆14Oct 11, 2024Updated last year
- ☆12Apr 30, 2024Updated last year
- SimplifiedTransformer simplifies the transformer block without affecting training. Skip connections, projection parameters, sequential sub-bl…☆15Feb 6, 2026Updated last month
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 8 months ago
- BERT distillation in practice, including distilling BERT into a BiLSTM and TinyBERT☆13Apr 23, 2022Updated 3 years ago
- A simple package for leveraging Falcon 180B and the HF ecosystem's tools, including training/inference scripts, safetensors, integrations…☆12Mar 11, 2024Updated last year
- Driver software for the Franka robots.☆13Nov 5, 2025Updated 4 months ago
- 🛠Robust SSH: auto-reconnect SSH session that preserves your running shell and command. Intuitive, no server-side setup, aimed at simplic…☆13Nov 14, 2025Updated 3 months ago
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"☆14Nov 11, 2024Updated last year
- Fine-tune copilot based on your codebase☆12Mar 26, 2024Updated last year
- Implementation of the Pairformer model used in AlphaFold 3☆14Mar 2, 2026Updated last week