facebookresearch / CATransformers
CATransformers is a framework for joint neural network and hardware architecture search.
☆20Updated 8 months ago
Alternatives and similar repositories for CATransformers
Users interested in CATransformers are comparing it to the libraries listed below.
- Experimental scripts for researching data-adaptive learning rate scheduling.☆22Updated 2 years ago
- ☆73Updated 6 months ago
- Official Implementation of Dynamic erf (Derf).☆126Updated last month
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Updated last year
- MobileLLM-R1☆75Updated 4 months ago
- Implementation of a transformer for reinforcement learning using `x-transformers`☆72Updated 4 months ago
- [CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network☆111Updated 6 months ago
- Explorations into improving ViTArc with Slot Attention☆43Updated last year
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best…☆59Updated 10 months ago
- Exploration into the Firefly algorithm in PyTorch☆41Updated 11 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆44Updated last year
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆62Updated last year
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group☆37Updated last year
- The official repo of continuous speculative decoding☆31Updated 10 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆140Updated 4 months ago
- A simple PyTorch implementation of high-performance Multi-Query Attention (a minimal sketch follows this list)☆16Updated 2 years ago
- Implementation of DeepCrossAttention, proposed by Heddes et al. at Google Research, in PyTorch☆96Updated 11 months ago
- RS-IMLE☆43Updated last year
- ☆169Updated 4 months ago
- [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting☆70Updated 3 weeks ago
- FID computation in Jax/Flax.☆29Updated last year
- Large multi-modal models (L3M) pre-training.☆229Updated 4 months ago
- The Gaussian Histogram Loss (HL-Gauss) proposed by Imani et al., with a few convenient wrappers for regression, in PyTorch (a sketch of the idea follows this list)☆72Updated 2 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…☆37Updated 2 years ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆40Updated last year
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆24Updated 2 months ago
- Fork of Flame repo for training of some new stuff in development☆19Updated 3 weeks ago
- ☆34Updated 7 months ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆19Updated last year
- Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers.☆143Updated 8 months ago
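
For reference, here is a minimal, self-contained sketch of the Multi-Query Attention idea in plain PyTorch, independent of the torch implementation listed above: all query heads attend over a single shared key/value head, which shrinks the KV projections and the KV cache during decoding. Module names and dimensions are illustrative, not taken from the repo.

```python
import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    """Sketch of MQA: many query heads, one shared key/value head."""

    def __init__(self, dim, num_heads):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.to_q = nn.Linear(dim, dim, bias=False)                 # one projection per query head
        self.to_kv = nn.Linear(dim, 2 * self.head_dim, bias=False)  # a single shared K and V head
        self.to_out = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        b, n, _ = x.shape
        q = self.to_q(x).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)  # (b, h, n, d)
        k, v = self.to_kv(x).chunk(2, dim=-1)                                        # (b, n, d) each
        k, v = k.unsqueeze(1), v.unsqueeze(1)                                        # broadcast over heads
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v                                               # (b, h, n, d)
        return self.to_out(out.transpose(1, 2).reshape(b, n, -1))

x = torch.randn(2, 16, 256)
print(MultiQueryAttention(dim=256, num_heads=8)(x).shape)  # torch.Size([2, 16, 256])
```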
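
Likewise, a small sketch of the Gaussian Histogram Loss (HL-Gauss) idea from Imani et al., not the listed repo's wrappers: each scalar regression target is spread over fixed bins via a Gaussian CDF and the model is trained with soft-label cross entropy over those bins. The bin range and smoothing width below are assumed for illustration.

```python
import torch
import torch.nn.functional as F

def hl_gauss_targets(y, bin_edges, sigma):
    # Probability mass each bin receives from a Gaussian centered at the target y.
    # A full implementation would also normalize mass that falls outside the bin range.
    cdf = 0.5 * (1 + torch.erf((bin_edges - y.unsqueeze(-1)) / (sigma * 2 ** 0.5)))
    return cdf[..., 1:] - cdf[..., :-1]

num_bins = 64
bin_edges = torch.linspace(-2.0, 2.0, num_bins + 1)    # histogram support (assumed)
sigma = 0.1                                            # smoothing width (assumed)

logits = torch.randn(8, num_bins, requires_grad=True)  # model outputs: one logit per bin
y = torch.rand(8) * 2 - 1                              # scalar regression targets in (-1, 1)

target_probs = hl_gauss_targets(y, bin_edges, sigma)
loss = F.cross_entropy(logits, target_probs)           # soft-label cross entropy
loss.backward()

# To read out a scalar prediction, take the expectation over bin centers.
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
pred = (logits.softmax(dim=-1) * bin_centers).sum(dim=-1)
```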