facebookresearch / MobileLLM-R1
MobileLLM-R1
☆75 Updated 4 months ago
Alternatives and similar repositories for MobileLLM-R1
Users who are interested in MobileLLM-R1 are comparing it to the repositories listed below.
- The official repo of continuous speculative decoding ☆31 Updated 10 months ago
- ☆40 Updated 4 months ago
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆36 Updated 2 years ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated last year
- research work on multimodal cognitive ai ☆68 Updated last month
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene… ☆33 Updated 3 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 8 months ago
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 Updated 3 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆17 Updated 10 months ago
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆137 Updated last month
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆102 Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆44 Updated last year
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis" ☆57 Updated last year
- ☆169 Updated 4 months ago
- 😊 TPTT: Transforming Pretrained Transformers into Titans ☆54 Updated 2 months ago
- implementation of dualformer ☆24 Updated 11 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated last year
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers. ☆50 Updated 2 years ago
- This is a simple torch implementation of the high performance Multi-Query Attention ☆16 Updated 2 years ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024). ☆32 Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 Updated 6 months ago
- [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability ☆41 Updated 6 months ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆22 Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 Updated last year
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆81 Updated last year
- Measuring the Signal to Noise Ratio in Language Model Evaluation ☆28 Updated 5 months ago
- ☆19 Updated last year
- Implementation of a multimodal diffusion transformer in Pytorch ☆107 Updated last year
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆39 Updated last year
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at… ☆104 Updated last year