Official repository for FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models
☆38Sep 19, 2025Updated 7 months ago
Alternatives and similar repositories for FLAME-MoE
Users that are interested in FLAME-MoE are comparing it to the libraries listed below.
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning☆13Sep 2, 2024Updated last year
- [ACL 2025] Official code for "Learning to Reason from Feedback at Test-Time".☆14May 16, 2025Updated 11 months ago
- ☆26May 26, 2024Updated last year
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way☆18Nov 4, 2025Updated 6 months ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models"☆22Jan 16, 2025Updated last year
- Web archiving utility library☆11Updated this week
- GitHub repo for the ICLR 2025 paper "Fine-tuning Large Language Models with Sparse Matrices"☆25Feb 2, 2026Updated 3 months ago
- LOLA: Large and Open Source Multilingual Language Model☆11Apr 8, 2026Updated last month
- ☆21Feb 5, 2024Updated 2 years ago
- ☆19Jan 3, 2025Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model☆18Dec 22, 2023Updated 2 years ago
- ReportParse is a unified NLP analyzer for corporate sustainability reports☆21Sep 18, 2024Updated last year
- Word acquisition in neural language models (TACL 2022).☆20Jan 30, 2025Updated last year
- ☆19Apr 16, 2025Updated last year
- DCPO: Dynamic Adaptive Clipping for RL☆49Apr 1, 2026Updated last month
- Klimatkollen's data pipeline and API for processing company sustainability reports☆23Updated this week
- Crawl & Visualize NeurIPS 2022 Data from OpenReview☆14Nov 8, 2022Updated 3 years ago
- ☆31Jun 6, 2025Updated 11 months ago
- ☆14Dec 21, 2024Updated last year
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- An Apache 2.0 fork of HuggingFace's Large Language Model Text Generation Inference☆19Mar 10, 2024Updated 2 years ago
- List of direct speech-to-speech translation papers.☆39Jan 31, 2023Updated 3 years ago
- [ICLR 2024 Spotlight] 🚀 The official repository of Self-Supervised Learning method "ROPIM", "Pre-training with Random Orthogonal Project…☆10Jan 15, 2025Updated last year
- TMMA: A Tiled Matrix Multiplication Accelerator for Self-Attention Projections in Transformer Models, optimized for edge deployment on Xi…☆31Apr 7, 2026Updated last month
- Flight connections map done with D3.js data visualization library.☆12Dec 5, 2019Updated 6 years ago
- 📥 🎯 (1,4/4) an MLIR-based toolchain with Vitis HLS LLVM input/output targeting FPGAs.☆15Nov 15, 2022Updated 3 years ago
- Scheduled crawler for daily arXiv papers☆13May 22, 2023Updated 2 years ago
- Understanding Rare Spurious Correlations in Neural Network☆12Jun 5, 2022Updated 3 years ago
- C++ version of ViT☆12Nov 13, 2022Updated 3 years ago
- Locally Valid and Discriminative Prediction Intervals for Deep Learning Models☆13May 22, 2023Updated 2 years ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices☆12Jul 1, 2021Updated 4 years ago
- Standardizing environment infrastructure with Strands Agents — step, observe, reward.☆46Updated this week
- Includes the SVD-based approximation algorithms for compressing deep learning models and the FPGA accelerators exploiting such approximat…☆16Mar 3, 2023Updated 3 years ago
- Code for paper "Spider: Any-to-Many Multimodal LLM"☆15Apr 26, 2025Updated last year
- Chinese Guide for Alveo Getting Started☆12May 18, 2020Updated 5 years ago
- An optimized Merkle Patricia Trie implementation on GPU, fully compatible with and integrable into Ethereum. The paper is published on VL…☆14Apr 15, 2024Updated 2 years ago
- [DATE'2025, TCAD'2025] Terafly : A Multi-Node FPGA Based Accelerator Design for Efficient Cooperative Inference in LLMs☆36Nov 13, 2025Updated 5 months ago
- Assignments of CSCE-642: Deep Reinforcement Learning offered at Texas A&M University.☆10Aug 31, 2025Updated 8 months ago
- Learning Representations that Support Robust Transfer of Predictors☆20Nov 7, 2021Updated 4 years ago