(ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models
☆39 Sep 25, 2025 Updated 6 months ago
Alternatives and similar repositories for Super-Experts-Profilling
Users that are interested in Super-Experts-Profilling are comparing it to the libraries listed below.
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving. ☆41 Apr 8, 2026 Updated last week
- [AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning". ☆22 Mar 18, 2026 Updated 3 weeks ago
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025 ☆23 Nov 13, 2025 Updated 5 months ago
- ☆16 Sep 4, 2025 Updated 7 months ago
- arXiv? No. ChineseXiv. ☆115 Mar 24, 2026 Updated 3 weeks ago
- Code example for pretraining an LLM with vanilla PyTorch training loop ☆10 Jun 6, 2024 Updated last year
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆32 Sep 12, 2025 Updated 7 months ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models" ☆68 Apr 4, 2026 Updated last week
- A Model Agnostic function to directly remove specified layers from the LLM ☆10 May 23, 2024 Updated last year
- [NeurIPS 2025] Official Implementation for "Glocal Information Bottleneck for Time Series Imputation" ☆15 Nov 4, 2025 Updated 5 months ago
- EANN(Pytorch) ☆10 Mar 12, 2022 Updated 4 years ago
- This is a C++ implementation of cocoapi bbox evaluation code. ☆11 Dec 9, 2021 Updated 4 years ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆38 Feb 22, 2025 Updated last year
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization ☆24 Oct 5, 2025 Updated 6 months ago
- ☆26 Oct 27, 2025 Updated 5 months ago
- self-adaptive in-context learning ☆45 May 5, 2023 Updated 2 years ago
- ☆12 Dec 13, 2022 Updated 3 years ago
- EMNLP 2025 | RouterLens ☆29 Sep 15, 2025 Updated 7 months ago
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value … ☆26 Dec 1, 2025 Updated 4 months ago
- ☆27 Apr 14, 2025 Updated last year
- Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks ☆49 Apr 8, 2026 Updated last week
- This is a Scrapy-based web-spider. It scrapes papers from TOP conferences and journals. ☆61 Apr 5, 2026 Updated last week
- [ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home ☆18 May 17, 2025 Updated 10 months ago
- 😎 All your need for future is FollowGPT. ☆13 Nov 8, 2023 Updated 2 years ago
- ☆44 Oct 12, 2025 Updated 6 months ago
- A new heuristic to optimize implementations of linear matrices ☆19 Jan 2, 2023 Updated 3 years ago
- ☆16 May 16, 2025 Updated 10 months ago
- ☆11 Jun 11, 2021 Updated 4 years ago
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆21 Mar 7, 2024 Updated 2 years ago
- DRAM/SSD hybrid caching system ☆15 Mar 13, 2025 Updated last year
- Self implementation of course projects for Computer Architecture 2022 Spring ☆11 Sep 17, 2022 Updated 3 years ago
- this repo is mnbvc text quality classification using fastText ☆16 Oct 2, 2023 Updated 2 years ago
- ☆46 Jun 24, 2025 Updated 9 months ago
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C ☆28 Jan 27, 2026 Updated 2 months ago
- A record of reading list on some MLsys popular topic ☆23 Mar 20, 2025 Updated last year
- A lightweight Inference Engine built for block diffusion models ☆43 Updated this week
- RISC-V SingleCycle/Pipeline CPU (lab of ZJU Computer System Series) ☆16 Jul 6, 2023 Updated 2 years ago
- ☆33 Oct 13, 2025 Updated 6 months ago
- ☆13 Oct 8, 2021 Updated 4 years ago