(ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models
☆42Sep 25, 2025Updated 8 months ago
Alternatives and similar repositories for Super-Experts-Profilling
Users that are interested in Super-Experts-Profilling are comparing it to the libraries listed below.
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving.☆50Apr 22, 2026Updated last month
- Security-native LLM system for AI-generated application security.☆253Jun 4, 2026Updated last week
- What do CLIP Vision Transformers learn? Feature Visualization can show you!☆15Aug 29, 2024Updated last year
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025☆25Nov 13, 2025Updated 7 months ago
- DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations (CVPR 2025)☆14Jun 1, 2025Updated last year
- ☆16Sep 4, 2025Updated 9 months ago
- [ICLR 2025] FLAT: LLM Unlearning via Loss Adjustment with Only Forget Data☆14Feb 26, 2025Updated last year
- ☆16Apr 21, 2025Updated last year
- ☆26Jan 5, 2026Updated 5 months ago
- A model-agnostic function to directly remove specified layers from an LLM☆10May 23, 2024Updated 2 years ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"☆72Apr 4, 2026Updated 2 months ago
- Implementation of the Implicit Knowledge Extraction Attack.☆23Apr 17, 2026Updated last month
- [NeurIPS 2025] Official Implementation for "Glocal Information Bottleneck for Time Series Imputation"☆16Nov 4, 2025Updated 7 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆38Feb 22, 2025Updated last year
- ☆27Oct 27, 2025Updated 7 months ago
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization☆24Oct 5, 2025Updated 8 months ago
- Focused Papers, Delivered Simply :)☆55Dec 25, 2025Updated 5 months ago
- A framework for steering MoE models by detecting and controlling behavior-linked experts.☆35Sep 12, 2025Updated 9 months ago
- self-adaptive in-context learning☆45May 5, 2023Updated 3 years ago
- ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark☆58Sep 2, 2025Updated 9 months ago
- LLM KV Cache compression - K+V dual compression, 73-99% VRAM savings, zero accuracy loss☆57Mar 30, 2026Updated 2 months ago
- ☆31Mar 16, 2025Updated last year
- A Scrapy-based web spider that scrapes papers from top conferences and journals.☆64Apr 5, 2026Updated 2 months ago
- ☆13Mar 5, 2024Updated 2 years ago
- [ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home☆19May 17, 2025Updated last year
- Universal preflight security scanner for AI coding agents — Detects hooks injection, credential exfiltration & backdoors in .cursorrules,…☆72May 29, 2026Updated 2 weeks ago
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆27May 16, 2026Updated 3 weeks ago
- A new heuristic to optimize implementations of linear matrices☆20Jan 2, 2023Updated 3 years ago
- Hand-built Llama, for personal learning☆16May 21, 2024Updated 2 years ago
- ☆16May 16, 2025Updated last year
- An interactive attention visualization and intervention tool for LLM Decode Stage.☆48Jan 6, 2026Updated 5 months ago
- Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks☆78May 7, 2026Updated last month
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…☆21Mar 7, 2024Updated 2 years ago
- Large Language Models in Molecular Embeddings☆12May 1, 2024Updated 2 years ago
- Self implementation of course projects for Computer Architecture 2022 Spring☆11Sep 17, 2022Updated 3 years ago
- ☆47Jun 24, 2025Updated 11 months ago
- ☆15Jun 14, 2022Updated 4 years ago
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C☆29Jan 27, 2026Updated 4 months ago
- A lightweight Inference Engine built for block diffusion models☆46Apr 12, 2026Updated 2 months ago