[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆78 · Jun 23, 2025 · Updated 10 months ago
Alternatives and similar repositories for Monet
Users that are interested in Monet are comparing it to the libraries listed below.
- 5th place solution in the Google American Sign Language Fingerspelling Recognition Competition ☆16 · Sep 19, 2023 · Updated 2 years ago
- 1st place solution in the Kwangwoon University Computer Vision AI Competition. ☆15 · Oct 5, 2022 · Updated 3 years ago
- 12th place solution on G2Net Detecting Continuous Gravitational Waves ☆14 · Jan 4, 2023 · Updated 3 years ago
- Jax/Flax implementation of DeiT and DeiT-III (ViT) ☆19 · Dec 21, 2024 · Updated last year
- A Jax/Flax implementation for Korean LLM inference on TPU. ☆12 · Jun 12, 2023 · Updated 2 years ago
- Coverist (커버리스트): an AI service that generates book covers ☆13 · Sep 11, 2022 · Updated 3 years ago
- KWU Real-time Notice Notification App for Android ☆18 · May 10, 2024 · Updated last year
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage ☆16 · Sep 2, 2025 · Updated 8 months ago
- Serving large language model with transformers ☆13 · Oct 18, 2022 · Updated 3 years ago
- Generate README.md with GPT-3 few-shot learning ☆26 · Oct 19, 2022 · Updated 3 years ago
- KW Alimi (KW 알리미): Kwangwoon University notice notifications ☆16 · Jul 23, 2022 · Updated 3 years ago
- 🧪 categorical tabnet research part 🧪 ☆13 · Apr 12, 2024 · Updated 2 years ago
- Inverse DALL-E for Optical Character Recognition ☆38 · Oct 14, 2022 · Updated 3 years ago
- An LLM-based conversational system for searching similar legal precedents. ☆27 · Jul 3, 2023 · Updated 2 years ago
- Dataset and Evaluation Code for the K-QA Benchmark. ☆18 · May 26, 2024 · Updated last year
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 · Oct 7, 2024 · Updated last year
- [ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains ☆17 · Mar 4, 2025 · Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆16 · Oct 21, 2025 · Updated 6 months ago
- 1st place solution for LG-AI-Challenge 2022. ☆13 · Jun 6, 2023 · Updated 2 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆30 · Feb 6, 2026 · Updated 2 months ago
- [EMNLP 2024] CompAct: Compressing Retrieved Documents Actively for Question Answering ☆38 · Sep 20, 2024 · Updated last year
- ALREADYME.md for backend using Java with Spring Boot ☆25 · Oct 25, 2022 · Updated 3 years ago
- Optimize RandAugment with differentiable operations ☆25 · Jan 25, 2021 · Updated 5 years ago
- Modified to support crosscoder training. ☆27 · Feb 4, 2026 · Updated 2 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- Trains small LMs. Designed for training on SimpleStories ☆13 · Sep 15, 2025 · Updated 7 months ago
- 171st place in Google brain solution ☆10 · Jul 25, 2022 · Updated 3 years ago
- 2nd place solution in the credit card user delinquency prediction AI competition ☆12 · Dec 5, 2022 · Updated 3 years ago
- Tools for optimizing steering vectors in LLMs. ☆21 · Apr 10, 2025 · Updated last year
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆23 · Jul 4, 2025 · Updated 9 months ago
- Toss NEXT ML CHALLENGE: model submission repository for the 5th place entry in the ad click prediction (CTR) competition ☆25 · Feb 2, 2026 · Updated 3 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆253 · Updated this week
- Approximating the joint distribution of language models via MCTS ☆22 · Nov 3, 2024 · Updated last year
- Kotlin Multiplatform App for generating README with AI ☆50 · Oct 25, 2022 · Updated 3 years ago
- ☆58 · Nov 19, 2024 · Updated last year
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆26 · Feb 11, 2025 · Updated last year
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 9 months ago
- ☆29 · May 24, 2024 · Updated last year
- Learning from Negative samples for Biomedical Generative Entity Linking ☆18 · May 25, 2025 · Updated 11 months ago