[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆76 · Jun 23, 2025 · Updated 8 months ago
Alternatives and similar repositories for Monet
Users who are interested in Monet are comparing it to the libraries listed below.
- 5th place solution in the Google American Sign Language Fingerspelling Recognition Competition ☆16 · Sep 19, 2023 · Updated 2 years ago
- Jax/Flax inference code for StarCoder ☆12 · Jun 12, 2023 · Updated 2 years ago
- Jax/Flax implementation of DeiT and DeiT-III (ViT) ☆19 · Dec 21, 2024 · Updated last year
- A Jax/Flax implementation for Korean LLM inference on TPU. ☆12 · Jun 12, 2023 · Updated 2 years ago
- Coverlist: an AI service for generating book covers ☆13 · Sep 11, 2022 · Updated 3 years ago
- KWU Real-time Notice Notification App for Android ☆18 · May 10, 2024 · Updated last year
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage ☆16 · Sep 2, 2025 · Updated 6 months ago
- Deploy KoGPT with Triton Inference Server ☆14 · Nov 18, 2022 · Updated 3 years ago
- KW Notifier: Kwangwoon University notice alerts ☆16 · Jul 23, 2022 · Updated 3 years ago
- 🧪 categorical tabnet research part 🧪 ☆13 · Apr 12, 2024 · Updated last year
- ☆13 · Dec 29, 2025 · Updated 2 months ago
- A conversational search system for similar legal precedents, built with LLMs. ☆28 · Jul 3, 2023 · Updated 2 years ago
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 · Oct 7, 2024 · Updated last year
- 1st place solution for the KNOW-based job recommendation algorithm competition. ☆44 · Feb 15, 2022 · Updated 4 years ago
- [ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains ☆17 · Mar 4, 2025 · Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆16 · Oct 21, 2025 · Updated 5 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆29 · Feb 6, 2026 · Updated last month
- 1st place solution for LG-AI-Challenge 2022. ☆13 · Jun 6, 2023 · Updated 2 years ago
- 1st place solution for Samsung AI Challenge 2021. ☆54 · Nov 12, 2021 · Updated 4 years ago
- Trains small LMs. Designed for training on SimpleStories ☆12 · Sep 15, 2025 · Updated 6 months ago
- Codalab-Microsoft-COCO-Image-Captioning-Challenge 3rd place solution (06.30.21) ☆23 · Apr 6, 2022 · Updated 3 years ago
- Optimize RandAugment with differentiable operations ☆25 · Jan 25, 2021 · Updated 5 years ago
- Modified to support crosscoder training. ☆25 · Feb 4, 2026 · Updated last month
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- EMNLP 2022: Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework ☆11 · Aug 29, 2024 · Updated last year
- 2nd place solution for the credit card user delinquency prediction AI competition ☆13 · Dec 5, 2022 · Updated 3 years ago
- Tools for optimizing steering vectors in LLMs. ☆20 · Apr 10, 2025 · Updated 11 months ago
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆22 · Jul 4, 2025 · Updated 8 months ago
- Toss NEXT ML CHALLENGE: 5th place model submission repository for the ad click prediction (CTR) competition ☆26 · Feb 2, 2026 · Updated last month
- Approximating the joint distribution of language models via MCTS ☆22 · Nov 3, 2024 · Updated last year
- Improving Steering Vectors by Targeting Sparse Autoencoder Features ☆27 · Nov 20, 2024 · Updated last year
- Kotlin Multiplatform App for generating README with AI ☆51 · Oct 25, 2022 · Updated 3 years ago
- ☆58 · Nov 19, 2024 · Updated last year
- A library for training crosscoders ☆16 · May 28, 2025 · Updated 9 months ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆26 · Feb 11, 2025 · Updated last year
- ☆29 · May 24, 2024 · Updated last year
- Learning from Negative samples for Biomedical Generative Entity Linking ☆17 · May 25, 2025 · Updated 9 months ago
- ☆35 · Feb 20, 2025 · Updated last year
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆90 · Mar 18, 2025 · Updated last year