[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆75 · Jun 23, 2025 · Updated 8 months ago
Alternatives and similar repositories for Monet
Users who are interested in Monet are comparing it to the libraries listed below
- 5th place solution in the Google American Sign Language Fingerspelling Recognition Competition ☆16 · Sep 19, 2023 · Updated 2 years ago
- a Jax/Flax inference code of StarCoder ☆12 · Jun 12, 2023 · Updated 2 years ago
- 1st place solution in the Kwangwoon University Computer Vision AI Competition. ☆15 · Oct 5, 2022 · Updated 3 years ago
- 12th place solution on G2Net Detecting Continuous Gravitational Waves ☆14 · Jan 4, 2023 · Updated 3 years ago
- Jax/Flax implementation of DeiT and DeiT-III (ViT) ☆19 · Dec 21, 2024 · Updated last year
- A Jax/Flax implementation for Korean LLM inference on TPU. ☆12 · Jun 12, 2023 · Updated 2 years ago
- Coverist - book cover generation AI service ☆13 · Sep 11, 2022 · Updated 3 years ago
- Serving large language model with transformers ☆13 · Oct 18, 2022 · Updated 3 years ago
- KWU Real-time Notice Notification App for Android ☆18 · May 10, 2024 · Updated last year
- Deploy KoGPT with Triton Inference Server ☆14 · Nov 18, 2022 · Updated 3 years ago
- Generate README.md with GPT-3 few-shot learning ☆27 · Oct 19, 2022 · Updated 3 years ago
- ☆13 · Dec 29, 2025 · Updated 2 months ago
- 🧪categorical tabnet research part🧪 ☆13 · Apr 12, 2024 · Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆15 · Oct 21, 2025 · Updated 4 months ago
- KW Alimi - Kwangwoon University notice notifications ☆16 · Jul 23, 2022 · Updated 3 years ago
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage ☆16 · Sep 2, 2025 · Updated 6 months ago
- 1st place solution in the KNOW-based job recommendation algorithm competition. ☆44 · Feb 15, 2022 · Updated 4 years ago
- An interactive search system for similar legal precedents, built with an LLM. ☆27 · Jul 3, 2023 · Updated 2 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆29 · Feb 6, 2026 · Updated 3 weeks ago
- 1st place solution for LG-AI-Challenge 2022. ☆13 · Jun 6, 2023 · Updated 2 years ago
- Excellence Award solution from the Dacon AI Hackathon competition ☆22 · Mar 13, 2024 · Updated last year
- Inverse DALL-E for Optical Character Recognition ☆38 · Oct 14, 2022 · Updated 3 years ago
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 · Oct 7, 2024 · Updated last year
- Learning to Skip the Middle Layers of Transformers ☆17 · Aug 7, 2025 · Updated 6 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- 171st place in Google brain solution ☆10 · Jul 25, 2022 · Updated 3 years ago
- 2nd place solution in the credit card user delinquency prediction AI competition ☆13 · Dec 5, 2022 · Updated 3 years ago
- Tools for optimizing steering vectors in LLMs. ☆20 · Apr 10, 2025 · Updated 10 months ago
- ALREADYME.md for backend using Java with Spring Boot ☆25 · Oct 25, 2022 · Updated 3 years ago
- Codalab-Microsoft-COCO-Image-Captioning-Challenge 3rd place solution (06.30.21) ☆23 · Apr 6, 2022 · Updated 3 years ago
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆21 · Jul 4, 2025 · Updated 7 months ago
- 1st place solution for Samsung AI Challenge 2021 ☆54 · Nov 12, 2021 · Updated 4 years ago
- Dataset and Evaluation Code for the K-QA Benchmark. ☆18 · May 26, 2024 · Updated last year
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 · Jul 24, 2025 · Updated 7 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated 11 months ago
- ☆19 · Mar 25, 2025 · Updated 11 months ago
- Optimize RandAugment with differentiable operations ☆25 · Jan 25, 2021 · Updated 5 years ago
- ☆58 · Nov 19, 2024 · Updated last year
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆243 · Feb 23, 2026 · Updated last week