[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆78 · Jun 23, 2025 · Updated 9 months ago
Alternatives and similar repositories for Monet
Users that are interested in Monet are comparing it to the libraries listed below.
- 5th place solution in the Google American Sign Language Fingerspelling Recognition Competition ☆16 · Sep 19, 2023 · Updated 2 years ago
- Jax/Flax implementation of DeiT and DeiT-III (ViT) ☆19 · Dec 21, 2024 · Updated last year
- Jax/Flax implementation for Korean-language LLM inference on TPUs. ☆12 · Jun 12, 2023 · Updated 2 years ago
- Coverist: an AI service for generating book covers ☆13 · Sep 11, 2022 · Updated 3 years ago
- KWU Real-time Notice Notification App for Android ☆18 · May 10, 2024 · Updated last year
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage ☆16 · Sep 2, 2025 · Updated 7 months ago
- Serving large language model with transformers ☆13 · Oct 18, 2022 · Updated 3 years ago
- Deploy KoGPT with Triton Inference Server ☆14 · Nov 18, 2022 · Updated 3 years ago
- Generate README.md with GPT-3 few-shot learning ☆26 · Oct 19, 2022 · Updated 3 years ago
- KW Alimi: Kwangwoon University notice notifications ☆16 · Jul 23, 2022 · Updated 3 years ago
- ☆13 · Apr 1, 2026 · Updated last week
- Inverse DALL-E for Optical Character Recognition ☆38 · Oct 14, 2022 · Updated 3 years ago
- 1st place solution in the KNOW-based job recommendation algorithm competition ☆44 · Feb 15, 2022 · Updated 4 years ago
- [ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains ☆17 · Mar 4, 2025 · Updated last year
- A tiny easily hackable implementation of a feature dashboard. ☆16 · Oct 21, 2025 · Updated 5 months ago
- Award-winning solution from the DACON AI hackathon ☆22 · Mar 13, 2024 · Updated 2 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆29 · Feb 6, 2026 · Updated 2 months ago
- 1st place solution for LG-AI-Challenge 2022. ☆13 · Jun 6, 2023 · Updated 2 years ago
- [EMNLP 2024] CompAct: Compressing Retrieved Documents Actively for Question Answering ☆39 · Sep 20, 2024 · Updated last year
- Repository with sample code using Apollo's suggested engineering practices ☆15 · Dec 16, 2024 · Updated last year
- ALREADYME.md for backend using Java with Spring Boot ☆25 · Oct 25, 2022 · Updated 3 years ago
- Trains small LMs. Designed for training on SimpleStories ☆12 · Sep 15, 2025 · Updated 6 months ago
- Codalab-Microsoft-COCO-Image-Captioning-Challenge 3rd place solution (06.30.21) ☆23 · Apr 6, 2022 · Updated 4 years ago
- Optimize RandAugment with differentiable operations ☆25 · Jan 25, 2021 · Updated 5 years ago
- Modified to support crosscoder training. ☆26 · Feb 4, 2026 · Updated 2 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- EMNLP 2022: Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework ☆11 · Aug 29, 2024 · Updated last year
- 171st place in Google brain solution ☆10 · Jul 25, 2022 · Updated 3 years ago
- 2nd place solution in the AI competition on predicting credit card user delinquency ☆13 · Dec 5, 2022 · Updated 3 years ago
- Tools for optimizing steering vectors in LLMs. ☆21 · Apr 10, 2025 · Updated last year
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆23 · Jul 4, 2025 · Updated 9 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆248 · Updated this week
- Toss NEXT ML CHALLENGE: repository for submitting the 5th place model in the ad click prediction (CTR) competition ☆26 · Feb 2, 2026 · Updated 2 months ago
- Approximating the joint distribution of language models via MCTS ☆22 · Nov 3, 2024 · Updated last year
- Kotlin Multiplatform App for generating README with AI ☆50 · Oct 25, 2022 · Updated 3 years ago
- Improving Steering Vectors by Targeting Sparse Autoencoder Features ☆27 · Nov 20, 2024 · Updated last year
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆26 · Feb 11, 2025 · Updated last year
- A library for training crosscoders ☆16 · May 28, 2025 · Updated 10 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 8 months ago