☆56Sep 17, 2025Updated 5 months ago
Alternatives and similar repositories for openCLT
Users that are interested in openCLT are comparing it to the libraries listed below
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆21Feb 19, 2026Updated last week
- ☆29Nov 30, 2025Updated 3 months ago
- ☆25Feb 20, 2026Updated last week
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size☆19May 19, 2019Updated 6 years ago
- ☆46Jul 21, 2025Updated 7 months ago
- Benchmarking Optimizers for LLM Pretraining☆52Dec 30, 2025Updated 2 months ago
- ☆25Jun 16, 2024Updated last year
- ☆35Jul 5, 2023Updated 2 years ago
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…☆84Jan 12, 2025Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- ACL 2023 *oral* paper "MGR: Multi-generator based Rationalization"☆10Nov 21, 2024Updated last year
- Code for "What really matters in matrix-whitening optimizers?"☆21Oct 31, 2025Updated 4 months ago
- This is a repository for RM2021 Software tutorial☆11Nov 4, 2020Updated 5 years ago
- Pytorch routines for (Ker)nel (Mac)hines☆10Oct 10, 2025Updated 4 months ago
- Simple MoE - Day 17 of 365 Days of Repos☆16Jan 17, 2025Updated last year
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries".☆12Jan 9, 2025Updated last year
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework☆11Oct 10, 2025Updated 4 months ago
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation☆14Aug 19, 2025Updated 6 months ago
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…☆13Nov 17, 2024Updated last year
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis☆12Nov 17, 2024Updated last year
- ☆12Feb 28, 2025Updated last year
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…☆12Mar 7, 2025Updated 11 months ago
- ☆10Mar 4, 2024Updated last year
- Control LLM generation format efficiently. A simple version of microsoft/aici in vllm and transformers☆14Jun 7, 2024Updated last year
- Tools for optimizing steering vectors in LLMs.☆20Apr 10, 2025Updated 10 months ago
- ☆11Jun 20, 2023Updated 2 years ago
- CUDA implementation of Multidimensional Scaling☆15May 8, 2021Updated 4 years ago
- C++-Animation-(Standard-Template-Library)-Engine, or CASTLE for short, is a C++ plotting and animation engine created by BiliBili uploader …☆11Jan 17, 2021Updated 5 years ago
- These are the notebooks for the videos on my Bilibili channel (https://space.bilibili.com/32773300?spm_id_from=333.1007.0.0)☆30Nov 6, 2025Updated 3 months ago
- Code for ICLR 2023 Harnessing Out-Of-Distribution Examples via Augmenting Content and Style☆13Jul 3, 2023Updated 2 years ago
- The Infibench variant of bigcode-evaluation-harness --- a framework for the evaluation of autoregressive code generation language models.☆14Oct 19, 2024Updated last year
- A resource repository for representation engineering in large language models☆148Nov 14, 2024Updated last year
- 【IEEE TPAMI 2025】Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding☆30Jan 20, 2026Updated last month
- ☆55Jan 28, 2026Updated last month
- A python package for text sanitization with differential privacy☆39Dec 25, 2025Updated 2 months ago
- ☆17Feb 4, 2025Updated last year
- Reproducing GPT on the TinyStories dataset☆19Jan 18, 2024Updated 2 years ago
- Computing the greatest common divisor with transformers, source code for the paper https://arxiv.org/abs/2308.15594☆14Aug 11, 2025Updated 6 months ago