Study group / research-padawan community for the misfits
☆34 Oct 15, 2025 Updated 8 months ago
Alternatives and similar repositories for ml_misfits
Users who are interested in ml_misfits are comparing it to the libraries listed below.
- This code is used to populate the "ODS jobs dump" Telegram bot, and it can be used for any other dumped Slack channel ☆14 Sep 12, 2022 Updated 3 years ago
- Skoltech NLA 2024 course. ☆39 Dec 10, 2024 Updated last year
- The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization ☆19 Mar 7, 2025 Updated last year
- ☆14 Jul 13, 2025 Updated 11 months ago
- Official PyTorch implementation of Chromatic Graph Transformers ☆10 Jun 14, 2023 Updated 3 years ago
- Clustered Compositional Embeddings ☆13 Oct 25, 2023 Updated 2 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 Dec 30, 2024 Updated last year
- Don't just regulate gradients like in Muon, regulate the weights too ☆32 Jul 30, 2025 Updated 10 months ago
- Unofficial Scalable-Softmax Is Superior for Attention ☆20 May 30, 2025 Updated last year
- Application to generate an RSS feed from your GitHub notifications. ☆13 Dec 8, 2022 Updated 3 years ago
- ☆12 Jan 17, 2024 Updated 2 years ago
- Find context neurons in Pythia models. ☆13 Jun 13, 2023 Updated 3 years ago
- Open Statistics and Probability Theory course ☆22 Aug 31, 2025 Updated 9 months ago
- Least Squares Regression for subspace clustering ☆11 May 27, 2018 Updated 8 years ago
- ☆12 Mar 19, 2021 Updated 5 years ago
- Personal solutions to the Triton Puzzles ☆21 Jul 18, 2024 Updated last year
- ☆17 Jun 28, 2025 Updated 11 months ago
- Code for the paper "The emergence of clusters in self-attention dynamics". ☆18 Dec 18, 2023 Updated 2 years ago
- PyTorch implementation of "Towards k-means-friendly spaces: Simultaneous deep learning and clustering," Bo Yang et al., 2017. ☆17 Jan 15, 2021 Updated 5 years ago
- Code for PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization, NeurIPS 2022 ☆18 Nov 23, 2022 Updated 3 years ago
- Code for verifying deep neural feature ansatz ☆22 May 3, 2023 Updated 3 years ago
- ☆13 May 21, 2024 Updated 2 years ago
- Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA. ☆14 Feb 8, 2023 Updated 3 years ago
- 1.2% test error on MNIST using only least squares and numpy calls. ☆22 Sep 13, 2023 Updated 2 years ago
- u-MPS implementation and experimentation code used in the paper Tensor Networks for Probabilistic Sequence Modeling (https://arxiv.org/ab… ☆19 Jul 2, 2020 Updated 5 years ago
- Possibly useful materials for learning the RWKV language model. ☆26 Jun 8, 2023 Updated 3 years ago
- Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode… ☆20 Nov 19, 2024 Updated last year
- Linear Algebra course taught at HSE in 2020/2021 (in Russian) ☆32 Apr 25, 2022 Updated 4 years ago
- Schema-based HTTP client powered by axios. Written in TypeScript. Heavily inspired by AngularJS' $resource. ☆17 Feb 17, 2025 Updated last year
- Official Repository for ICML 2023 paper "Can Neural Network Memorization Be Localized?" ☆21 Oct 26, 2023 Updated 2 years ago
- Node.js utility to sync files to Amazon S3 and invalidate CloudFront distributions. ☆14 Oct 4, 2022 Updated 3 years ago
- Distributed Optimization: Analysis and Synthesis via Circuits ☆22 Mar 10, 2026 Updated 3 months ago
- Materials for my talks ☆15 Oct 30, 2021 Updated 4 years ago
- PyTorch code for experiments on Linear Transformers ☆24 Jan 12, 2024 Updated 2 years ago
- Experiments on the impact of depth in transformers and SSMs. ☆41 Oct 23, 2025 Updated 7 months ago
- ☆54 Nov 16, 2023 Updated 2 years ago
- ☆32 Apr 21, 2024 Updated 2 years ago
- Flash Attention in 300-500 lines of CUDA/C++ ☆37 Aug 22, 2025 Updated 9 months ago
- PyTorch implementation for Neural Additive Models ☆25 Dec 2, 2020 Updated 5 years ago