☆31Dec 31, 2025Updated 3 months ago
Alternatives and similar repositories for hybrid-distillation
Users that are interested in hybrid-distillation are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆241Nov 19, 2025Updated 4 months ago
- ☆58Jul 9, 2024Updated last year
- ☆70Jul 8, 2025Updated 9 months ago
- Stick-breaking attention☆63Jul 1, 2025Updated 9 months ago
- ☆134Jun 6, 2025Updated 10 months ago
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Expanding linear RNN state-transition matrix eigenvalues to include negatives improves state-tracking tasks and language modeling without…☆21Mar 15, 2025Updated last year
- Code and data for paper "(How) do Language Models Track State?"☆22Mar 31, 2025Updated last year
- Code for paper "ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection" (MobiSys'23)☆14Nov 1, 2023Updated 2 years ago
- ☆13Nov 3, 2024Updated last year
- Class materials, homeworks and videos for probation preparation.☆23Feb 3, 2026Updated 2 months ago
- ☆12Jan 29, 2021Updated 5 years ago
- code for paper "Accessing higher dimensions for unsupervised word translation"☆22Jun 26, 2023Updated 2 years ago
- ☆45Nov 1, 2025Updated 5 months ago
- Source code and dataset for the paper 'Saamayik: A Benchmark and Dataset for English-Sanskrit Translation'☆15Oct 11, 2025Updated 6 months ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Cross-lingual learning in scene text recognition (ICASSP2024)☆18Sep 29, 2024Updated last year
- This is the official implementation for paper "On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond".☆20Nov 17, 2025Updated 5 months ago
- Engine for collecting, uploading, and downloading model activations☆28Apr 2, 2025Updated last year
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆13Jun 15, 2024Updated last year
- Reproducing R1 for Code with Reliable Rewards☆12Apr 9, 2025Updated last year
- ☆13Feb 1, 2024Updated 2 years ago
- AES - Ancient Egyptian Sentences; Corpus of Ancient Egyptian sentences for corpus-linguistic research☆10May 18, 2021Updated 4 years ago
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics☆72Mar 26, 2026Updated 3 weeks ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ☆48Jun 16, 2025Updated 10 months ago
- A public dataset containing chord/beat annotation from a music game named 'osu!'.☆11Oct 17, 2017Updated 8 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence☆10Mar 2, 2025Updated last year
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models☆46Jul 17, 2025Updated 9 months ago
- An Empirical Comparison of Unsupervised Constituency Parsing Methods☆14Aug 15, 2021Updated 4 years ago
- [CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"☆35Nov 11, 2025Updated 5 months ago
- ☆14Jul 13, 2025Updated 9 months ago
- Code for "AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction", ACL 2023☆13May 19, 2023Updated 2 years ago
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning☆10Nov 3, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Fully open reproduction of DeepSeek-R1☆11Mar 24, 2025Updated last year
- Cairo lua bindings with extensions for torch☆15Jun 12, 2016Updated 9 years ago
- ☆14Dec 25, 2024Updated last year
- Repository for the deep-learning framework DIVA-DAF which is build with historical document image analysis in mind.☆18Nov 7, 2024Updated last year
- "Learning Rhyming Constraints using Structured Adversaries. Jhamtani H., Mehta S., Carbonell J., Berg-Kirkpatrick T. EMNLP-IJCNLP (Short …☆11Mar 17, 2020Updated 6 years ago
- InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models☆103Feb 2, 2026Updated 2 months ago
- [ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization☆15Apr 21, 2025Updated 11 months ago