Wonderful Matrices to Build Small Language Models
☆44Feb 15, 2025Updated last year
Alternatives and similar repositories for WonderfulMatrices
Users that are interested in WonderfulMatrices are comparing it to the libraries listed below
- ☆16Jun 10, 2025Updated 9 months ago
- Doge Family of Small Language Models☆185Jan 6, 2026Updated 2 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Jul 12, 2024Updated last year
- 3D Traffic Light & Sign Dataset☆25Mar 24, 2025Updated 11 months ago
- This is a repository for paper titled, PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Plann…☆14Nov 3, 2023Updated 2 years ago
- [ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective☆76Jun 25, 2025Updated 8 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT)☆128Feb 10, 2025Updated last year
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers☆114Sep 13, 2024Updated last year
- (CVPR 2025) Official implementation of DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆27Aug 23, 2025Updated 6 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"☆147Oct 13, 2025Updated 5 months ago
- ☆88Jan 10, 2024Updated 2 years ago
- Generate Python Package with Simple Prompts☆75Nov 22, 2024Updated last year
- The official code of TAMP - Imaging foundation model for universal enhancement of non-ideal measurement CT☆24Dec 15, 2025Updated 3 months ago
- Synthetic Data Engine 💎☆73Mar 2, 2026Updated 2 weeks ago
- XmodelLM☆38Nov 19, 2024Updated last year
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio…☆45Dec 6, 2025Updated 3 months ago
- [NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Underst…☆23Mar 16, 2025Updated last year
- Official PyTorch Implementation of Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos☆11Feb 10, 2026Updated last month
- ☆46Aug 25, 2024Updated last year
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning☆52Oct 17, 2025Updated 5 months ago
- ☆34Dec 13, 2025Updated 3 months ago
- Official implementation of Add-SD: Rational Generation without Manual Reference.☆28Aug 19, 2024Updated last year
- Emotion-Aware Dialogue Response Generation by Multi-Task Learning☆13Jan 22, 2022Updated 4 years ago
- (Integrated package) One-click use of Modelbest's latest MiniCPM-o 2.6 multimodal model for video, voice, and text conversations.☆15Jul 13, 2025Updated 8 months ago
- (AAAI'25) Training-and-prompt Free General Painterly Image Harmonization Using image-wise attention sharing☆61Dec 17, 2024Updated last year
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations…☆29Dec 5, 2023Updated 2 years ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).☆82Mar 18, 2024Updated 2 years ago
- CJOBS☆14Dec 22, 2018Updated 7 years ago
- ☆13Mar 4, 2026Updated 2 weeks ago
- ☆23Nov 26, 2024Updated last year
- The official repo for "GraMMaR: Ground-aware Motion Model for 3D Human Motion Reconstruction"☆29Mar 29, 2024Updated last year
- Federated Transformer (NeurIPS 24): a framework to enhance the performance of multi-party Vertical Federated Learning involving fuzzy ide…☆42Dec 14, 2024Updated last year
- Towards Medical Small Language Models with Self-Evolved Slow Thinking☆87Nov 11, 2025Updated 4 months ago
- Montaging for microscopy imaging files☆30Mar 12, 2026Updated last week
- Adaptive Inter-Class Similarity Distillation for Semantic Segmentation (MTAP 2025)☆29Nov 14, 2025Updated 4 months ago
- Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling☆31Feb 6, 2026Updated last month
- Unauthenticated enumeration of AWS IAM Roles.☆26Sep 7, 2025Updated 6 months ago
- Agent for DataFlow: Automatic Data Workflow Design☆56Mar 12, 2026Updated last week