microsoft / Industrial-Foundation-Models
Dedicated to building industrial foundation models for universal data intelligence across industries.
☆22 Updated last month
Related projects:
- Code implementation of synthetic continued pretraining ☆13 Updated this week
- Official repository for paper "GTA: A Benchmark for General Tool Agents" ☆28 Updated 2 months ago
- Making LLaVA Tiny via MoE-Knowledge Distillation ☆21 Updated 3 weeks ago
- Code for "Merging Text Transformers from Different Initializations" ☆18 Updated last month
- PyTorch implementation of the model from "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" ☆27 Updated this week
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆27 Updated 5 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆33 Updated 6 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆27 Updated 10 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆19 Updated 9 months ago
- Code repo for MathAgent ☆13 Updated 9 months ago
- Code for "The Expressive Power of Low-Rank Adaptation". ☆17 Updated 5 months ago
- Understanding the correlation between different LLM benchmarks ☆27 Updated 8 months ago
- Linear Attention Sequence Parallelism (LASP) ☆64 Updated 3 months ago
- Open-LLM-Leaderboard: Open-Style Question Evaluation. Paper at https://arxiv.org/abs/2406.07545 ☆28 Updated 2 months ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆20 Updated last year
- Using FlexAttention to compute attention with different masking patterns ☆28 Updated last week
- A Closer Look into Mixture-of-Experts in Large Language Models ☆38 Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆29 Updated 7 months ago
- We introduce EMMET and unify model editing with popular algorithms ROME and MEMIT. ☆11 Updated 3 weeks ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated last year
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models" ☆35 Updated 8 months ago
- [ICML 2023] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning ☆38 Updated last year
- Minimum Description Length probing for neural network representations ☆15 Updated 11 months ago