mlbio-epfl / joint-inference
[ICLR 2025] Large (Vision) Language Models are Unsupervised In-Context Learners
☆22 · Updated 6 months ago
Alternatives and similar repositories for joint-inference
Users who are interested in joint-inference are comparing it to the libraries listed below.
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆130 · Updated 5 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆49 · Updated 7 months ago
- ☆19 · Updated 8 months ago
- Code for Heima ☆58 · Updated 7 months ago
- ☆50 · Updated 10 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆99 · Updated last month
- Geometric-Mean Policy Optimization ☆95 · Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆115 · Updated last month
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆169 · Updated 2 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆145 · Updated 5 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆212 · Updated last month
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆70 · Updated 6 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆111 · Updated 2 weeks ago
- Unofficial Implementation of Selective Attention Transformer ☆18 · Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆50 · Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆55 · Updated 2 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 · Updated 7 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling ☆62 · Updated 9 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆163 · Updated last month
- dParallel: Learnable Parallel Decoding for dLLMs ☆49 · Updated 2 months ago
- ☆95 · Updated last week
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆56 · Updated 6 months ago
- [AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs ☆48 · Updated last week
- ☆105 · Updated 6 months ago
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆67 · Updated 8 months ago
- The code implementation of Symbolic-MoE ☆45 · Updated 3 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆116 · Updated last year
- ☆17 · Updated 11 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 · Updated last week
- ☆48 · Updated 3 months ago