tum-ai / number-token-loss
A regression-alike loss to improve numerical reasoning in language models
☆24 · Updated 2 weeks ago
Alternatives and similar repositories for number-token-loss
Users who are interested in number-token-loss compare it to the libraries listed below.
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆47 · Updated 2 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆66 · Updated 10 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 · Updated last year
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆39 · Updated 3 months ago
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆27 · Updated 3 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆60 · Updated 9 months ago
- Holistic evaluation of multimodal foundation models ☆48 · Updated 11 months ago
- A collection of AWESOME language modeling techniques on tabular data applications. ☆32 · Updated 9 months ago
- [ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning ☆19 · Updated last month
- ☆142 · Updated last year
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆19 · Updated 2 years ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023) ☆33 · Updated last year
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆59 · Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in PyTorch ☆39 · Updated 3 years ago
- ☆48 · Updated 5 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆75 · Updated last year
- ☆51 · Updated 6 months ago
- Recycling diverse models ☆45 · Updated 2 years ago
- Implementation of Infini-Transformer in PyTorch ☆110 · Updated 7 months ago
- ☆27 · Updated last year
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆116 · Updated last month
- ☆28 · Updated 5 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆57 · Updated 7 months ago
- Model Stock: All we need is just a few fine-tuned models ☆119 · Updated 10 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆51 · Updated 2 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 · Updated 9 months ago
- ☆26 · Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆30 · Updated 9 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆39 · Updated last year
- [NeurIPS'24] Official PyTorch implementation for paper "Knowledge Composition using Task Vectors with Learned Anisotropic Scaling" ☆22 · Updated 5 months ago