Code for "Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning"
☆20 · Jul 16, 2024 · Updated last year
Alternatives and similar repositories for MCL
Users that are interested in MCL are comparing it to the libraries listed below.
- Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts · ☆41 · Sep 29, 2024 · Updated last year
- CLIP-MoE: Mixture of Experts for CLIP · ☆58 · Oct 10, 2024 · Updated last year
- The official code repository for the FullFront benchmark · ☆27 · May 16, 2025 · Updated 10 months ago
- [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints · ☆79 · Jul 10, 2025 · Updated 9 months ago
- [ICLR2026] Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping · ☆63 · May 22, 2025 · Updated 10 months ago
- Open-Pandora: On-the-fly Control Video Generation · ☆35 · Nov 28, 2024 · Updated last year
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. · ☆90 · Feb 15, 2025 · Updated last year
- ☆69 · Jul 8, 2025 · Updated 9 months ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024) · ☆25 · Oct 23, 2024 · Updated last year
- [ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts" · ☆10 · Jul 1, 2024 · Updated last year
- Test-time preference optimization (ICML 2025). · ☆182 · May 8, 2025 · Updated 11 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning · ☆71 · Jul 13, 2025 · Updated 8 months ago
- [CVPR' 25] Official repo for From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Cal… · ☆22 · Jun 6, 2025 · Updated 10 months ago
- [ICML 2025 Oral] The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchma… · ☆70 · Jul 17, 2025 · Updated 8 months ago
- Adversarial Category Alignment Network for Cross-domain Sentiment Classification (NAACL 2019) · ☆23 · Jul 4, 2019 · Updated 6 years ago
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks · ☆169 · Mar 2, 2026 · Updated last month
- ☆129 · Feb 4, 2026 · Updated 2 months ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" · ☆18 · Mar 15, 2024 · Updated 2 years ago
- ☆18 · Aug 7, 2024 · Updated last year
- A Theano implementation of a CNN DSEBM (deep structured energy-based model) described in https://arxiv.org/pdf/1605.07717v2.pdf · ☆10 · Oct 13, 2016 · Updated 9 years ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" · ☆43 · Oct 9, 2025 · Updated 6 months ago
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and… · ☆67 · May 16, 2025 · Updated 10 months ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … · ☆18 · Dec 30, 2021 · Updated 4 years ago
- [TMLR] "Adversarial Feature Augmentation and Normalization for Visual Recognition", Tianlong Chen, Yu Cheng, Zhe Gan, Jianfeng Wang, Liju… · ☆21 · Nov 27, 2022 · Updated 3 years ago
- [CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong C… · ☆25 · Mar 9, 2022 · Updated 4 years ago
- ☆18 · Nov 5, 2016 · Updated 9 years ago
- LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training · ☆93 · Dec 3, 2024 · Updated last year
- Official Implementation (Pytorch) of the "Representation Shift: Unifying Token Compression with FlashAttention", ICCV 2025 · ☆33 · Feb 22, 2026 · Updated last month
- [CVPR2025] Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think · ☆23 · Jul 1, 2025 · Updated 9 months ago
- ☆40 · Jul 20, 2024 · Updated last year
- A video retrieval dataset How2R and a video QA dataset How2QA · ☆24 · Oct 15, 2020 · Updated 5 years ago
- ☆13 · Oct 21, 2021 · Updated 4 years ago
- Simplified implementation for Domain Separation Networks · ☆13 · Feb 11, 2023 · Updated 3 years ago
- [NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward · ☆36 · Sep 19, 2025 · Updated 6 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" · ☆436 · Mar 20, 2026 · Updated 3 weeks ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022) · ☆18 · Dec 8, 2022 · Updated 3 years ago
- Official codebase for CuGRO: Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay · ☆33 · Apr 14, 2024 · Updated last year
- [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs". · ☆55 · Feb 23, 2026 · Updated last month
- Domain Adaptive Text Style Transfer, EMNLP 2019 · ☆70 · Oct 15, 2019 · Updated 6 years ago