Code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
★20 · Jul 16, 2024 · Updated last year
Alternatives and similar repositories for MCL
Users that are interested in MCL are comparing it to the libraries listed below.
- Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts ★41 · Sep 29, 2024 · Updated last year
- CLIP-MoE: Mixture of Experts for CLIP ★58 · Oct 10, 2024 · Updated last year
- The official code repository for the FullFront benchmark ★27 · May 16, 2025 · Updated last year
- [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ★79 · Jul 10, 2025 · Updated 10 months ago
- [ICLR2026] Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ★65 · May 22, 2025 · Updated 11 months ago
- Open-Pandora: On-the-fly Control Video Generation ★35 · Nov 28, 2024 · Updated last year
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ★91 · Feb 15, 2025 · Updated last year
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024) ★26 · Oct 23, 2024 · Updated last year
- Test-time preference optimization (ICML 2025). ★182 · May 8, 2025 · Updated last year
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ★74 · Jul 13, 2025 · Updated 10 months ago
- [CVPR' 25] Official repo for From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Cal… ★22 · Jun 6, 2025 · Updated 11 months ago
- The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization ★19 · Mar 7, 2025 · Updated last year
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ★173 · May 12, 2026 · Updated last week
- ★131 · Feb 4, 2026 · Updated 3 months ago
- ★137 · Jun 6, 2025 · Updated 11 months ago
- [KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for Kernel Generations [ICML… ★172 · Mar 29, 2026 · Updated last month
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ★18 · Mar 15, 2024 · Updated 2 years ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" ★46 · Oct 9, 2025 · Updated 7 months ago
- A Theano implementation of a CNN DSEBM (deep structured energy-based model) described in https://arxiv.org/pdf/1605.07717v2.pdf ★10 · Oct 13, 2016 · Updated 9 years ago
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and… ★68 · May 16, 2025 · Updated last year
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ★18 · Dec 30, 2021 · Updated 4 years ago
- [TMLR] "Adversarial Feature Augmentation and Normalization for Visual Recognition", Tianlong Chen, Yu Cheng, Zhe Gan, Jianfeng Wang, Liju… ★21 · Nov 27, 2022 · Updated 3 years ago
- Caffe/Neon prototxt training file for our Neurocomputing2017 work: Fuzzy Quantitative Deep Compression Network ★11 · May 30, 2018 · Updated 7 years ago
- Code for the paper "No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations" ★12 · Oct 31, 2024 · Updated last year
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025 ★30 · Apr 8, 2025 · Updated last year
- ★18 · Nov 5, 2016 · Updated 9 years ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024) ★32 · Jul 3, 2024 · Updated last year
- LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ★93 · Dec 3, 2024 · Updated last year
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ★376 · Jun 1, 2025 · Updated 11 months ago
- [CVPR2025] Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think ★24 · Jul 1, 2025 · Updated 10 months ago
- A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models (ACL 2022) ★42 · May 13, 2022 · Updated 4 years ago
- ★40 · Jul 20, 2024 · Updated last year
- Repository for "Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale" ★14 · Apr 30, 2024 · Updated 2 years ago
- ★16 · Sep 2, 2023 · Updated 2 years ago
- A video retrieval dataset How2R and a video QA dataset How2QA ★24 · Oct 15, 2020 · Updated 5 years ago
- TKDE'23: A Survey and Experimental Study on Privacy-Preserving Trajectory Data Publishing ★12 · May 5, 2023 · Updated 3 years ago
- [ACM MM 2025 🔥🔥] MIRA: A first-of-its-kind medical RAG framework that fuses image features and retrieved knowledge with dynamic contex… ★23 · Aug 28, 2025 · Updated 8 months ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022) ★18 · Dec 8, 2022 · Updated 3 years ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" ★450 · Mar 20, 2026 · Updated 2 months ago