suu990901 / LLaMA-MiLe-Loss
Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
☆67Feb 18, 2025Updated 11 months ago
Alternatives and similar repositories for LLaMA-MiLe-Loss
Users who are interested in LLaMA-MiLe-Loss are comparing it to the libraries listed below.
- PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation [NeurIPS 2025]☆18Oct 11, 2025Updated 4 months ago
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting☆24Jul 30, 2024Updated last year
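- A one-click project for getting small-parameter large language models to do role-play; everything from data construction to training is included in this project☆25Mar 31, 2024Updated last year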
- List of papers about Large Multimodal model☆31May 31, 2025Updated 8 months ago
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed.☆14Nov 11, 2025Updated 3 months ago
- [ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models☆42Oct 28, 2025Updated 3 months ago
- ☆11May 24, 2024Updated last year
- ☆11Oct 25, 2024Updated last year
- ☆87Dec 29, 2023Updated 2 years ago
- ☆10Dec 10, 2023Updated 2 years ago
- Football match prediction☆10Mar 9, 2021Updated 4 years ago
- ☆10Feb 17, 2019Updated 6 years ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆81Dec 25, 2025Updated last month
- AAAI2025☆11Apr 18, 2025Updated 9 months ago
- Predicting treatment effects from RCTs (Circulation: CQO 2019).☆10Jun 21, 2022Updated 3 years ago
- Code for paper "Towards Efficient Pareto Set Approximation via Weight-Ensembling Mixture of Experts"☆11Sep 13, 2024Updated last year
- ☆11Aug 17, 2021Updated 4 years ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"☆14Mar 28, 2024Updated last year
- wePoker is a multi-player poker game for Android☆11Mar 20, 2013Updated 12 years ago
- Service for Bert model to Vector. An efficient Text-To-Vector service that supports multi-GPU, multi-worker, and multi-client calls and works out of the box.☆12May 24, 2022Updated 3 years ago
- The code implementation of MuScleLoRA (Accepted in ACL 2024)☆10Dec 1, 2024Updated last year
- Causal Effect Inference for Structured Treatments (SIN) (NeurIPS 2021)☆42Apr 26, 2022Updated 3 years ago
- Official PyTorch code for "Vector Quantization Prompting for Continual Learning (NeurIPS2024)".☆10Oct 16, 2024Updated last year
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
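- A web app based on react16, react-router4, and redux4. Its main features resemble WeChat Moments: post editing, image upload, image preview, likes, comments, user login and registration, user log management, and user profile management. The server uses express, and data persistence uses mongodb. The functionality is relatively simple, …☆10Apr 15, 2021Updated 4 years ago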
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Nov 9, 2021Updated 4 years ago
- ☆11Dec 19, 2023Updated 2 years ago
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"☆33Nov 1, 2025Updated 3 months ago
- ☆12Aug 5, 2022Updated 3 years ago
- ☆11Jan 6, 2024Updated 2 years ago
- IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents (NeurIPS 2024)☆14Jul 14, 2025Updated 7 months ago
- A simple PyTorch implementation of vector representation methods for Chinese text (Sentence-BERT, CoSENT), which can be used for text similarity computation.☆10Mar 27, 2022Updated 3 years ago
- Accelerating GOT-OCRv2 with VLLM☆11Nov 15, 2024Updated last year
- ☆13Jul 8, 2020Updated 5 years ago
- ☆13Jun 25, 2025Updated 7 months ago
- Deep Counterfactual Prediction with Categorical Backward Variables☆12Feb 8, 2023Updated 3 years ago
- A Structured Grammar for Chart Annotation☆15May 8, 2025Updated 9 months ago
- Qualifying Exam Preparing☆16May 7, 2025Updated 9 months ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago