codefuse-ai / Collinear-Constrained-Attention
☆58 Updated 5 months ago
Related projects
Alternatives and complementary repositories for Collinear-Constrained-Attention
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆28 Updated 5 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆73 Updated 8 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆126 Updated 5 months ago
- A prototype repo for hybrid training of pipeline parallel and distributed data parallel with comments on core code snippets. Feel free to… ☆49 Updated last year
- ☆89 Updated 7 months ago
- FuseAI Project ☆76 Updated 3 months ago
- Code for Scaling Laws of RoPE-based Extrapolation ☆70 Updated last year
- ☆40 Updated 5 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models” ☆118 Updated 4 months ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆51 Updated 3 weeks ago
- Counting-Stars (★) ☆76 Updated 2 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆125 Updated 2 months ago
- ☆88 Updated last month
- An Experiment on Dynamic NTK Scaling RoPE ☆61 Updated 11 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆77 Updated last month
- ☆79 Updated 7 months ago
- Fantastic Data Engineering for Large Language Models ☆51 Updated 3 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆100 Updated 3 weeks ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆38 Updated 8 months ago
- ☆78 Updated 2 months ago
- Reformatted Alignment ☆112 Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆147 Updated 5 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆56 Updated 8 months ago
- ☆35 Updated 2 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆120 Updated 2 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆27 Updated 3 weeks ago
- The official repository of the Omni-MATH benchmark. ☆52 Updated 3 weeks ago
- Code implementation of synthetic continued pretraining ☆60 Updated last month
- Unofficial implementation of AlpaGasus ☆84 Updated last year
- A flexible and efficient training framework for large-scale alignment tasks ☆209 Updated this week