OpenNLG / OpenBA-v2
OpenBA-V2: a 3B LLM (Large Language Model) with a T5 architecture, built using model pruning and continued pretraining from OpenBA-15B.
☆23 Updated 6 months ago
Related projects
Alternatives and complementary repositories for OpenBA-v2
- L-CITEEVAL: DO LONG-CONTEXT MODELS TRULY LEVERAGE CONTEXT FOR RESPONDING? ☆19 Updated last month
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆73 Updated 8 months ago
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning ☆38 Updated last year
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". ☆37 Updated 2 weeks ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆51 Updated 3 weeks ago
- BeHonest: Benchmarking Honesty in Large Language Models ☆30 Updated 3 months ago
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆53 Updated 3 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆35 Updated 7 months ago
- SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆26 Updated last month
- Towards Systematic Measurement for Long Text Quality ☆28 Updated 2 months ago
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆17 Updated last week
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 Updated 7 months ago
- Evaluating the Ripple Effects of Knowledge Editing in Language Models ☆50 Updated 7 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆68 Updated 5 months ago
- OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure ☆18 Updated 3 months ago
- Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" ☆60 Updated 8 months ago
- 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training ☆88 Updated last month
- EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration ☆32 Updated 8 months ago
- LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets ☆34 Updated last month
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆147 Updated 5 months ago
- 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts ☆34 Updated last month
- Official Implementation of "Probing Language Models for Pre-training Data Detection" ☆17 Updated 5 months ago
- Collection of papers for scalable automated alignment. ☆73 Updated last month
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Main) ☆14 Updated last month