OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-15B.
☆25May 10, 2024Updated last year
Alternatives and similar repositories for OpenBA-v2
Users that are interested in OpenBA-v2 are comparing it to the libraries listed below
Sorting:
- This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.☆12Mar 11, 2024Updated last year
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper)☆17Feb 10, 2025Updated last year
- The framework to prune LLMs to any size and any config.☆95Mar 1, 2024Updated 2 years ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆24Oct 10, 2025Updated 4 months ago
- Diffusion Model Improvement Method☆34Sep 4, 2023Updated 2 years ago
- [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"☆60Mar 20, 2024Updated last year
- ☆96Oct 8, 2023Updated 2 years ago
- 🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…☆12Feb 25, 2025Updated last year
- A comprehensive and efficient long-context model evaluation framework☆31Feb 8, 2026Updated 3 weeks ago
- ☆22Oct 22, 2024Updated last year
- ☆27Nov 25, 2025Updated 3 months ago
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆22Oct 10, 2024Updated last year
- ☆22Mar 7, 2025Updated 11 months ago
- Official code and resources for the paper "EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation."☆22Dec 23, 2024Updated last year
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)☆27Oct 3, 2025Updated 4 months ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)☆25Oct 23, 2024Updated last year
- ☆187Jul 22, 2024Updated last year
- ☆27Feb 18, 2024Updated 2 years ago
- ☆60Jan 12, 2026Updated last month
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆31Nov 14, 2023Updated 2 years ago
- ☆28May 24, 2025Updated 9 months ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Jul 3, 2024Updated last year
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 4 months ago
- A trainable user simulator☆34Jun 30, 2025Updated 8 months ago
- SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems☆10Apr 11, 2025Updated 10 months ago
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"☆48Jul 29, 2025Updated 7 months ago
- LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets☆41Sep 30, 2024Updated last year
- An Abstractive Summarization(for Datasets in English format) Implementation with Transformer and Pointer-generator☆12Dec 31, 2020Updated 5 years ago
- ☆11Aug 20, 2025Updated 6 months ago
- Python3入门机器学习 经典算法与应用 学习☆11Nov 9, 2018Updated 7 years ago
- ☆54May 19, 2025Updated 9 months ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- This repo is the artifact of FUEL☆13Dec 2, 2025Updated 2 months ago
- Source code for ISSTA'24 paper "AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation"☆12Oct 21, 2024Updated last year
- pytorch版损失函数,改写自科学空间文章,【通过互信息思想来缓解类别不平衡问题】、【将“softmax+交叉熵”推广到多标签分类问题】☆12Aug 22, 2021Updated 4 years ago
- This repository contains 4000 vulnerable hardware designs. Currently this is in Jsonl format for directly using it for fine-tuning LLMs. …☆21Mar 25, 2025Updated 11 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆110Oct 11, 2025Updated 4 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability☆13Mar 11, 2025Updated 11 months ago
- Semantic Scaffolds for Pseudocode-to-Code Generation (accepted by ACL 2020)☆14Jun 7, 2021Updated 4 years ago