THUDM / SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
☆1,012 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for SwissArmyTransformer
- [NeurIPS 2023] RRHF & Wombat ☆798 · Updated last year
- Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo ☆1,031 · Updated 3 months ago
- Official implementation of the paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens" ☆851 · Updated 8 months ago
- Efficient Training (including pre-training and fine-tuning) for Big Models ☆564 · Updated 3 months ago
- An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks ☆1,985 · Updated last year
- Collaborative Training of Large Language Models in an Efficient Way ☆411 · Updated 2 months ago
- A plug-and-play library for parameter-efficient-tuning (Delta Tuning) ☆998 · Updated 2 months ago
- LOMO: LOw-Memory Optimization ☆979 · Updated 4 months ago
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,008 · Updated 10 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,338 · Updated 8 months ago
- Rotary Transformer ☆822 · Updated 2 years ago
- Open Academic Research on Improving LLaMA to SOTA LLM ☆1,607 · Updated last year
- [NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models" ☆509 · Updated 9 months ago
- 🩹 Editing large language models within 10 seconds ⚡ ☆1,284 · Updated last year
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition ☆594 · Updated 3 months ago
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆874 · Updated last month
- X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages ☆306 · Updated last year
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆558 · Updated 8 months ago
- Best practice for training LLaMA models in Megatron-LM ☆628 · Updated 10 months ago
- Emu Series: Generative Multimodal Models from BAAI ☆1,662 · Updated last month
- Easy and Efficient Finetuning of LLMs. (Supports LLaMA, LLaMA-2, LLaMA-3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models. ☆576 · Updated 4 months ago
- [ICLR 2024] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment ☆726 · Updated 7 months ago
- [ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding ☆671 · Updated 2 months ago
- A family of lightweight multimodal models. ☆933 · Updated this week
- Code for our EMNLP 2023 paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models" ☆1,079 · Updated 8 months ago
- Rectified Rotary Position Embeddings ☆341 · Updated 6 months ago
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024) ☆883 · Updated 4 months ago