THUDM / SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
☆1,077 · Updated 4 months ago
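SwissArmyTransformer's approach to "Transformer variants" is to attach small, reusable components to a shared backbone instead of rewriting the model. The sketch below illustrates that general pattern in plain PyTorch; the class names (`AttentionMixin`, `TransformerLayer`) and the hook method are hypothetical stand-ins for illustration only, not SwissArmyTransformer's actual API.

```python
import torch
import torch.nn as nn

class AttentionMixin(nn.Module):
    """Hypothetical plug-in that overrides a layer's attention computation."""
    def attention_forward(self, q, k, v):
        # Plain scaled dot-product attention; a variant would change this.
        scores = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
        return torch.softmax(scores, dim=-1) @ v

class TransformerLayer(nn.Module):
    """Minimal single-head self-attention layer whose behaviour a mixin can override."""
    def __init__(self, hidden, mixin=None):
        super().__init__()
        self.qkv = nn.Linear(hidden, 3 * hidden)
        self.proj = nn.Linear(hidden, hidden)
        self.mixin = mixin  # optional hook that replaces the default attention

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        if self.mixin is not None:
            ctx = self.mixin.attention_forward(q, k, v)
        else:
            scores = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
            ctx = torch.softmax(scores, dim=-1) @ v
        return self.proj(ctx)

# Swapping the mixin swaps the attention variant without touching the layer code.
layer = TransformerLayer(hidden=64, mixin=AttentionMixin())
out = layer(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```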
Alternatives and similar repositories for SwissArmyTransformer:
Users interested in SwissArmyTransformer are comparing it to the libraries listed below.
- Rotary Transformer☆941 · Updated 3 years ago
- LOMO: LOw-Memory Optimization☆985 · Updated 10 months ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"☆1,415 · Updated last year
- [NIPS2023] RRHF & Wombat☆807 · Updated last year
- A plug-and-play library for parameter-efficient-tuning (Delta Tuning)☆1,027 · Updated 7 months ago
- Open Academic Research on Improving LLaMA to SOTA LLM☆1,621 · Updated last year
- ☆904 · Updated last year
- Emu Series: Generative Multimodal Models from BAAI☆1,716 · Updated 7 months ago
- An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks☆2,034 · Updated last year
- Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"☆863 · Updated 4 months ago
- Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo☆1,069 · Updated 9 months ago
- Collaborative Training of Large Language Models in an Efficient Way☆415 · Updated 8 months ago
- Efficient Training (including pre-training and fine-tuning) for Big Models☆587 · Updated 2 weeks ago
- Official implementation of TransNormerLLM: A Faster and Better LLM☆243 · Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆1,386 · Updated last year
- Rectified Rotary Position Embeddings☆367 · Updated 11 months ago
- Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning" (ICLR 2022)☆527 · Updated 3 years ago
- We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin…☆2,741 · Updated last year
- mPLUG-Owl: The Powerful Multi-modal Large Language Model Family☆2,468 · Updated last month
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content☆578 · Updated 7 months ago
- Easy and Efficient Finetuning of LLMs (supports LLaMA, LLaMA2, LLaMA3, Qwen, Baichuan, GLM, Falcon); efficient quantized training and deployment for large models.☆602 · Updated 3 months ago
- A fast MoE impl for PyTorch☆1,713 · Updated 2 months ago
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).☆320 · Updated last year
- The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.☆442 · Updated 6 months ago
- [NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"☆520 · Updated last year
- Tutel MoE: Optimized Mixture-of-Experts Library, supports DeepSeek FP8/FP4