xverse-ai / XVERSE-MoE-A36B
XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.
☆37 · Updated 5 months ago
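A minimal loading sketch for the model above, assuming the weights are published on Hugging Face under `xverse/XVERSE-MoE-A36B` and load through `transformers` with remote code enabled, as with earlier XVERSE releases (both the repo id and the `trust_remote_code` requirement are assumptions, not confirmed by this listing):

```python
# Hedged sketch: the repo id "xverse/XVERSE-MoE-A36B" and the need for
# trust_remote_code are assumptions based on prior XVERSE releases.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xverse/XVERSE-MoE-A36B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # MoE weights are large; bf16 halves memory
    trust_remote_code=True,      # XVERSE models typically ship custom modeling code
    device_map="auto",           # requires `accelerate`; shards layers across GPUs
)

inputs = tokenizer("Introduce yourself briefly.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```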
Alternatives and similar repositories for XVERSE-MoE-A36B:
Users interested in XVERSE-MoE-A36B are comparing it to the repositories listed below
- ☆28 · Updated 6 months ago
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 · Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc. ☆36 · Updated 10 months ago
- ☆60 · Updated last month
- ☆30 · Updated last month
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆40 · Updated 8 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆19 · Updated last month
- ☆20 · Updated 9 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video… ☆30 · Updated 8 months ago
- Official repository for the paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…" ☆18 · Updated 6 months ago
- ☆31 · Updated 3 months ago
- Official repository for the ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆17 · Updated 2 weeks ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning (accepted at COLM 2024) ☆28 · Updated 9 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆128 · Updated 8 months ago
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model ☆23 · Updated last year
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024 ☆54 · Updated 3 months ago
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama (https://arxiv.org/abs/2408.09333v2) ☆118 · Updated 3 months ago
- Our 2nd-gen LMM ☆32 · Updated 9 months ago
- A lightweight proxy for the Hugging Face Hub ☆46 · Updated last year
- An open-source LLM based on an MoE structure ☆58 · Updated 8 months ago
- A lightweight, highly efficient training framework for accelerating diffusion tasks ☆46 · Updated 5 months ago
- Tracks the latest AI multimodal models, including multimodal foundation models, LLMs, agents, audio, image, video, music and 3D… ☆34 · Updated last month
- ☆31 · Updated 8 months ago
- Code for the paper "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Updated 4 months ago