thunlp / Delta-CoMe
Delta-CoMe achieves near-lossless 1-bit compression; the work was accepted at NeurIPS 2024
☆52Updated last month
Alternatives and similar repositories for Delta-CoMe:
Users who are interested in Delta-CoMe are comparing it to the libraries listed below
- ☆84Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆127Updated 6 months ago
- ☆42Updated 6 months ago
- Mixture-of-Experts (MoE) Language Model☆184Updated 4 months ago
- An open-source LLM based on an MoE structure.☆57Updated 6 months ago
- GLM Series Edge Models☆121Updated last week
- FuseAI Project☆76Updated 3 weeks ago
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆36Updated last year
- Repo for Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization"☆38Updated this week
- ☆81Updated 8 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆147Updated last week
- SUS-Chat: Instruction tuning done right☆48Updated 11 months ago
- ☆61Updated 6 months ago
- ☆92Updated 9 months ago
- code for Scaling Laws of RoPE-based Extrapolation☆71Updated last year
- LongQLoRA: Extend Context Length of LLMs Efficiently☆162Updated last year
- ☆51Updated 3 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆54Updated 8 months ago
- ☆155Updated last month
- zero: training LLMs from scratch and tuning hyperparameters☆31Updated last year
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆58Updated 6 months ago
- A light proxy solution for HuggingFace hub.☆46Updated last year
- ☆219Updated 8 months ago
- ☆78Updated 8 months ago
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper☆125Updated 5 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆201Updated last month
- Imitate OpenAI with Local Models☆85Updated 4 months ago
- ☆36Updated 2 months ago
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆132Updated 9 months ago
- ☆205Updated 8 months ago