airaria / GRAIN
GRAIN: Gradient-based Intra-attention Pruning on Pre-trained Language Models
☆17Updated last year
Related projects
Alternatives and complementary repositories for GRAIN
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆43Updated 7 months ago
- The newest version of Llama 3, with source code explained line by line in Chinese☆22Updated 6 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆52Updated 6 months ago
- aigc evals☆10Updated 11 months ago
- code for Scaling Laws of RoPE-based Extrapolation☆70Updated last year
- An open-source LLM based on an MoE structure.☆57Updated 4 months ago
- A survey of large language model training and serving☆33Updated last year
- Copies the MLP of Llama 3 eight times as 8 experts, creates a randomly initialized router, and adds a load-balancing loss to construct an 8x8b Mo…☆25Updated 4 months ago
- Imitate OpenAI with Local Models☆85Updated 2 months ago
- Code implementation of Baichuan's Dynamic NTK-ALiBi: inference on longer texts without fine-tuning☆46Updated last year
- Large-scale exact string matching tool☆15Updated last year
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆50Updated 3 months ago
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.☆70Updated last year
- Source code for ACL 2023 paper Decoder Tuning: Efficient Language Understanding as Decoding☆48Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.☆60Updated last year
- make LLM easier to use☆58Updated last year
- A general-purpose project of simple tools☆13Updated last month
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆38Updated 8 months ago
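- This project aims to quickly detect and convert the encoding of large numbers of text files, to assist the data cleaning work of the mnbvc corpus project☆55Updated 2 weeks ago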
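- Fine-tuning based on ChatGLM2-6B, including full-parameter, parameter-efficient, and quantization-aware training; supports instruction fine-tuning, multi-turn dialogue fine-tuning, and more.☆25Updated last year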
- zero: training and tuning an LLM from scratch☆30Updated last year
- A more efficient GLM implementation!☆55Updated last year
- LLM+RAG for QA☆19Updated 9 months ago
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆36Updated 6 months ago