SprocketLab / sparse_matrix_fine_tuning
Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"
☆16 Updated 2 weeks ago
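For orientation, the sketch below shows the general parameter-efficient fine-tuning pattern the paper targets: a small trainable adapter is added while the pretrained weights stay frozen. It uses a generic LoRA-style low-rank adapter purely as a stand-in; the actual MoRe method and this repository's API use a different, structured parametrization, so every name and number here is an illustrative assumption, not the repo's interface.

```python
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """Generic LoRA-style adapter (illustrative only; not the MoRe parametrization)."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False              # freeze the pretrained weights
        # Small trainable low-rank update: in_features -> rank -> out_features
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)           # start as a no-op update

    def forward(self, x):
        return self.base(x) + self.up(self.down(x))

# Only the adapter's parameters are trainable, so the trainable parameter count
# is far smaller than full fine-tuning of the 768x768 base layer.
layer = LowRankAdapter(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")     # 2 * 768 * 8 = 12288
```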
Related projects
Alternatives and complementary repositories for sparse_matrix_fine_tuning
- Representing Rule-based Chatbots with Transformers ☆18 Updated 4 months ago
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context ☆17 Updated 3 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token ids from a codebook; supports both image and video… ☆29 Updated 5 months ago
- ☆15 Updated 3 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 3 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆22 Updated last month
- PyTorch implementation of StableMask (ICML'24) ☆12 Updated 4 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆33 Updated last year
- Here we will test various linear attention designs. ☆56 Updated 6 months ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated 8 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆38 Updated 7 months ago
- ☆30 Updated 3 months ago
- DPO, but faster 🚀 ☆23 Updated 3 weeks ago
- A repository for research on medium-sized language models. ☆74 Updated 5 months ago
- ☆21 Updated this week
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆42 Updated last week
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆36 Updated 2 months ago
- ☆22 Updated 2 weeks ago
- ☆27 Updated 5 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆32 Updated last month
- Codebase for Instruction Following without Instruction Tuning ☆32 Updated last month
- ☆35 Updated 9 months ago
- ☆59 Updated last month
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆24 Updated 7 months ago
- Code for paper "Patch-Level Training for Large Language Models" ☆71 Updated this week
- [ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset. ☆21 Updated 8 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆38 Updated 4 months ago
- GoldFinch and other hybrid transformer components ☆39 Updated 4 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆21 Updated 7 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆18 Updated last week