LongQLoRA: Extent Context Length of LLMs Efficiently
☆168Nov 12, 2023Updated 2 years ago
Alternatives and similar repositories for LongQLoRA
Users that are interested in LongQLoRA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)☆2,697Aug 14, 2024Updated last year
- 百川Dynamic NTK-ALiBi的代码实现:无需微调即可推理更长文本☆49Aug 27, 2023Updated 2 years ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆78Mar 12, 2024Updated 2 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)☆209May 20, 2024Updated last year
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations…☆29Dec 5, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context☆490Mar 19, 2024Updated 2 years ago
- Firefly中文LLaMA-2大模型,支持增量预训练Baichuan2、Llama2、Llama、Falcon、Qwen、Baichuan、InternLM、Bloom等大模型☆416Oct 21, 2023Updated 2 years ago
- Code and data for CoachLM, an automatic instruction revision approach LLM instruction tuning.☆60Mar 20, 2024Updated 2 years ago
- ☆36Oct 10, 2024Updated last year
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"☆449Oct 16, 2024Updated last year
- The official implementation for Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation.☆25Jan 30, 2024Updated 2 years ago
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning☆666Jun 1, 2024Updated last year
- ☆29May 4, 2024Updated last year
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆59Apr 20, 2024Updated last year
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Firefly: 大模型训练工具,支持训练Qwen2.5、Qwen2、Yi1.5、Phi-3、Llama3、Gemma、MiniCPM、Yi、Deepseek、Orion、Xverse、Mixtral-8x7B、Zephyr、Mistral、Baichuan2、Llma2、…☆6,652Oct 24, 2024Updated last year
- An Experiment on Dynamic NTK Scaling RoPE☆64Nov 26, 2023Updated 2 years ago
- LongBench v2 and LongBench (ACL 25'&24')☆1,122Jan 15, 2025Updated last year
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs☆260Dec 16, 2024Updated last year
- ☆62Jun 17, 2024Updated last year
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform…☆1,464Nov 7, 2023Updated 2 years ago
- ☆274Oct 31, 2023Updated 2 years ago
- Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory☆29May 10, 2024Updated last year
- code for Scaling Laws of RoPE-based Extrapolation☆73Oct 16, 2023Updated 2 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Apr 12, 2024Updated last year
- ☆25Aug 1, 2023Updated 2 years ago
- YaRN: Efficient Context Window Extension of Large Language Models☆1,686Apr 17, 2024Updated last year
- ☆20Oct 12, 2024Updated last year
- Lightweight chat AI platform featuring custom knowledge, open-source LLMs, prompt-engineering, retrieval analysis. Highly customizable. F…☆218Feb 14, 2024Updated 2 years ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).☆82Mar 18, 2024Updated 2 years ago
- ☆149Apr 16, 2024Updated last year
- ☆25Dec 13, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Jul 21, 2024Updated last year
- The official codes for "Aurora: Activating chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning"☆263May 9, 2024Updated last year
- ☆13Apr 15, 2024Updated last year
- The official repo for "LLoCo: Learning Long Contexts Offline"☆118Jun 15, 2024Updated last year
- The this is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆41Oct 11, 2024Updated last year
- Large-scale exact string matching tool☆17Mar 7, 2025Updated last year
- Low-latency Space-time Supersampling for Real-time Rendering☆33Feb 1, 2024Updated 2 years ago