anyscale / long-context-fine-tuning-blogpost
☆18 Updated 7 months ago
Related projects:
- ☆22 Updated 3 months ago
- ☆21 Updated 5 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆54 Updated last month
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆29 Updated 6 months ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆32 Updated 8 months ago
- Data preparation code for CrystalCoder 7B LLM ☆42 Updated 4 months ago
- ☆12 Updated 9 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆34 Updated 5 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated 6 months ago
- ☆21 Updated last month
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆29 Updated 7 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆42 Updated last week
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆67 Updated 2 months ago
- ☆38 Updated 4 months ago
- A repository for research on medium-sized language models. ☆71 Updated 3 months ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆48 Updated last week
- Code for NeurIPS LLM Efficiency Challenge ☆52 Updated 5 months ago
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering" ☆27 Updated 3 weeks ago
- ☆18 Updated this week
- ☆57 Updated 3 weeks ago
- ☆27 Updated 5 months ago
- Cascade Speculative Drafting ☆23 Updated 5 months ago
- ☆35 Updated last month
- A Retrieval Benchmark for Scientific Literature Search ☆53 Updated 2 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆24 Updated 2 months ago
- ☆12 Updated last year
- Understanding the correlation between different LLM benchmarks ☆27 Updated 8 months ago
- ☆42 Updated 3 weeks ago
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆15 Updated 3 weeks ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆44 Updated 8 months ago